Seamless is a family of AI models that enable more natural and authentic communication across languages. SeamlessM4T is a massively multilingual, multimodal machine translation model supporting around 100 languages. SeamlessM4T serves as the foundation for SeamlessExpressive, a model that preserves elements of prosody and voice style across languages, and SeamlessStreaming, a model supporting simultaneous translation and streaming ASR for around 100 languages. SeamlessExpressive and SeamlessStreaming are combined into Seamless, a unified model that offers multilingual, real-time, and expressive translation.
SeamlessM4T models support the tasks of:
- Speech-to-speech translation (S2ST)
- Speech-to-text translation (S2TT)
- Text-to-speech translation (T2ST)
- Text-to-text translation (T2TT)
- Automatic speech recognition (ASR)
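The five tasks differ only in their input and output modality, plus whether the language changes. The sketch below makes that explicit with a small lookup table. It is an illustration only: `Modality`, `Task`, `TASKS`, and `tasks_for` are hypothetical names invented here and are not part of the seamless_communication library.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    SPEECH = "speech"
    TEXT = "text"


@dataclass(frozen=True)
class Task:
    code: str
    source: Modality
    target: Modality
    translates: bool  # False for ASR, which transcribes without changing language


# Hypothetical table built from the task list above (not a library API).
TASKS = {
    "S2ST": Task("S2ST", Modality.SPEECH, Modality.SPEECH, True),
    "S2TT": Task("S2TT", Modality.SPEECH, Modality.TEXT, True),
    "T2ST": Task("T2ST", Modality.TEXT, Modality.SPEECH, True),
    "T2TT": Task("T2TT", Modality.TEXT, Modality.TEXT, True),
    "ASR": Task("ASR", Modality.SPEECH, Modality.TEXT, False),
}


def tasks_for(source: Modality, target: Modality) -> list[str]:
    """Return the task codes that take `source` input and produce `target` output."""
    return [t.code for t in TASKS.values() if t.source == source and t.target == target]


if __name__ == "__main__":
    # Speech in, text out: covers both translation and plain transcription.
    print(tasks_for(Modality.SPEECH, Modality.TEXT))  # ['S2TT', 'ASR']
```

Note that S2TT and ASR share the same modalities; the only difference is whether the output text is in a different language from the input speech.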
Source code: https://github.com/facebookresearch/seamless_communication