Seamless: A State-of-the-Art Foundation Model for Speech and Text Translation (Meta)

July 2, 2024

郝彦飞

Seamless is a family of AI models that enable more natural and authentic communication across languages. SeamlessM4T is a massively multilingual, multimodal machine translation model supporting around 100 languages. SeamlessM4T serves as the foundation for SeamlessExpressive, a model that preserves elements of prosody and voice style across languages, and SeamlessStreaming, a model supporting simultaneous translation and streaming ASR for around 100 languages. SeamlessExpressive and SeamlessStreaming are combined into Seamless, a unified model featuring multilingual, real-time, and expressive translation.

SeamlessM4T models support the following tasks:

Speech-to-speech translation (S2ST)
Speech-to-text translation (S2TT)
Text-to-speech translation (T2ST)
Text-to-text translation (T2TT)
Automatic speech recognition (ASR)
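
The five tasks differ only in the modality of their input and output and in whether the language changes. The following is a minimal sketch of that mapping in plain Python. It does not call the Seamless library; the `Task` class, the `TASKS` table, and the `pick_task` helper are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One SeamlessM4T task, described by its modalities (illustrative only)."""
    code: str         # task abbreviation as listed above
    source: str       # input modality: "speech" or "text"
    target: str       # output modality: "speech" or "text"
    translates: bool  # False for ASR, which transcribes in the same language


# Hypothetical lookup table built from the task list above.
TASKS = {
    task.code: task
    for task in (
        Task("S2ST", "speech", "speech", True),
        Task("S2TT", "speech", "text", True),
        Task("T2ST", "text", "speech", True),
        Task("T2TT", "text", "text", True),
        Task("ASR", "speech", "text", False),
    )
}


def pick_task(source: str, target: str, same_language: bool = False) -> str:
    """Return the task code matching the requested modalities.

    same_language=True asks for transcription rather than translation,
    which only ASR (speech in, text out) provides.
    """
    for task in TASKS.values():
        if (
            task.source == source
            and task.target == target
            and task.translates != same_language
        ):
            return task.code
    raise ValueError(f"no task for {source} -> {target}, same_language={same_language}")


if __name__ == "__main__":
    print(pick_task("speech", "text"))                      # translation to text
    print(pick_task("speech", "text", same_language=True))  # transcription
```

Under this sketch, S2TT and ASR share the same modalities and are distinguished only by whether the output language differs from the input language.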

Source code: https://github.com/facebookresearch/seamless_communication
