SeamlessM4T

SeamlessM4T is designed to provide high-quality translation, allowing people from different linguistic communities to communicate effortlessly through speech and text.

Supported Languages

You can try entering any of the following language codes:

  • Afrikaans (afr)
  • Amharic (amh)
  • Modern Standard Arabic (arb)
  • Moroccan Arabic (ary)
  • Egyptian Arabic (arz)
  • Assamese (asm)
  • Asturian (ast)
  • North Azerbaijani (azj)
  • Belarusian (bel)
  • Bengali (ben)
  • Bosnian (bos)
  • Bulgarian (bul)
  • Catalan (cat)
  • Cebuano (ceb)
  • Czech (ces)
  • Central Kurdish (ckb)
  • Mandarin Chinese (cmn)
  • Mandarin Chinese, Traditional script (cmn_Hant)
  • Welsh (cym)
  • Danish (dan)
  • German (deu)
  • Greek (ell)
  • English (eng)
  • Estonian (est)
  • Basque (eus)
  • Finnish (fin)
  • French (fra)
  • West Central Oromo (gaz)
  • Irish (gle)
  • Galician (glg)
  • Gujarati (guj)
  • Hebrew (heb)
  • Hindi (hin)
  • Croatian (hrv)
  • Hungarian (hun)
  • Armenian (hye)
  • Igbo (ibo)
  • Indonesian (ind)
  • Icelandic (isl)
  • Italian (ita)
  • Javanese (jav)
  • Japanese (jpn)
  • Kamba (kam)
  • Kannada (kan)
  • Georgian (kat)
  • Kazakh (kaz)
  • Kabuverdianu (kea)
  • Halh Mongolian (khk)
  • Khmer (khm)
  • Kyrgyz (kir)
  • Korean (kor)
  • Lao (lao)
  • Lithuanian (lit)
  • Luxembourgish (ltz)
  • Ganda (lug)
  • Luo (luo)
  • Standard Latvian (lvs)
  • Maithili (mai)
  • Malayalam (mal)
  • Marathi (mar)
  • Macedonian (mkd)
  • Maltese (mlt)
  • Meitei (mni)
  • Burmese (mya)
  • Dutch (nld)
  • Norwegian Nynorsk (nno)
  • Norwegian Bokmål (nob)
  • Nepali (npi)
  • Nyanja (nya)
  • Occitan (oci)
  • Odia (ory)
  • Punjabi (pan)
  • Southern Pashto (pbt)
  • Western Persian (pes)
  • Polish (pol)
  • Portuguese (por)
  • Romanian (ron)
  • Russian (rus)
  • Slovak (slk)
  • Slovenian (slv)
  • Shona (sna)
  • Sindhi (snd)
  • Somali (som)
  • Spanish (spa)
  • Serbian (srp)
  • Swedish (swe)
  • Swahili (swh)
  • Tamil (tam)
  • Telugu (tel)
  • Tajik (tgk)
  • Tagalog (tgl)
  • Thai (tha)
  • Turkish (tur)
  • Ukrainian (ukr)
  • Urdu (urd)
  • Northern Uzbek (uzn)
  • Vietnamese (vie)
  • Xhosa (xho)
  • Yoruba (yor)
  • Cantonese (yue)
  • Colloquial Malay (zlm)
  • Standard Malay (zsm)
  • Zulu (zul)
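
Since the codes above are entered by hand, it can help to validate input before passing it on. The following is a minimal sketch, not part of SeamlessM4T itself: `SUPPORTED_CODES` is only an illustrative subset of the list above, and `normalize_language_code` is a hypothetical helper name.

```python
# Illustrative subset of the codes listed above; extend it with the full list as needed.
SUPPORTED_CODES = frozenset(
    {"afr", "arb", "cmn", "cmn_Hant", "deu", "eng", "fra", "jpn", "spa", "zul"}
)


def normalize_language_code(user_input: str) -> str:
    """Return the canonical code for user_input, or raise ValueError.

    Hypothetical helper: matching is case-insensitive and ignores
    surrounding whitespace, so " ENG " resolves to "eng".
    """
    # Map each lowercased code back to its canonical spelling (e.g. "cmn_Hant").
    by_lowercase = {code.lower(): code for code in SUPPORTED_CODES}
    key = user_input.strip().lower()
    if key not in by_lowercase:
        raise ValueError(f"Unsupported language code: {user_input!r}")
    return by_lowercase[key]
```

For example, `normalize_language_code("cmn_hant")` returns the canonical spelling `cmn_Hant`, while an unknown code such as `"xx"` raises `ValueError`.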

SeamlessM4T Coverage

  • 101 languages for speech input.
  • 96 languages for text input/output.
  • 35 languages for speech output.

This unified model enables multiple tasks without relying on multiple separate models:

  • Speech-to-speech translation (S2ST)
  • Speech-to-text translation (S2TT)
  • Text-to-speech translation (T2ST)
  • Text-to-text translation (T2TT)
  • Automatic speech recognition (ASR)
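
The five tasks above differ only in their input and output modality and in whether they translate. A minimal sketch of that mapping follows; the `Task` class, the `TASKS` table, and `tasks_for` are hypothetical names used for illustration and are not part of the SeamlessM4T API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One task from the list above (hypothetical structure)."""
    name: str
    input_modality: str   # "speech" or "text"
    output_modality: str  # "speech" or "text"
    translates: bool      # False for ASR, which transcribes rather than translates


TASKS = {
    "S2ST": Task("Speech-to-speech translation", "speech", "speech", True),
    "S2TT": Task("Speech-to-text translation", "speech", "text", True),
    "T2ST": Task("Text-to-speech translation", "text", "speech", True),
    "T2TT": Task("Text-to-text translation", "text", "text", True),
    "ASR": Task("Automatic speech recognition", "speech", "text", False),
}


def tasks_for(input_modality: str, output_modality: str) -> list[str]:
    """Return the sorted abbreviations of tasks matching the given modalities."""
    return sorted(
        abbreviation
        for abbreviation, task in TASKS.items()
        if task.input_modality == input_modality
        and task.output_modality == output_modality
    )
```

Speech input with text output is the only pair served by two tasks: `tasks_for("speech", "text")` returns both `ASR` and `S2TT`, which differ in whether the output is translated.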

Note that SeamlessM4T-medium supports 200 languages and is based on NLLB-200 (see the full list in the asset card).