Meta’s Alexandr Wang unveils new open-source AI model that understands over 1,600 languages

The Omnilingual ASR system addresses a long-standing issue in the AI industry where most speech recognition platforms focus heavily on high-resource, widely spoken languages, thus leaving other linguistic communities without reliable tools.

Omnilingual ASR has been released as an open-source tool that Meta hopes will attract researchers, developers, and organisations. (Image: Unsplash)

Facebook parent Meta has announced a major upgrade to its AI efforts with the release of Omnilingual ASR, a new open-source automatic speech recognition (ASR) system. The company claims that this system can understand and transcribe over 1,600 spoken languages, including approximately 500 low-resource languages that have never received dedicated support from AI-based transcription tools.

Developed by Meta’s Fundamental AI Research (FAIR) team, the ASR system aims to drastically widen access to digital speech technology across the globe. Alexandr Wang, Meta’s AI chief, announced the milestone on X (formerly Twitter), stating, “Meta Omnilingual ASR expands speech recognition to 1,600+ languages, including 500 never before supported, as a major step towards truly universal AI. We are open-sourcing a full suite of models and a dataset.”

Meta’s powerful speech recognition AI model is open for everyone

Most speech recognition platforms focus heavily on high-resource, widely spoken languages, leaving other linguistic communities without reliable tools. With support for over 1,600 languages, including those with limited digital documentation, Meta hopes to close this digital linguistic gap.

At the core of the system is the Omnilingual wav2vec 2.0 model, a massive multilingual speech model scaled to seven billion parameters. It was trained on public datasets combined with speech recordings sourced from communities around the world, through partnerships with the Mozilla Foundation’s Common Voice project and Lanfrica. The involvement of local speakers ensured the dataset was representative of real-world accents, dialects, and speech patterns.
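For readers curious about the wav2vec 2.0 family that the model scales up, the sketch below shows CTC-style transcription with a small, publicly available English checkpoint (facebook/wav2vec2-base-960h) via the Hugging Face transformers library. This is purely illustrative: Omnilingual ASR ships its own models and inference code, and none of the identifiers below come from Meta’s release.

```python
# Illustrative transcription with a small public wav2vec 2.0 checkpoint.
# NOTE: "facebook/wav2vec2-base-960h" is an English-only model used here
# only to show the model family's API; it is NOT Omnilingual ASR.
import torch
import soundfile as sf
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Load a 16 kHz mono recording ("sample.wav" is a placeholder path).
speech, sample_rate = sf.read("sample.wav")
inputs = processor(speech, sampling_rate=sample_rate, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: take the most likely token at each audio frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```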

How effective is Meta’s Omnilingual ASR system?

Meta acknowledges that accuracy varies with the language in question. Internal data shows that over 95% of high- and medium-resource languages achieved a character error rate (CER) below 10%, but only 36% of low-resource languages met that same benchmark. This highlights the continued challenge of building accurate AI for under-documented languages.
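For context, CER measures the character-level Levenshtein edit distance between a system’s transcript and a reference transcript, divided by the reference length; a CER below 10% means fewer than one character in ten is substituted, inserted, or deleted. A minimal illustrative implementation (not Meta’s evaluation code) looks like this:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance over characters,
    normalised by the reference length."""
    r, h = list(reference), list(hypothesis)
    # dp[i][j] = edit distance between r[:i] and h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i  # deletions
    for j in range(len(h) + 1):
        dp[0][j] = j  # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(r)][len(h)] / max(len(r), 1)

# One substituted character in an 11-character reference -> CER of ~9%.
print(cer("hello world", "hella world"))  # 1 edit / 11 chars ≈ 0.091
```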

Nonetheless, Omnilingual ASR has been released as an open-source tool, a move Meta hopes will attract researchers, developers, and organisations to the platform and let them apply it to crucial tasks like accessibility, translation, and communication.

This article was first uploaded on November 11, 2025, at 5:10 pm.
