• Topics
  • Artificial intelligence
  • Autonomous driving
  • Network
  • Processor
  • Mobile phones
  • Exhibitions
    • CES
      • CES 2014
      • CES 2015
      • CES 2016
      • CES 2017
      • CES 2018
      • CES 2019
      • CES 2020
    • MWC
      • MWC 2014
      • MWC 2015
      • MWC 2016
      • MWC 2017
      • MWC 2018
      • MWC 2019
    • Computex
      • Computex 2014
      • Computex 2015
      • Computex 2016
      • Computex 2017
      • Computex 2018
      • Computex 2019
    • E3
      • E3 2014
      • E3 2015
      • E3 2016
      • E3 2017
    • IFA
      • IFA 2014
      • IFA 2015
      • IFA 2016
      • IFA 2017
    • TGS
      • TGS 2016
  • About us
    • About mashdigi
    • mashdigi website contact details
mashdigi-Technology, new products, interesting news, trends

Meta's open-source Omnilingual ASR speech foundation model supports over 1,600 languages and includes a speech encoder scaled to 7 billion parameters.

Author: Mash Yang
2025-11-11
in Market dynamics, Life, network, software

The Meta AI FAIR team recently announced "Omnilingual ASR", its latest major achievement in the field of automatic speech recognition (ASR): a model suite that provides speech recognition for more than 1,600 languages, at a scale and quality the team describes as industry-leading.


Meta emphasizes that this universal transcription system addresses the over-concentration of ASR technology and resources in a handful of high-resource languages, allowing high-quality speech-to-text to benefit underrepresented language communities and helping bridge the digital divide.

A wav2vec 2.0 encoder scaled to 7 billion parameters, with models and datasets open-sourced alongside

Alongside this announcement, Meta also open-sourced a series of key related assets (all released under the Apache 2.0 license), including:

• Omnilingual ASR model family: Available in a range of sizes, from a lightweight 300-million-parameter version designed for low-power devices to a 7-billion-parameter model offering top-tier accuracy.

• Omnilingual wav2vec 2.0 foundation model: A large-scale multilingual speech representation model, scaled up to 7 billion parameters, which can serve as a foundation for speech tasks beyond ASR.

• Omnilingual ASR Corpus: A large dataset (CC-BY licensed) of transcribed speech in 350 underserved languages.
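To give a sense of what the two ends of that size range imply in practice, the sketch below estimates the raw weight footprint of each model. The assumption of 2 bytes per parameter (fp16/bf16 weights) is ours for illustration, not a figure from Meta's release.

```python
# Rough memory-footprint estimate for the model sizes mentioned above.
# Assumes fp16/bf16 weights (2 bytes per parameter) — an illustrative
# assumption, not a number from Meta's release notes.

def approx_weight_size_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of raw model weights in GiB."""
    return num_params * bytes_per_param / (1024 ** 3)

sizes = {
    "lightweight (300M params)": 300_000_000,
    "largest (7B params)": 7_000_000_000,
}

for name, n in sizes.items():
    print(f"{name}: ~{approx_weight_size_gib(n):.1f} GiB in fp16")
```

The gap (roughly half a GiB versus over a dozen GiB) is why the lightweight variant targets low-power devices while the 7B model is aimed at server-class hardware.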

The LLM-ASR architecture achieves state-of-the-art results, with a character error rate below 10% for 78% of languages

To address the technical bottlenecks of scaling ASR, Omnilingual ASR introduces two architectural advances. First, the team scaled its wav2vec 2.0 speech encoder to 7 billion parameters for the first time, producing rich multilingual semantic representations from large amounts of untranscribed speech.

Next, the team built two decoder variants: one uses traditional CTC (Connectionist Temporal Classification); the other uses a Transformer decoder and is called "LLM-ASR".

According to the research paper Meta published, the 7-billion-parameter system using the LLM-ASR approach achieved state-of-the-art (SOTA) performance across more than 1,600 languages, with a character error rate (CER) below 10% in 78% of them.
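The CER metric cited above is the character-level edit distance between the model's hypothesis and the reference transcript, divided by the reference length. A toy implementation (not Meta's evaluation code):

```python
# Character error rate (CER): Levenshtein edit distance between
# hypothesis and reference, normalized by reference length.

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic rolling-row DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def cer(hypothesis: str, reference: str) -> float:
    return edit_distance(hypothesis, reference) / len(reference)

print(cer("recognise speech", "recognize speech"))  # 1 edit / 16 chars
```

A CER below 10% thus means fewer than one character-level error per ten reference characters, averaged over a test set.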


Introducing the concept of "Bring Your Own Language"

One of the biggest breakthroughs of Omnilingual ASR is that it changes the traditional paradigm for adding new languages, introducing the concept of "Bring Your Own Language". This is made possible by its LLM-inspired design, which incorporates powerful in-context learning capabilities.


In practice, this means speakers of a currently unsupported language only need to provide a few paired audio-text samples; the model can then reach usable transcription quality from these in-context examples, without large-scale fine-tuning, specialist expertise, or heavy computing resources. This is seen as enabling community-driven language expansion.
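Conceptually, the "Bring Your Own Language" flow packs a handful of paired audio/text samples into the model's context alongside the new utterance. The sketch below illustrates that data flow only; the `Example` structure and `pack_context` helper are our hypothetical names, not Meta's actual API.

```python
# Illustrative sketch of assembling an in-context ASR prompt:
# N (audio, transcript) example pairs plus the query utterance.
# Names here are hypothetical, not from Meta's released tooling.

from dataclasses import dataclass

@dataclass
class Example:
    audio_path: str   # path to a short recording in the new language
    transcript: str   # its human-written transcription

def pack_context(examples: list[Example], target_audio: str) -> dict:
    """Bundle few-shot (audio, text) pairs with the utterance to transcribe."""
    return {
        "context": [(ex.audio_path, ex.transcript) for ex in examples],
        "query": target_audio,
    }

examples = [
    Example("greeting.wav", "hello"),
    Example("farewell.wav", "goodbye"),
]
prompt = pack_context(examples, "new_utterance.wav")
print(len(prompt["context"]))  # 2
```

The point of the paradigm is that this context assembly is all a community needs to do; no gradient updates or model re-training are involved.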

Working with local partners to collect speech in 350 low-resource languages

To cover languages with virtually no digital footprint, the team not only drew on publicly available resources such as the Mozilla Foundation's Common Voice, but also partnered with local organizations (e.g., Lanfrica/NaijaVoices), working directly with local communities to recruit and compensate native speakers for voice recordings.

The corpus collected in this commissioned effort, released as the Omnilingual ASR Corpus, is one of the largest datasets currently available for ultra-low-resource natural speech ASR.

Currently, the relevant models, datasets, transcription tool demos, and language exploration demos have been released to the public through channels such as GitHub, Hugging Face, and Meta AI.

Tags: LLM, Meta, Meta AI, Omnilingual ASR
Mash Yang

Founder and editor of mashdigi.com, and student of technology journalism.


Copyright © 2017 mashdigi.com

  • About mashdigi.com
  • Place ads
  • Contact mashdigi.com
