Date: 2026-02-05

Google, partnering with top African universities and researchers, has unveiled WAXAL, a massive, open-source speech dataset for 21 sub-Saharan African languages.

The initiative was launched to shatter the language barrier locking hundreds of millions out of the digital world. This first-of-its-kind resource directly tackles a stark digital divide.

While voice assistants and speech tech are commonplace globally, over 2,000 African languages have been left behind due to a lack of usable data. WAXAL provides the missing foundation: 1,250 hours of transcribed, natural conversation and studio-quality recordings to build inclusive AI—from educational tools to voice-enabled services.

The Head of Google Research Africa, Aisha Walcott-Bryant, said: “The ultimate impact is the empowerment of people in Africa. This lets students, researchers, and entrepreneurs build technology on their own terms, in their own languages.” She said the project was built by and for the African community.

“Led by institutions like Makerere University (Uganda) and the University of Ghana, with Google’s support, it ensures the tech reflects local needs and expertise. The dataset covers languages including Hausa, Yoruba, Igbo, Swahili, Luganda, Shona, and Malagasy, aiming to catalyze a wave of innovation that connects over 100 million speakers to the digital economy in their native tongue.”