Google announced on Thursday that it has expanded its Translate platform with 110 new languages, thanks to the use of advanced artificial intelligence (AI) models.
The expansion is the largest that Google Translate has ever undergone – a feat made possible through the tech giant's use of the PaLM 2 large language model (LLM).
"From Cantonese to Q'eqchi', these new languages represent more than 614 million speakers, opening up translations for around 8% of the world's population," Isaac Caswell, senior software engineer for Google Translate, wrote in a release.
"Some are major world languages with over 100 million speakers," Caswell noted. "Others are spoken by small communities of Indigenous people, and a few have almost no native speakers but active revitalization efforts. About a quarter of the new languages come from Africa, representing our largest expansion of African languages to date, including Fon, Kikongo, Luo, Ga, Swati, Venda and Wolof."
Among the notable additions highlighted in the company's announcement was Cantonese, which Caswell said has "long been one of the most requested languages" for the tool. It was challenging to add because it often overlaps with Mandarin in writing, which made it "tricky to find data and train models."
Shahmukhi, a variety of Punjabi that is the most spoken language in Pakistan, was added along with Afar, a tonal language spoken in Djibouti, Eritrea and Ethiopia. The announcement noted that Afar had the most volunteer community contributions.
Manx, the Celtic language of the Isle of Man, was also added. It nearly went extinct with the death of its last native speaker in 1974, but a revival movement on the island means there are now thousands of speakers.
Tok Pisin, the lingua franca of Papua New Guinea, was also added to Google Translate. Caswell noted that because it is an English-based creole, English speakers should try translating into Tok Pisin because they "might be able to make out the meaning!"
Google said it plans to add more languages to Translate over time as it works toward its previously announced 1,000 Languages Initiative, a commitment the company made to build AI models supporting the world's 1,000 most-spoken languages. As AI technology like PaLM 2 continues to advance, that process will get even faster, the company said.
"PaLM 2 was a key piece to the puzzle, helping Translate more efficiently learn languages that are closely related to each other, including languages close to Hindi, like Awadhi and Mardwadi, and French creoles like Seychellois Creole and Mauritian Creole," Caswell wrote.
"As technology advances, and as we continue to partner with expert linguists and native speakers, we'll support even more language varieties and spelling conventions over time," he added.