Google Translate is adding 110 new languages, including Manx, in its largest single expansion ever. This brings the total number of languages supported by the translation tool to 243, nearly doubling the previous count of 133.
The expansion is driven by PaLM 2, the second iteration of Google’s Pathways Language Model. The original PaLM was introduced in 2022, and version 2 followed in May 2023.
Google Translate has steadily broadened its language repertoire over the years. In 2008, for instance, it added Czech, a crucial addition for many, including this writer, who moved to a Czech-speaking region a decade ago. Recently, the focus has also extended to languages like Manx, spoken on the Isle of Man.
This expansion, like the more modest addition of 24 languages in 2022, uses Google’s Zero-Shot Machine Translation method. Google Translate has used neural network models for translation since 2016, and zero-resource training lets those models translate a language even when the training data contains no parallel texts, that is, no sentence-by-sentence translations to learn from.
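As a rough illustration of that distinction, the toy Python sketch below sorts languages by the kind of training data available for them. It does not translate anything and is not Google's pipeline: the `Corpus` type, the `training_mode` function, the placeholder language names, and all sentence counts are hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corpus:
    """Hypothetical summary of the training text available for one language."""
    language: str
    parallel_sentences: int     # sentences that come with a matching translation
    monolingual_sentences: int  # untranslated text in this language only


def training_mode(corpus: Corpus) -> str:
    """Classify how a model could learn this language from the data on hand."""
    if corpus.parallel_sentences > 0:
        return "supervised"   # matching translations exist to learn from
    if corpus.monolingual_sentences > 0:
        return "zero-shot"    # only untranslated text is available
    return "unsupported"      # nothing to learn from


# Made-up inventory: names and counts are placeholders, not real statistics.
corpora = [
    Corpus("language_a", parallel_sentences=1_000_000, monolingual_sentences=5_000_000),
    Corpus("language_b", parallel_sentences=0, monolingual_sentences=20_000),
    Corpus("language_c", parallel_sentences=0, monolingual_sentences=0),
]

for corpus in corpora:
    print(f"{corpus.language}: {training_mode(corpus)}")
```

In this sketch, a language lands in the zero-shot group when it has text of its own but no matching translations, which is the situation the zero-resource approach is meant to handle.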
This development highlights a practical application of large language models (LLMs), which some present as AI. LLMs run on neural networks, and despite the marketing around “AI accelerator chips,” those chips are primarily specialized co-processors that speed up tensor mathematics.
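To make the "tensor mathematics" point concrete, here is a minimal sketch of a single neural network layer in plain Python: a matrix-vector product, a bias addition, and a simple nonlinearity. The function names, weights, and inputs are arbitrary illustrations rather than anything taken from a real model. Production models repeat this kind of multiply-and-add arithmetic at enormous scale, which is the workload such co-processors are built to accelerate.

```python
def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def relu(values):
    """Replace negative values with zero, a common neural network nonlinearity."""
    return [max(0.0, v) for v in values]


def dense_layer(weights, bias, inputs):
    """One fully connected layer: relu(weights @ inputs + bias)."""
    pre_activation = [p + b for p, b in zip(matvec(weights, inputs), bias)]
    return relu(pre_activation)


# Arbitrary example values, chosen only so the arithmetic is easy to follow.
weights = [[1.0, -2.0],
           [0.5, 0.5]]
bias = [0.0, 1.0]
inputs = [3.0, 1.0]

print(dense_layer(weights, bias, inputs))
```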
Machine translation plays a crucial role in preserving and revitalizing minority languages. A notable example is Manx, which has seen a revival over the past few decades. The last native speaker, Edward “Ned” Maddrell, passed away in 1974. However, efforts to document the language through recordings and videos have preserved it. Today, there is a new generation of native Manx speakers, with children being raised by adults who learned the language as a second language. Additionally, the establishment of Bunscoill Ghaelgagh, a Manx language primary school, has further contributed to the language’s resurgence.
The comprehensive list of languages now supported by Google Translate includes Abkhaz, Acehnese, Acholi, Afar, Afrikaans, Albanian, Alur, Amharic, Arabic, Armenian, Assamese, Avar, Awadhi, Aymara, Azerbaijani, Balinese, Baluchi, Bambara, Baoulé, Bashkir, Basque, Batak Karo, Batak Simalungun, Batak Toba, Belarusian, Bemba, Bengali, Betawi, Bhojpuri, Bikol, Bosnian, Breton, Bulgarian, Buryat, Cantonese, Catalan, Cebuano, Chamorro, Chechen, Chichewa, Chinese (Simplified), Chinese (Traditional), Chuukese, Chuvash, Corsican, Crimean Tatar, Croatian, Czech, Danish, Dari, Dhivehi, Dinka, Dogri, Dombe, Dutch, Dyula, Dzongkha, English, Esperanto, Estonian, Ewe, Faroese, Fijian, Filipino, Finnish, Fon, French, Frisian, Friulian, Fulani, Ga, Galician, Georgian, German, Greek, Guarani, Gujarati, Haitian Creole, Hakha Chin, Hausa, Hawaiian, Hebrew, Hiligaynon, Hindi, Hmong, Hungarian, Hunsrik, Iban, Icelandic, Igbo, Ilocano, Indonesian, Irish, Italian, Jamaican Patois, Japanese, Javanese, Jingpo, Kalaallisut, Kannada, Kanuri, Kapampangan, Kazakh, Khasi, Khmer, Kiga, Kikongo, Kinyarwanda, Kituba, Kokborok, Komi, Konkani, Korean, Krio, Kurdish (Kurmanji), Kurdish (Sorani), Kyrgyz, Lao, Latgalian, Latin, Latvian, Ligurian, Limburgish, Lingala, Lithuanian, Lombard, Luganda, Luo, Luxembourgish, Macedonian, Madurese, Maithili, Makassar, Malagasy, Malay, Malay (Jawi), Malayalam, Maltese, Mam, Manx, Maori, Marathi, Marshallese, Marwadi, Mauritian Creole, Meadow Mari, Meiteilon (Manipuri), Minang, Mizo, Mongolian, Myanmar (Burmese), Nahuatl (Eastern Huasteca), Ndau, Ndebele (South), Nepalbhasa (Newari), Nepali, NKo, Norwegian, Nuer, Occitan, Odia (Oriya), Oromo, Ossetian, Pangasinan, Papiamento, Pashto, Persian, Polish, Portuguese (Brazil), Portuguese (Portugal), Punjabi (Gurmukhi), Punjabi (Shahmukhi), Quechua, Q’eqchi’, Romani, Romanian, Rundi, Russian, Sami (North), Samoan, Sango, Sanskrit, Santali, Scots Gaelic, Sepedi, Serbian, Sesotho, Seychellois Creole, Shan, Shona, Sicilian, 
Silesian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Susu, Swahili, Swati, Swedish, Tahitian, Tajik, Tamazight, Tamazight (Tifinagh), Tamil, Tatar, Telugu, Tetum, Thai, Tibetan, Tigrinya, Tiv, Tok Pisin, Tongan, Tsonga, Tswana, Tulu, Tumbuka, Turkish, Turkmen, Tuvan, Twi, Udmurt, Ukrainian, Urdu, Uyghur, Uzbek, Venda, Venetian, Vietnamese, Waray, Welsh, Wolof, Xhosa, Yakut, Yiddish, Yoruba, Yucatec Maya, Zapotec, and Zulu.