Embracing artificial intelligence to preserve dying languages
As more language preservation initiatives inspired by artificial intelligence emerge, researchers argue that while they cannot fully capture the essence of language, they are a crucial part of the response to what is now deemed a global linguistic catastrophe.
By Bob Koigi
In Chile, 92-year-old Cristina Calderon is the last speaker of the native Yamana language, which has for millennia been spoken by the indigenous Yagán community of South America. Up until 2005 she could speak the language with her sister, who passed away, leaving her with no one to communicate with. While Cristina is not the only person of the Yagán community who can speak the language, she indicated in an earlier interview that younger generations had shunned it, preferring languages they consider modern. Cristina, the only living repository of the Yagán language and culture, fears that the language will die with her.
Miles away in Kenya’s Rift Valley region, Yakunte, a language spoken by the Yaaku tribe, an offshoot of the Maasai community, is on its deathbed, with fewer than seven speakers, all of them over 70 years old. The young generation identifies with the larger Maasai culture and language. It is one of the seven indigenous Kenyan languages that the United Nations Educational, Scientific and Cultural Organisation (UNESCO) has classified as extinct.
The UNESCO Atlas of the World’s Languages in Danger estimates that at least 43 percent of the 6,000 languages spoken globally are endangered or close to extinction. Every two weeks a minority language dies. Further reports point to what researchers and linguists describe as a worrying trend in the preservation of the cultures and traditions that have defined the diversity of humankind over the years.
Aware of this threat, the UN declared 2019 the International Year of Indigenous Languages as it sought to highlight the need to preserve heritage and culture: “Through language, people preserve their community’s history, customs, and traditions, memory, unique modes of thinking, meaning and expression. They also use it to construct their future. Language is pivotal in the areas of human rights protection, good governance, peace building, reconciliation, and sustainable development,” reads a statement from the website.
Globalisation and technology have been blamed for the disappearance of these languages. It is estimated that close to half of all online content globally is in English and Chinese, leaving little space for digital adoption of other languages.
“Language is the fulcrum and axis of any culture and identity of any individual,” says Harry Kiema from the University of Nairobi Department of Linguistics & Languages. “It is the definition of humankind and what surrounds them. The preservation of culture is passed on across generations by word of mouth, through language. Which is why the loss of any language no matter how many people speak it, should be a cause for concern. The world and the diversity as we know it has been made possible because of a mix of cultures and languages.”
But the same technology that has been criticised for its role in language death is now being hailed for its role in saving the last of the native languages.
Artificial intelligence (AI) has been hailed as a panacea for the challenges of language preservation and translation. Through machine learning, the technology has proved effective at processing and storing data at impressive speeds while identifying patterns and generating new ones.
As a result, AI is making languages at risk of extinction more accessible by addressing the translation gaps that arise with them. Multinationals have invested heavily in this field, aware of the pivotal role that language plays in human life. Microsoft, for example, runs Microsoft Translator Hub, a platform that allows communities and institutions to tap into its neural text and speech translation systems to build their own translation tools.
Google is also working with developers and institutions to develop unique translation modules using open-source AI platforms such as TensorFlow, which save millions of hours of transcription. Gabriel Emmanuel, a Nigerian innovator, has been working on an AI platform christened OBTranslate that seeks to address barriers to communication by translating over 2,000 African languages while preserving them for posterity.
“I was motivated to embark on this initiative because Africa has rich cultures and diverse languages,” Emmanuel told FairPlanet in an earlier interview. “We have over 2,000 languages in 54 countries in Africa, 63% of total Sub-Sahara’s population live in rural areas, and they all speak in more than 2,000 languages in these regions.”
“Also, there are over 52 native languages in Africa, which have undergone language death, have no native speakers and no spoken descendants. As a young inventor in the field of ICT and Robotics, I believe building innovation [Robots] aimed at solving local problems should not be limited to just English language.”
The ARC Centre of Excellence for the Dynamics of Language (CoEDL), an Australian institution working to conserve endangered languages on the continent, has designed Opie, a robot built on the open-source AI platform TensorFlow that teaches indigenous languages to children through lessons, stories and games. The robot then records how the children are learning so that their tutors can track their progress.
In New Zealand, Jason Lovell, a student struggling to learn the native Maori language, used artificial intelligence to develop a Facebook chatbot dubbed Reobot that understands and replies to messages in both Maori and English. Lovell hopes to upgrade the chatbot to handle spoken language as he looks for new ways of preserving his indigenous language and sharing it with the world.
And as more language preservation initiatives inspired by artificial intelligence emerge, researchers argue that while they cannot fully capture the essence of language, they are a crucial part of the response to what is now deemed a global linguistic catastrophe.
“There are underlying challenges in using technology to preserve native languages because most of these languages mixed written and spoken word,” says Harry Kiema. “The pronunciations, diction and facial expressions cannot be captured even by these latest technologies. Meanings are therefore bound to be lost in translations. However, to have certain aspects of languages preserved in digital repositories that cannot be erased is in itself a huge landmark.”
This article was originally published in November 2020 on the website of FairPlanet, a social enterprise founded in 2014 in Berlin. Its aim is to promote human rights, protect our biosphere and support Sustainable Development Goals across the globe.