NEW DELHI — As India celebrates its 69th Republic Day, Microsoft announced Jan. 25 that it will use artificial intelligence and deep neural networks to improve real-time language translation for Hindi, Bengali and Tamil.
With translation powered by deep neural networks, the results are more accurate and sound more natural.
"Microsoft celebrates the diversity of languages in India and wants to make the vast Internet even more accessible. We have supported Indian languages in computing for over two decades, and, more recently, have made significant strides on voice-based access and machine translation across languages," said Sundar Srinivasan, general manager, AI and Research, Microsoft India.
"Today's launch is a testament of our quest to bring cutting-edge machine learning technology to democratize access to information for everyone in India," Srinivasan added.
Users can access deep neural network-enhanced Indian language translation on any website through the Microsoft Edge browser, in Bing search and on the Bing Translator website, as well as in Microsoft Office 365 products like Word, Excel, PowerPoint, Outlook and Skype.
The Microsoft Translator app for Android and iOS can recognize and translate languages from text, speech and even photos.
Since the early 2000s, Microsoft has been a pioneer of the traditional statistical machine translation (SMT) paradigm for translating global and Indian languages.
Deep neural networks have now been incorporated into the translation of complex Indian languages to bring more accuracy and fluency to the results.
While SMT is limited to translating a word within the local context of a few surrounding words, deep neural networks can encode more granular concepts such as gender (feminine, masculine, neutral), politeness level (slang, casual, written, formal) and type of word (verb, noun, adjective).
For accurate translations, the system demands millions of parallel sentences in each language pair, in all permutations and combinations.
"However, Indian languages, constituting of Dravidian and Aryan subdivisions, are complicated. The complexities increase while translating languages for India, where 29 different states have 22 official languages," Microsoft said in a statement.
Adding to the challenge was the dearth of digital content in Indian languages that could be pulled from the Internet to train the neural networks.
"Six Indian languages are part of top 20 global languages by population. Ironically, these languages are not on top of the digital content list. There's not enough material on internet that we could use to train the system," explained Krishna Doss Mohan, senior program manager, Microsoft India, who is part of the team that works on Indian languages.
Despite the obstacles, deep neural networks-powered translation systems have shown significant improvement in both automatic and human evaluation metrics.
"More specifically, we have witnessed at least 20 percent improvement in translation quality for all Indic languages currently supported by Microsoft," the company said.
To transcend the language barrier, Microsoft started working with Indian languages two decades ago and in 1998 launched "Project Bhasha" to accelerate computing in Indian languages.
"We have come a long way since then – supporting text input in all 22 constitutionally-recognized Indian languages across our products, and Windows interface support in 12 languages," the company said.
Bhashaindia.com, which provides computing tools for Indic languages, has received 40 million hits to date.