The internet has made a tremendous difference in people's lives across the world. The gathering of information by different means has now given way to online commerce and social networking. This has happened via the evolution of a cyberspace, wherein people can connect from diverse geographical locations. Cyberspace has grown extensively in the last two decades and today, of the world's 6.7 billion population, more than 1.7 billion people are in cyberspace. Roughly, this corresponds to 25% of the total world population.

One of the less noticed features of this growth is that, despite there being about 6,000 languages across the world, about 98% of all internet content is covered by just 12 languages. English dominates, with more than 478 million users, which is roughly 37% of the total internet population. It's followed by Mandarin, which accounts for about 383 million users, and Spanish is a distant third with 136 million users. These three languages account for almost 57% of all internet users worldwide, with 27% of English-speaking and 22% of Mandarin-speaking populations being present on the internet. Hindi does not figure even in the top ten, nor do any of the other Indian languages.

On the internet, language can generally be measured in two contexts: the domain names registered and the content flowing through them. Domain names are the addresses of cyberspace, and the most generic ones, called generic top-level domains (gTLDs), are all in English. Eight such gTLDs, like dot-com, dot-edu and dot-gov, have been in vogue from the early days of the internet, long before the non-profit Internet Corporation for Assigned Names and Numbers (ICANN) was set up in 1998. ICANN now manages the domain addressing system and has done this very efficiently so far. Today, there are 21 gTLDs. In addition, there are around 250 two-letter country code TLDs (ccTLDs). Most of the content in these 250 domains is in languages other than English. In 2008, around 177 million domain names were registered, of which 96 million were under gTLDs.

Clearly, internet content in other languages wasn't growing at the same pace as in English. Traffic is still dominated by English language emails and social networking. Commerce transactions are growing but at a very slow pace. In many emerging economies where internet penetration is improving, people are still shy about sharing financial details over the medium. However, in the business-to-business segment, much of the content is flowing through the internet.

UNESCO and the International Telecommunication Union (ITU) have promoted a multilingual cyberspace. The 2005 World Summit on the Information Society in Tunis identified multilingualism as a means to bridge the global digital divide. Both these UN organisations have tried to advance multilingualism in areas like domain names, email addresses and keyword look-ups.

One of the major factors that can help improve the internet is the availability of content in local languages. This is very relevant in the case of least developed nations and particularly for India. Here, the internet has grown very strongly and government policy has worked as an enabler. With so many languages and dialects in use across the country, India's cyberspace growth will be marked by improvements in local language content. As a first step, the government has funded projects to develop local language fonts. This has been expanded into a programme called Technology Development for Indian Languages (TDIL), under the Department of Information Technology of the central government. The tasks undertaken under TDIL are developing information processing tools and techniques to facilitate human-machine interaction without language barriers, creating and accessing multilingual knowledge resources, and integrating them to develop innovative user products and services. Right from the NDA government's time, developing content in local languages has been in focus. The need was further highlighted when community information centres were set up in northeast India and were found to be wanting in local language content. Under TDIL, the focus on translation support systems has been really impressive: Anglabharti is a multilingual machine-aided translation methodology tool that allows the translation of English into major Indian languages. Unfortunately, much of government functioning is still in the English language and on paper files. It will take some time for the impact of language technology to be fully realised across the country.

The fact remains that content in cyberspace will grow. What remains to be seen is how future content shifts to the various popular languages and how easily content can be translated globally. Only this will make the experience of the internet much more inclusive.

The author is country head, General Dynamics. Views are personal.