Two things I often hear from the VC world about the Indian language space are that user-generated content (UGC) is lacking and that entering text in Indian languages is difficult.
Regarding UGC, all I would say is that Oneindia.in has seen growth in language UGC (comments, article contributions, feedback emails).
Indian Language Fonts
Every language, including English, has multiple fonts. In Indian languages the most popular ones now are Unicode fonts, which are free. The number of Unicode font faces available for Indian languages is limited, whereas in English you have many font faces (e.g. Times Roman, Arial, Verdana).
Unicode fonts have gained popularity since 2005 or 2006. The main advantage of Unicode is that readers need not have one particular font installed on their system: any Unicode font that covers the script will render the language characters.
Online publishers and sites find Unicode useful because all search engines can index Unicode content. These days search engines also index non-Unicode text, convert it to Unicode and display it in search results. At one point we were using non-Unicode fonts; with the help of TCS we converted all our content to Unicode (over a million pages, a herculean task).
This post explores the current solutions available for entering text in an Indian language. There are three well-known input methods:
- Transliteration/Phonetic keyboard
- Inscript/Language keyboard
- Soft/Virtual keyboard
Transliteration is also known as the phonetic keyboard. Let us take the example of the Hindi phonetic keyboard. A few examples of these keyboards are Oneindia.in, Google and Quillpad. I quote Google's explanation below:
This lets you type Hindi words phonetically in English script and still have them appear in their correct alphabet. Note that this is not the same as translation — it is the sound of the words that are converted from one alphabet to the other, not their meaning. For example, typing “mahesh” transliterates into Hindi as: महेश
In a nutshell, one needs to know the English alphabet to use the transliteration keyboard.
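To make the idea concrete, here is a minimal sketch of how a phonetic keyboard could turn Latin letters into Devanagari. It is not how Google, Quillpad or Oneindia.in implement their tools; the tiny `CONSONANTS` and `VOWELS` tables and the `transliterate` function are hypothetical and cover only a handful of letters. The sketch assumes three simple rules: a vowel after a consonant becomes a vowel sign (matra), "a" after a consonant is the inherent vowel and produces nothing, and two consonants in a row are joined with a virama.

```python
# Hypothetical, deliberately tiny tables: a real tool covers far more.
CONSONANTS = {"sh": "श", "k": "क", "t": "त", "n": "न",
              "m": "म", "r": "र", "l": "ल", "h": "ह"}
# Each vowel maps to (independent form, matra used after a consonant).
VOWELS = {"aa": ("आ", "ा"), "a": ("अ", ""), "i": ("इ", "ि"),
          "u": ("उ", "ु"), "e": ("ए", "े"), "o": ("ओ", "ो")}
VIRAMA = "्"  # suppresses the inherent vowel between two consonants


def transliterate(word):
    """Greedy longest-match transliteration of one lowercase Latin word."""
    out = []
    i = 0
    prev_consonant = False
    while i < len(word):
        for length in (2, 1):  # try two-letter chunks such as "sh" first
            chunk = word[i:i + length]
            if chunk in CONSONANTS:
                if prev_consonant:
                    out.append(VIRAMA)
                out.append(CONSONANTS[chunk])
                prev_consonant = True
                break
            if chunk in VOWELS:
                independent, matra = VOWELS[chunk]
                out.append(matra if prev_consonant else independent)
                prev_consonant = False
                break
        else:
            # Unknown letter: pass it through unchanged.
            out.append(chunk)
            prev_consonant = False
        i += len(chunk)
    return "".join(out)


print(transliterate("mahesh"))  # महेश
```

Real transliteration tools go much further than this, for instance by suggesting several candidate words, but the principle of mapping sounds rather than meanings is the same.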
InScript (Indian Script) is a touch-typing keyboard layout scheme for inputting Indic text on a computer. This keyboard layout, developed by C-DAC, is standardized by the Government of India for Indic computing.
InScript keyboard now comes inbuilt in all of the newer operating systems including Windows (2000, XP, Vista), Linux and Macintosh.
This layout uses the standard 101-key keyboard. The Hindi Indic keyboard gives you an idea of the layout. The mapping of the characters is such that it remains common for all the Indian languages (written left to right).
Due to the phonetic/alphabetic nature of the keyboard, a person who can type in one Indian script can type in any other Indian script. This is because the basic character set of the Indian languages is common. For example, in all languages the key ‘c’ produces ‘ma’ in that language (in Hindi it would be म). The advantage of the Inscript keyboard is that you need fewer keystrokes than with the phonetic one.
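The "same key, same sound" idea can be sketched in a few lines. This relies on the fact that the Unicode blocks for the major Indian scripts place corresponding letters at the same offset within each block. The `KEY_TO_OFFSET` table below is a small illustrative subset that I believe matches the Inscript layout (only the ‘c’ key is taken from the example above), and the `inscript_char` function name is hypothetical.

```python
# Start of each script's Unicode block.
BLOCK_BASE = {
    "hindi": 0x0900,      # Devanagari
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}

# Illustrative subset: key -> offset of the letter inside any block.
# Assumed to match Inscript; only "c" comes from the text above.
KEY_TO_OFFSET = {
    "k": 0x15,  # ka
    "l": 0x24,  # ta
    "v": 0x28,  # na
    "h": 0x2A,  # pa
    "c": 0x2E,  # ma
    "j": 0x30,  # ra
}


def inscript_char(key, language):
    """Return the letter the given key produces in the given language."""
    return chr(BLOCK_BASE[language] + KEY_TO_OFFSET[key])


for language in BLOCK_BASE:
    print(language, inscript_char("c", language))
```

Running the loop prints the ‘ma’ letter of each of the five scripts for the same ‘c’ key, which is why a typist trained on one language keyboard can move to another.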
You can try Oneindia.in’s Inscript and Phonetic keyboard at Oneindia.in
For Language+Technical Geeks
We can divide the characters of Indian language alphabets into consonants, vowels, nasals and conjuncts. Every consonant represents a combination of a particular sound and a vowel. The vowels represent pure sounds. The nasals are characters representing nasal sounds along with vowels. The conjuncts are combinations of two or more characters. The Indian language alphabet table is divided into vowels (Swar) and consonants (Vyanjan). The vowels are divided into long and short vowels, and the consonants are divided into vargs.
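These categories can be sketched against Unicode code points. The example below uses Hindi (Devanagari) only, and the function names `classify`, `varg` and `find_conjuncts` are hypothetical. It assumes that a conjunct is stored as consonants joined by the virama sign, and that the five vargs are the five consecutive groups of five consonants that open the consonant table.

```python
VIRAMA = 0x094D  # joins consonants into a conjunct


def classify(ch):
    """Rough category of a single Devanagari character."""
    cp = ord(ch)
    if 0x0904 <= cp <= 0x0914:
        return "vowel"
    if 0x0915 <= cp <= 0x0939:
        return "consonant"
    if 0x093E <= cp <= 0x094C:
        return "vowel sign"
    if cp in (0x0901, 0x0902):  # chandrabindu, anusvara
        return "nasal sign"
    if cp == VIRAMA:
        return "virama"
    return "other"


# First code point of each varg; every varg has five consonants.
VARG_STARTS = {"ka": 0x0915, "cha": 0x091A, "Ta": 0x091F,
               "ta": 0x0924, "pa": 0x092A}


def varg(ch):
    """Name of the varg a consonant belongs to, or None."""
    cp = ord(ch)
    for name, start in VARG_STARTS.items():
        if start <= cp < start + 5:
            return name
    return None


def find_conjuncts(text):
    """Return every consonant + virama + consonant run in the text."""
    found = []
    i = 0
    while i < len(text):
        if classify(text[i]) != "consonant":
            i += 1
            continue
        j = i + 1
        while (j + 1 < len(text) and ord(text[j]) == VIRAMA
               and classify(text[j + 1]) == "consonant"):
            j += 2
        if j - i > 1:
            found.append(text[i:j])
        i = j
    return found


print(classify("म"), varg("म"))   # consonant pa
print(find_conjuncts("नमस्ते"))    # ['स्त']
```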
The Inscript layout takes advantage of these observations, and thus its organization is simple. All the vowels are placed on the left side of the keyboard and the consonants on the right side. The placement is such that the characters of one varg are split over two keys.
The majority of Oneindia.in’s users are on Windows XP. We have published detailed instructions to activate the language keyboard on our help pages. Pressing Left Alt+Shift switches between English and Hindi (or the respective language).
Virtual keyboards are also known as soft keyboards. You usually see these on online banking websites, where the user can enter the login and password using the virtual keyboard instead of the physical keyboard. Lipikaar is one of the popular phonetic+soft keyboard providers.
Personally I find it difficult to use soft keyboards. I guess it is because you don’t get the “feel” while typing.
We have five Indian languages at Oneindia.in: Hindi, Kannada, Malayalam, Tamil and Telugu. Our content team uses the Inscript keyboard. They have a utility which switches the keyboard to the language keyboard. Any new member joining the team is given 2-3 days of training before they start contributing content. From what I understand, it is not that complicated to learn the language/Inscript keyboard.
We use the Google transliteration tool in the comment editor at the end of each article. In click.in, we provide Inscript and phonetic keyboards for entering classifieds in Indian languages.
There are sufficient tools to input text in Indian languages. It would be useful for tool providers (including Gmail) to support all the input methods described in this post. This would ensure that the majority of online users are able to type language text.
Do check out Oneindia’s Indic editor and play with the different kinds of keyboards!
Special thanks to Oneindia.in’s editor Harikrishnan and Wikipedia for providing inputs for this post.