Showing posts from 2008

Outputs from (new locale)

"Show me the code" is really showing its outputs. With help from Gora Mohanty and Ravishankar Shrivastava, we now have a new Chhattisgarhi (hne_IN) locale defined in glibc with changelog:
2008-12-05 Ulrich Drepper


* locales/hne_IN: New file.
Contributed by Pravin Satpute.
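As a rough illustration of what a new locale definition means for a user, the sketch below checks whether the C library on the current machine can activate a given locale. The helper name `locale_available` is hypothetical, and whether `hne_IN` is found depends on the installed glibc version and on which locales have been generated on the system.

```python
import locale


def locale_available(name):
    """Return True if the C library can activate the named locale.

    Hypothetical helper for illustration; it only probes LC_CTYPE.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Always restore whatever was active before the probe.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    # "hne_IN" is the locale named in the changelog; the ".UTF-8" variant
    # is an assumption about how a system might expose it.
    for name in ("C", "hne_IN", "hne_IN.UTF-8"):
        print(name, locale_available(name))
```

The probe restores the previous setting, so it can be called without changing the behavior of the rest of the program.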

Thanks Pravin and Ulrich for getting it upstream.

I am sure there is a lot more to come in the near future.

- Cheers!

Pics from

We didn't have a great camera or great skills with us, but we clicked every moment that looked interesting and worth saving. Here I have uploaded a few pics taken with either Pravin's digicam or my cellphone cam. Look out for the ones tagged 'fossin2008'. Apart from those, there are a few taken while wandering around the city of Bengaluru. There are a few more, but they would fit better in my orkut profile. Cheers :)

Wrapping up at

With the start of the 5th day starts the end of a journey: a journey of talks, presentations, discussions, BoFs and of course the workouts, accompanied by a nice not-too-hot-not-too-cold rainy climate in Bangalore and overshadowed by sad concern about the terrible happenings back in Mumbai.

About the event, my personal highlights have been the talks, workouts and BoFs around Indic computing. It is interesting how language computing forms a significant part of almost any FOSS event in India. The collation workout, the Indic BoF, and the talks on text-to-speech, speech recognition and machine translation all went fine, along with my own talk about language i18n support, given together with Pravin on the very first day. On another note, the Nokia stall showcased maemo and the N810 very well, with the talks in line with the applications it runs.

The workout on collation helped update the status of all the Indic sorting and proceed with the remaining ones like Malayalam. Bengali still remains uncertain. It …

Workout on Internationalization

They say, "i18n is not a feature, it's an architecture". It's not about downloading and configuring stuff that's available somewhere in pieces, it's about creating something that was not there at all.

Pravin and I have recently been working on getting a few of the yet-to-be (digitally) visible languages to meet minimum technical support/usability criteria. Against this background, both of us came up with the idea of having a talk or demo on this at 2008. But looking at the amount of information and practical work involved, a workout looked like a sounder option. So we proposed a workout titled "Creating Language Support Architecture (i18n) For A New Language On Desktop", which is right now in the first shortlist.
The plan is to start with the essential theory and some demo of the work, followed by some real work with the help of participants. The proposed abstract is as follows:

Aim is to guide developers from languages that still n…

Proposing Minimum Criteria for I18N Support

I think the term 'Language Support' has been used in a rather vague sense so far (please correct me if I am wrong). There has also been a difference between a language being supported technically and a language being supported with full localization. Thus there is a need to define the terms with more clarity. I think it would be good to have two separate categories of language support, namely i18n and l10n.

Recently I have tried to formulate a set of minimum criteria for a language to have "I18N Support". The same is documented here:

On the same lines we may even have similar criteria for l10n support.

Although the above has been defined from a Fedora perspective, I hope it would be common to most other distributions. Feedback is welcome :-)

Kaarkkodakan fix for Pango

The Pango bug #441654 about mprefixups for Malayalam, better known as the "Kaarkkodakan" issue, has now been open for a year and has caused a lot of pain among Malayalam users. It has also seen a lot of patches like this so far. But none of them either solved the problem completely or fit Pango's coding style.

Now, finally, I have come up with a patch that looks fairly generic and does not affect the coding style much. The testing done so far has also gone well: all the test cases reported so far have been fixed without anything else being affected. Manilal has also done his bit by doing the testing himself and posting the screenshots. Now, if everything goes fine and Behdad's critical eye doesn't spot any problem :-), I hope to see this one committed for the next Pango release soon.

Reply to Anivar

It seems there is some problem with comment posting on Anivar's blog. So I decided to post my reply as a separate blog post. Here it goes:

A few factual corrections and comments:
1. About the pango bug 357790 and the patch on it:
The patch on this bug is merely a cleaned-up version of the patch on bug 121672, which was originally created by LingNing. It was also based on the inputs given by Ani about the grammar of 0d30 and 0d31, which was later resolved (to 0d30 only) through discussions with smc.
The point is not to transfer the responsibility, but to acknowledge that pango genuinely has a problem in that it does not behave the way Uniscribe does. Another problem in this case is that Uniscribe's behavior has changed from its version in XP to Vista, and we are yet to fix this bug completely. Anyway, my patch was reverted one year back (see Comment #32 on bug 357790). Ever since, I have urged that we concentrate on the original issue, and I continue to do so. And Lohit was agreed to be fixe…

Lohit, Fedora and Community

First: _Some updates on Lohit Malayalam fonts_
Recently, there has been a huge agitation in the Malayalam community about the bugs in the f9 final version of the Lohit fonts. You can get a glimpse of it here. Most of these were either last-minute hiccups or had not been reported at all until then. But whatever the reason, the final product could not be left buggy. So, within a short span of time, all these bugs (#444559, #444561, #444563) were fixed in Lohit and tagged for the f9 final. So the version in Fedora, which is also the latest upstream, lohit-2.2.1, is free of all these bugs. Malayalam users will be able to see the fixes in the following screenshot.
Second: _Some comments on the events going around_
After working for so many years on so many languages for so many different tools and applications, sometimes for some organizations, sometimes just out of passion, I (or should I say we, the language computing guys) have developed this immense love for all the languages of India. They are all rich in their heritage,…

Why are translations frozen before development?

Some time back somebody told me how illogical Fedora's schedule is: "Why do you have the translation deadline before the final development freeze? There is no point in freezing the translations while development is still going on. Why don't you do it GNOME's way? They don't freeze their translations before development!"

Having received this unusual query all of a sudden, and being far from internet connectivity to check the facts, I had no clear answer at the time. But the logic had to be given.

To go by the facts, the example about GNOME was wrong! Check GNOME's schedule: they have a string freeze (the term they use for translations) well before the code (development) freeze.

Having worked on both the Fedora and GNOME l10n projects, I had an idea of the logic behind this, but I think it had blurred with time. The logic is:

Development (or coding) and translations are two different tasks. In all the projects that need translations, have a set of language g…

Unicode 5.1 release and Indic changes

The Unicode 5.1 release was announced earlier this month, on 4th April. Here I have put a diff of the Unicode 5.1 character database against that of Unicode 5.0. My buddy Parag also did a nice job of summarizing the Indic-specific changes, which I am trying to restate here.
So, here go the updates on Indian scripts UCD:

A. New Indic Scripts Added to Unicode:


Lepcha is a language spoken by the Lepcha people in Sikkim in India, and in parts of Nepal and Bhutan. The Lepcha script (also known as "róng") is a syllabic script which has a lot of special marks and requires ligatures. Its genealogy is unclear. Early Lepcha manuscripts were written vertically, a sign of Chinese influence. Lepcha is considered to be one of the aboriginal languages of the area in which it is spoken. The total number of speakers is close to 50,000. Unicode range => U+1C00 to U+1C4F. Chart URL =>


The Ol Chiki script, also known as Ol Ce…
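The Lepcha range quoted above (U+1C00 to U+1C4F) can be inspected with Python's standard `unicodedata` module. This is only a minimal sketch: the helper name `assigned_names` is made up for illustration, and the result depends on the Unicode version bundled with the Python interpreter in use, which must be 5.1 or later for Lepcha to appear.

```python
import unicodedata

# Block boundaries as quoted in the post.
LEPCHA_START, LEPCHA_END = 0x1C00, 0x1C4F


def assigned_names(start, end):
    """Map each assigned code point in [start, end] to its character name.

    Hypothetical helper; unassigned code points are skipped.
    """
    names = {}
    for cp in range(start, end + 1):
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            names[cp] = name
    return names


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for cp, name in assigned_names(LEPCHA_START, LEPCHA_END).items():
        print(f"U+{cp:04X}  {name}")
```

Not every code point in a block is assigned, which is why the helper filters on names instead of assuming the whole range is in use.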

Do not buy gifts from Big Bazar

..on Sundays.

This is a bit off topic, but just to warn others who might be interested.

If you are out weekend shopping and planning to buy some gift items, and don't want to give the gifts without proper wrapping, check whether the shopping center or mall where you are shopping provides a gift wrap service.

In a peculiar event this Sunday, the shopping assistant at Big Bazar (Vashi) informed us that the articles would be gift wrapped at their Customer Service counter free of cost. But after clearing the cash counter, when we reached the C.S. counter, we were denied the service on the grounds that it is not provided on Sundays. Huh? Had there been a huge crowd, it could still be understood. But with their decaying reputation, this Sunday wasn't very busy at BB. The man at the counter was so reluctant to provide the service that he actually offered to let us take the goods back, sending us back to the floor we got them from and returning the money, but not to…

Go round.. but which way?

It all started with Kushal's query about Python's strange behavior regarding the % operator. But my own digging went on so long that I was tempted to write an entire post rather than just a reply. Kushal, you might find the answer to your query somewhere near the end.

Integer division is far more interesting than one might imagine, especially when it comes to negative numbers. Even more interesting are the topics of remainder, modulo and truncation. In general, given two integers N and D, you would simply divide the absolute value |N| by |D|; if only one of them carries a negative sign you would mark the quotient negative, otherwise positive, and the remainder would carry the sign of N. Now take an example,
-2/3. Your quotient would be 0 and the remainder would be -2. Now try the same thing in Python. You will get
-2/3 = -1 and -2%3 = 1.
Now let's rethink both the manual and Python's results.
Your manual quotient is zero because the absolute value of N is less than that of D. But thinking in te…
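The two conventions described above can be put side by side in a short sketch. The function names are made up for illustration. Note that the post's `-2/3 = -1` reflects Python at the time of writing; in Python 3 the `/` operator on integers is true division, so the flooring quotient is written `//`.

```python
import math


def truncating_divmod(n, d):
    """The 'manual' rule: divide absolute values, then fix the signs.

    The quotient is rounded toward zero and the remainder takes the sign of n.
    """
    q = abs(n) // abs(d)
    if (n < 0) != (d < 0):
        q = -q
    r = n - q * d
    return q, r


def flooring_divmod(n, d):
    """Python's rule: the quotient is rounded toward negative infinity."""
    return divmod(n, d)  # same as (n // d, n % d)


if __name__ == "__main__":
    n, d = -2, 3
    print("truncating:", truncating_divmod(n, d))  # (0, -2)
    print("flooring:  ", flooring_divmod(n, d))    # (-1, 1)
    # math.fmod follows the truncating convention and returns a float.
    print("math.fmod: ", math.fmod(n, d))          # -2.0
```

Both conventions satisfy N = quotient * D + remainder; they differ only in which way the quotient is rounded, and that in turn decides the sign of the remainder.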

I am freed..

This is the second day in and I can say it has been worth being here so far. Most of the people were familiar faces, but it's always good to meet old friends again and again. Additionally, the lower density of people here is a good thing in that you get to interact more with the people you want to, something that is not possible when there is a big crowd around.

The first day was all spent attending the talks. I spent a good time with KK, talking passionately about free software and philosophy. I see the influence of Nagarjuna in him, but he has his own way of saying things. He is a lot more aggressive while Nagarjuna is a lot more humble, something he admits himself. Anyway, there were conclaves going on in parallel, but so far, i.e. up to the last session of day two, I have not been to them.

Today was very special. Gora made sure that I got introduced to Andreas Vox, the Scribus developer. It was really nice to talk to him about various tech-savvy issues related to page layout, text layout and a lot more. I did no…

The Indic Mashup

This weekend was devoted to the first Indic Mashup workshop. The idea, initiated by Karunakar, finally took shape inside the Red Hat premises at Pune.

The participants were expected to come from various language communities. So most of Red Hat's Localization team turned up on a Sunday morning. It would have been great if more linguists and i18n contributors from around Pune and Mumbai had participated. But apart from Karunakar and the Localization team, the only linguist present was Ravi Pandey, who is a font designer and a Marathi and Sanskrit expert. Still, the crowd of 13 and the issues were good enough to discuss and work on for a 10-to-5 schedule.

A lot got discussed, from keyboard layouts to collation tables. I thought related bug reports could also have been filed at the appropriate places, but that might need more focussed workshops in the near future. Now we are clear about what issues exist and what can be done about them. I think this is a very good achievement for now.

So far, mo…

Samyak is in..

The long awaited Samyak fonts are finally in Fedora.

Being one of the major initial projects in my career, Samyak will always remain close to my heart. Thanks to Pravin for packaging the fonts very well in accordance with the font SIG's guidelines, and thanks to Parag for reviewing them. Thanks to Sandeep Shedmake as well for driving this in. His contributions to the licensing/copyright text correction and a few other bug fixes really made it happen. I had planned to put them in Fedora's last two releases as well, but something else always pushed it down the priority list.

Somebody must have said, 'never forget to thank yourself'. So here I am, patting my own shoulder for being the initiator of the design of these fonts. One more reason to love it: the name 'Samyak', given by me, still remains a source of inspiration and attachment.

Finally how can I forget Dr. Nagarjuna G., who has been the initiator and the guiding force behind the entire project. A special thanks to h…

I, the chief chef

It won't be easy for many people to believe that I can cook, but last night I tried this adventure once again.

The story started with some heavy snacks in the evening that left Sandy and me very full. So neither of us was interested in a proper dinner. I planned to make something like an omelet to get through the night and avoid going out to a regular hotel. Later it turned out that even Sreedhar Anna was not willing to go out, and Pyadu had also come to our place. So the plan for my personal omelet got canceled and Anna suggested rice and egg curry. Since it was Tuesday, Pyadu could not have egg because of his own rule. So the rice could not be made plain; it had to be something that could stand alone. So Khichadi/Masala Rice was the only option.

Since Anna cannot even make tea (he can at least boil water, which another friend of mine cannot), it had to be me who took charge of the kitchen. It wasn't my first attempt at Khichadi. I had learned a bit about Indian style cooking by …

Weekend networking

Finally I got a dedicated-line internet connection installed back home after what felt like three extremely tedious and very long weeks of applications, call center talks, complaints and repairs. I had discontinued the same service (You Telecom) some time back because of the same problems. But when I felt the need for it on weekend trips home, there was still no other option available, even after a year. I really wanted to give an Airtel connection a try this time. But anyway, the 'broadband' connection is up and giving a 15kbps download speed (out of an expected 192kbps).

The cable modem I got with it has two ports, one for an ethernet cable and the other for USB. When I asked their engineer whether I could use these two lines to connect two machines at a time, he said no and said it would need a router. That didn't satisfy my curiosity about the two lines, so I gave it a try. My Fedora 8 machine picked up an IP which strangely does not belong to the modem's network. But after setting up the pppoe connecti…

Cursor size and Telugu

Assume you are using a GNU/Linux box with input methods (mostly scim) available to input many of the Indic scripts, and have the Lohit fonts installed. Open gedit and start typing anything. Now keep changing the keyboard layout to write something in each language, say Marathi, Gujarati, Gurmukhi, Tamil and so on. Don't worry if you don't know the languages, just type garbage. You will notice that most of the scripts supported by the Lohit fonts are more or less scaled to each other.

Now try typing Telugu. Is the size still scaled? Yes, the glyphs look matched in size. But did you notice the height of that cursor? This is an issue with Telugu: even though the font on average looks properly scaled, the cursor size is not, and thus the spacing between two lines is also unexpectedly large. Of course you can reduce the size of the cursor by reducing the font size, but it still remains out of proportion to the glyph sizes and thus gives ugly line spacing.

It appears that the cursor size is determine…

Rendering Recommendations draft

So finally, I am giving out this long awaited draft:

It addresses some of the issues related to OpenType, Unicode and fonts. Many of the issues discussed there have been a source of conflict, especially for ml_IN. Thus there was a pressing need to provide a detailed analysis like this. I hope the illustrations in it provide some common guidelines. There is certainly scope for improvement. I would like to hear from the various communities if they want some of the issues left out so far to be addressed as well.
The draft is open for discussion and feedback.