Tuesday, April 29, 2008

Why are translations frozen before development?

Some time back somebody told me how illogical Fedora's schedule is: why is the translation deadline placed before the final development freeze? There is no point in freezing the translations while development is still going on. Why not do it GNOME's way? They don't freeze their translations before development!

Having received this unusual query all of a sudden, and being far from internet connectivity to check the facts, I had no clear answer at the time. But the logic had to be explained.

To go by the facts, the example about GNOME was wrong! Check GNOME's schedule: they have their string freeze (the term they use for translations) well before the code (development) freeze.

Having worked on both the Fedora and GNOME l10n projects, I had an idea of the logic behind this, but I think it had blurred with time. The logic is this.

Development (or coding) and translation are two different tasks. Every project that needs translations has a set of language people (generally more than the number of languages supported) who do the translations, and a set of developers who do the coding and packaging of the product. But to actually make the translations available in the software, the developers need to build (or compile) their packages with the translators' work included. Obviously you do not want translators for all the various languages to have to build the package every time one of them does some work. So, to synchronize with the translations, whenever a developer builds his package, the translations as they stand at that moment are built with it.

Now imagine how difficult it would be for a translator to determine whether his translations actually made it into the final product, since he might do some work even after the developer has built the package for a particular freeze. Given the number of packages and the workload involved in Fedora or similar products, it is impossible to expect all the developers to build their packages at the same moment. So how do the translators get a deadline? Just keep a buffer period between the translation freeze and the development freeze!

So translators know they have a particular deadline. All the work done up to that point should be available in the final product. Any translation work done after the deadline cannot be expected in the current final product, but it will come in the next version.

On the other hand, developers can independently compile and build their packages for the final product and, without worrying about translations done at any time during the buffer period, be assured that whatever was translated before the translation freeze goes into their package.

Otherwise, imagine how chaotic it would be for everyone if a developer has compiled the package, a translator does some work after that and then comes back to the developer asking for a rebuild, and so on. The only alternative would be to freeze everything, all the packages, all the coding and all the translations, all at once! But that kind of coincidence is far from possible, even for products with a limited number of packages, when developers are spread all around the globe and work independently. Thus you can see why even GNOME does it the way Fedora does.
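To make the buffer-period argument concrete, here is a minimal sketch in Python. The two freeze dates and both function names are made up for illustration; they are not taken from any real Fedora or GNOME schedule. It only shows why work finished by the translation freeze lands in every package built during the buffer, while later work depends on when each developer happened to build.

```python
from datetime import date

# Hypothetical dates, for illustration only; not from any real schedule.
TRANSLATION_FREEZE = date(2008, 4, 1)
DEVELOPMENT_FREEZE = date(2008, 4, 15)


def included_in_build(translated_on, built_on):
    """A build picks up whatever translations exist at the moment it runs."""
    return translated_on <= built_on


def guaranteed_in_release(translated_on):
    """Work done by the translation freeze predates every final build,
    because final builds happen inside the buffer period."""
    return translated_on <= TRANSLATION_FREEZE


# Two developers build on different days inside the buffer period.
builds = [date(2008, 4, 3), date(2008, 4, 12)]

early_work = date(2008, 3, 30)  # before the translation freeze
late_work = date(2008, 4, 5)    # inside the buffer period

print(all(included_in_build(early_work, b) for b in builds))  # every build has it
print([included_in_build(late_work, b) for b in builds])      # depends on the build
```

The late translation ends up in one package and not in the other, which is exactly the uncertainty the translation deadline removes.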

Oh, Ubuntu has something interesting too! Their LanguagePackTranslation deadline is the same as the Release Candidate. But it is worth noting that their 'NonLanguagePack'Translation freeze precedes the final freeze. Now can you guess why?

Tuesday, April 15, 2008

Unicode 5.1 release and Indic changes

The Unicode 5.1 release was announced earlier this month, on 4th April. Here I have put a diff of the Unicode 5.1 character database against that of Unicode 5.0. My buddy Parag also did a nice job of summarizing the Indic-specific changes, which I am trying to restate now.

So, here are the updates to the UCD for Indian scripts:

A. New Indic Scripts Added to Unicode:

1. LEPCHA:

Lepcha is a language spoken by the Lepcha people in Sikkim in India, and parts of Nepal and Bhutan. The Lepcha script (also known as "róng") is a syllabic script which has a lot of special marks and requires ligatures. Its genealogy is unclear. Early Lepcha manuscripts were written vertically, a sign of Chinese influence. Lepcha is considered to be one of the aboriginal languages of the area in which it is spoken.

The total number of speakers is close to 50,000.

Unicode Range => U+1C00 to U+1C4F

Chart URL => http://www.unicode.org/charts/PDF/U1C00.pdf

2. OL-CHIKI:

The Ol Chiki script, also known as Ol Cemetʼ ("language of writing"), Ol Ciki, Ol (and sometimes as the Santali alphabet), was created in 1925 by Pandit Raghunath Murmu for the Santali language. Santali, the language of the Santals, belongs to the Munda subfamily of Austro-Asiatic and is related to Ho and Mundari. It is spoken by about six million people in India, Bangladesh, Nepal, and Bhutan. Most of its speakers live in India, in the states of Jharkhand, Assam, Bihar, Orissa, Tripura, and West Bengal. It has its own alphabet, Ol Chiki, but literacy is very low, between 10 and 30%.

Unicode Range => U+1C50 to U+1C7F

Chart URL => http://www.unicode.org/charts/PDF/U1C50.pdf

3. SAURASHTRA:

Saurashtra, more correctly, Sauraṣṭri or Sauraṣṭram or Sourashtra, also known as Palkar, Sowrashtra, Saurashtram, is an Indo-Aryan language spoken in parts of the Southern Indian State of Tamil Nadu. The Saurashtra community is referred to by the same name, or sometimes by the Tamil name, Pattunoolkaarar. The Ethnologue puts the number of speakers at 510,000 (1997 IMA), although the actual number could be double this figure or even more.

Unicode Range => U+A880 to U+A8D9

Chart URL => http://www.unicode.org/charts/PDF/UA880.pdf
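As a rough illustration of how these three ranges could be used, the Python sketch below maps a character to one of the new scripts purely by its code point. The table simply copies the ranges quoted above, and the function name `new_script_of` is my own, not something from Unicode or any library.

```python
# Ranges exactly as listed in this post for the three new scripts.
NEW_INDIC_RANGES = {
    "Lepcha": (0x1C00, 0x1C4F),
    "Ol Chiki": (0x1C50, 0x1C7F),
    "Saurashtra": (0xA880, 0xA8D9),
}


def new_script_of(char):
    """Return the new script whose range contains `char`, or None."""
    code_point = ord(char)
    for script, (first, last) in NEW_INDIC_RANGES.items():
        if first <= code_point <= last:
            return script
    return None


for sample in ("\u1c00", "\u1c50", "\ua880", "A"):
    print("U+%04X" % ord(sample), new_script_of(sample))
```

A check like this is only a quick filter, for example to find out whether a text needs one of the new fonts; it says nothing about whether a given code point inside the range is actually assigned.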


B. Updates to Existing Scripts in Unicode:

1. DEVANAGARI (2 New Characters):

0971; SIGN HIGH SPACING DOT
0972; LETTER CANDRA A


2. GURMUKHI (2 New Characters):

0A51; SIGN UDAAT
0A75; SIGN YAKASH


3. ORIYA (3 New Characters):

0B44; VOWEL SIGN VOCALIC RR
0B62; VOWEL SIGN VOCALIC L
0B63; VOWEL SIGN VOCALIC LL

4. TAMIL (1 New Character):

0BD0; OM

5. TELUGU (13 New Characters):

0C3D; SIGN AVAGRAHA
0C58; LETTER TSA
0C59; LETTER DZA
0C62; VOWEL SIGN VOCALIC L
0C63; VOWEL SIGN VOCALIC LL
0C78; FRACTION DIGIT ZERO FOR ODD POWERS OF FOUR
0C79; FRACTION DIGIT ONE FOR ODD POWERS OF FOUR
0C7A; FRACTION DIGIT TWO FOR ODD POWERS OF FOUR
0C7B; FRACTION DIGIT THREE FOR ODD POWERS OF FOUR
0C7C; FRACTION DIGIT ONE FOR EVEN POWERS OF FOUR
0C7D; FRACTION DIGIT TWO FOR EVEN POWERS OF FOUR
0C7E; FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR
0C7F; SIGN TUUMU

6. MALAYALAM (17 New Characters):

0D3D; SIGN AVAGRAHA
0D44; VOWEL SIGN VOCALIC RR
0D62; VOWEL SIGN VOCALIC L
0D63; VOWEL SIGN VOCALIC LL
0D70; NUMBER TEN
0D71; NUMBER ONE HUNDRED
0D72; NUMBER ONE THOUSAND
0D73; FRACTION ONE QUARTER
0D74; FRACTION ONE HALF
0D75; FRACTION THREE QUARTERS
0D79; DATE MARK
0D7A; LETTER CHILLU NN
0D7B; LETTER CHILLU N
0D7C; LETTER CHILLU RR
0D7D; LETTER CHILLU L
0D7E; LETTER CHILLU LL
0D7F; LETTER CHILLU K
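One quick way to see whether a system already knows these characters is to ask Python's `unicodedata` module for their names. This is only a sketch: it assumes the interpreter's bundled Unicode database is at version 5.1 or later, it checks just one sample character per script rather than the whole list, and the helper `missing_characters` is my own name. Note that the full names in the UCD carry the script name as a prefix, which the lists above leave out.

```python
import unicodedata

# One sample per script from the lists above, with the full UCD name.
NEW_CHARACTERS = {
    0x0972: "DEVANAGARI LETTER CANDRA A",
    0x0A51: "GURMUKHI SIGN UDAAT",
    0x0B44: "ORIYA VOWEL SIGN VOCALIC RR",
    0x0BD0: "TAMIL OM",
    0x0C3D: "TELUGU SIGN AVAGRAHA",
    0x0D7A: "MALAYALAM LETTER CHILLU NN",
}


def missing_characters(expected):
    """Return the code points whose name is absent or different here."""
    missing = []
    for code_point, name in sorted(expected.items()):
        # name() returns the supplied default for unassigned code points.
        if unicodedata.name(chr(code_point), None) != name:
            missing.append("U+%04X" % code_point)
    return missing


print("Unicode database version:", unicodedata.unidata_version)
print("Missing:", missing_characters(NEW_CHARACTERS))
```

An empty "Missing" list only means the character database knows the names; fonts, rendering and keymaps still have to catch up separately.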

All the New Unicode Charts can now be found here:

http://www.unicode.org/charts/


The changes to Tamil and Malayalam involve a lot more than just additional characters. On the one side, I think the Tamil community would be happy about Unicode granting Tamil Named Character Sequences to simplify script processing; on the other side, the Malayalam community is not so happy about the Atomic Chillu Characters. Here is their opposition.

I am myself very happy about 0972 (Letter Candra A) being added to Devanagari. This will help in fixing 'Apple' and 'Anaconda' for Marathi. Also, the inclusion of the Ol Chiki script is a very good initiative.

There is actually a lot of work to be done related to all these changes, ranging across fonts, rendering, keymaps, locales, etc. I will have to come up with the details of all that very soon.

Monday, April 14, 2008

Do not buy gifts from Big Bazar

..on Sundays.

This is a bit off topic, but just to warn others who might be interested.

If you are out shopping on a weekend and plan to buy some gift items, and don't want to hand over the gifts without proper wrapping, check whether the shopping center or mall where you are shopping provides a gift-wrap service.

In a peculiar event this Sunday, the shopping assistant at Big Bazar (Vashi) informed us that the articles would be gift wrapped at their Customer Service counter free of cost. But after clearing the cash counter, when we reached the C.S. counter, we were denied the service and told that it is not provided on Sundays. Huh? Had there been a huge crowd, it could still be understood. But with their decaying reputation, this Sunday wasn't very busy at BB. The man at the counter was so reluctant to provide the service that he actually offered to take the goods back, sending us back to the floor we got them from and returning the money, rather than wrap them!!

This isn't the only example of arrogant behavior by BB staff. Every time you visit, you will find someone in trouble. One very common observation is that eight out of ten cash counters are handled by ill-trained people, causing each customer to spend ten-odd minutes resolving issues with the billing machine.

I cannot expect professionalism at these shopping centers to improve because of this blog post (there are already so many other online reviews), but at least some other customers will come to know about such small precautions.