The Indic Mashup

This weekend went to the first Indic Mashup workshop. The idea, initiated by Karunakar, finally took shape at the Red Hat premises in Pune.

The participants were expected to come from various language communities, so most of Red Hat's Localization team turned up on a Sunday morning. It would have been great if more linguists and i18n contributors from around Pune and Mumbai had participated. But apart from Karunakar and the Localization team, the only linguist present was Ravi Pandey, a font designer and Marathi and Sanskrit expert. Still, the crowd of 13 and the issues on the table were enough to discuss and work on for a 10-to-5 schedule.

A lot got discussed, from keyboard layouts to collation tables. I thought the related bug reports could also have been filed at the appropriate places, but that might need more focused workshops in the near future. We are now clear about what the issues are and what can be done about them. I think this is a very good achievement for now.

So far, most of the events I have seen were of the kind where someone presents and others listen. This was certainly different. People actually got into source code and Bugzillas. Most of the Indic-related tools were tested across distributions. Reports were generated for upcoming features like collation. The collation sequences for various languages, with reference to the Unicode-provided collation charts and the proposed modifications/additions to them, are posted here. The only ones missing for now are Bengali, Assamese, Tamil and Oriya. Since the Assamese, Tamil and Oriya collations are already implemented, thorough testing could have been done, but the absence of language experts limited this task.

Among the other issues, the important ones that have given us a few to-dos are those related to keyboard layouts. A few characters like Om, ZWJ and ZWNJ are either absent or not placed uniformly across the various languages. We tried to find vacant positions on the current layouts to accommodate these additions for one-to-one mapping (xkb). Other options, like key sequences, are kept for scim. A lot more got discussed that would be worth filing bugs for.
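For readers unfamiliar with the characters mentioned above, here is a small Python sketch that looks them up in the Unicode character database. It is only an illustration, not something produced at the workshop. The choice of the Devanagari Om (U+0950) is my assumption; other Indic scripts encode their own Om signs. The helper name `describe` is made up for this example.

```python
import unicodedata

# Characters named in the keyboard-layout discussion.
# Assumption: "Om" is taken here as the Devanagari sign; other scripts differ.
CHARS = {
    "Om (Devanagari)": "\u0950",
    "ZWJ": "\u200D",
    "ZWNJ": "\u200C",
}


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for label, ch in CHARS.items():
    print(f"{label}: {describe(ch)}")

# ZWNJ is an invisible character of its own: inserting it after the virama
# adds one code point to the sequence ka + virama + ssa.
plain = "\u0915\u094D\u0937"
with_zwnj = "\u0915\u094D\u200C\u0937"
print(len(plain), len(with_zwnj))
```

Because ZWJ and ZWNJ are invisible yet change how a conjunct is rendered, they need a predictable key position just as much as the visible characters do.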

As for feedback, the most obvious comment was that it was different. It was a real workshop and not just a speech and presentation. Being the first of its kind, it was more generalized. Requests are now coming in for more focused ones in future: focusing on a particular field like input methods or fonts, on a particular language, and so on.

Among the few negative comments, one was about working on the weekend. Not many like the idea of spending a lovely Sunday on such intense work. It would be great if it could be managed during weekdays, not as big as this but preferably as shorter sessions of a couple of hours.

One more comment I received was that the Indic Mashup sounded more like a Marathi mashup. This was expected to some extent. I would rather say it concentrated more on Devanagari issues. But this was inevitable. The only linguist present was a Marathi and Sanskrit expert. There were two Maharashtrian i18n engineers (including myself), one Marathi language maintainer, one Hindi expert, and others who understood Devanagari better. For the other languages, only one developer was present per language. In my personal opinion, there could have been more involvement from the other language developers; more issues and requirements could have been raised. Nevertheless, even though most of the references were made using Devanagari, most issues were common to the rest of the scripts and languages.

As for the venue and facilities, WiFi worked very well. IRC was used mainly for sharing URLs and satisfying the curiosity of a few remote participants. Except for a delayed lunch, the food and everything else worked out well too.

To sum up, I can say this event proved to be productive, and I am looking forward to a few more focused and comparatively shorter sequels.
