
Unicode Agenda for Bangla - PowerPoint PPT Presentation



  1. Unicode Agenda for Bangla. Bidyut Baran Chaudhuri, Society for Natural Language Technology Research & Indian Statistical Institute, Kolkata, India. L2/09-294

  2. Indian Script and Bangla
     - Most Indian scripts are derived from the ancient Brahmi script.
     - They belong to the alpha-syllabary/abugida class of scripts.
     - The Indian writing system started to evolve 3000 years ago.
     - It was perhaps inspired by ancient Aramaic, but shows the exceptional originality of Indian philologists.
     - The alphabet matrix is arranged according to manner of articulation, like unvoiced (unaspirated, aspirated) and voiced (unaspirated, aspirated), versus place of articulation in the mouth, like velar, post-alveolar, alveolar, dental and bilabial.
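The matrix arrangement described on this slide can be sketched in code. The following is a minimal illustration, assuming the current Unicode Bengali block, where the 25 stop and nasal consonants occupy U+0995 to U+09AE in the traditional order (U+09A9 is unassigned). The row labels reuse the slide's wording; the fifth "nasal" column and the names `PLACES`, `MANNERS` and `varga_grid` are assumptions introduced here, not part of the presentation.

```python
# Labels follow the slide's wording. The fifth (nasal) column is an
# assumption added here, since the slide names only four manner classes.
PLACES = ["velar", "post-alveolar", "alveolar", "dental", "bilabial"]
MANNERS = ["unvoiced unaspirated", "unvoiced aspirated",
           "voiced unaspirated", "voiced aspirated", "nasal"]


def varga_grid():
    """Return a 5x5 grid: rows are places, columns are manners."""
    # U+0995 (KA) through U+09AE (MA), skipping the unassigned U+09A9.
    code_points = [cp for cp in range(0x0995, 0x09AF) if cp != 0x09A9]
    return [[chr(cp) for cp in code_points[r * 5:(r + 1) * 5]]
            for r in range(5)]


if __name__ == "__main__":
    for place, row in zip(PLACES, varga_grid()):
        print(f"{place:>13}: " + " ".join(row))
```

Each row of the printed grid is one place of articulation, and reading across a row walks through the manner classes in `MANNERS`.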

  3. Brahmi Alpha Numerals

  4. From Brahmi to Bangla
     - Full-blown Brahmi script was active during the days of Christ, but its initial form started earlier.
     - It branched into north and south Indian groups.
     - By 800 AD a north variety named Kutila script evolved through the Kushana-Gupta group of scripts.
     - Kutila means complicated (the upper-caste people did not like the lower-caste people to learn writing and reading).
     - By 1000 AD proto-Bangla script evolved.
     - Proto-modern Bangla script evolved by 1500 AD.
     - By the 18th century modern Bangla script was ready. There were 34 consonants and 10 vowels.

  5. Bangla Script Evolution (contd.)

  6. Stabilization of Bangla Script
     - Printing in Bangla started in the late eighteenth century (Halhed, 1778).
     - Full stop and double full stop were the only punctuation marks noted in the initial script.
     - Other punctuation marks were borrowed from English.
     - Vidyasagar introduced three more characters in the mid nineteenth century by placing a dot below three existing characters.
     - Some characters like li and double-li became obsolete.
     - This stabilized script system remained in use for 150 years.

  7. The Alphabet Currently Used for Bangla

  8. Further Modification of Bangla Script
     - After 1900 AD, spelling correction and script correction debates gained momentum.
     - Several correction suggestions were accepted through the initiative of Kolkata University.
     - A new decimal monetary system, weighing standards etc. were introduced around the 1960s.
     - Some of the older signs and symbols disappeared.
     - Simplification in the representation of conjunct characters has been proposed for the past twenty years. There is still debate on which should be simplified.

  9. Development of Bangla ISCII and Unicode
     - ISCII for Indian languages was developed in the 1980s through the initiatives of the Dept. of Information Technology, Govt. of India.
     - Bangla script too got an ISCII version.
     - There have always been some problems in using Bangla ISCII for preparing electronic texts.
     - The Bangla Unicode code points appear to be based mainly on Bangla ISCII.
     - So Unicode has problems too, though some of them are already solved.

  10. Unicode 5.1 for Bangla

  11. Unicode 5.2 for Bangla

  12. Problems Remaining
     - Rendition of hasanta and of the two types of conjunct r + ja is clumsy with the ZWJ and ZWNJ code points.
     - No code point exists for Khiya (or Jukta-kha), nor for the Om-kar character.
     - Unnecessary existence of a code point for the right side of ou-kar.
     - No code point exists for Urdha-comma.
     - Existence of many code points for old and obsolete symbols in the main code table.
     - Unreasonable proposal of introducing extra codes for transparent and non-transparent forms of vowel modifiers.
     - Code points for various signs need discussion.
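To make the first three problems concrete, here is a small sketch of how the relevant shapes are spelled as code point sequences today. It assumes the conventions described in the Bengali section of the Unicode Standard (reph versus ya-phalaa distinguished by ZWJ, a visible hasant forced by ZWNJ) and assumes that "r + ja" on the slide refers to ra followed by ya (Ja-fala) and that Khiya is the ka + ssa conjunct. The labels in `SEQUENCES` and the helper `describe` are names chosen for this illustration only.

```python
import unicodedata

KA, YA, RA, SSA = "\u0995", "\u09AF", "\u09B0", "\u09B7"
HASANT = "\u09CD"                  # BENGALI SIGN VIRAMA
ZWJ, ZWNJ = "\u200D", "\u200C"

# Assumed encodings, following the Unicode Standard's Bengali conventions.
SEQUENCES = {
    "reph on ya": RA + HASANT + YA,
    "ra with ya-phalaa": RA + ZWJ + HASANT + YA,
    "ra with visible hasant": RA + HASANT + ZWNJ + YA,
    "khiya (ka + ssa conjunct)": KA + HASANT + SSA,
    "right side of ou-kar": "\u09D7",
}


def describe(text):
    """List each code point of text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    for label, seq in SEQUENCES.items():
        print(label)
        for line in describe(seq):
            print("   ", line)
```

The two r + ya shapes differ only by an invisible joiner, and Khiya has no code point of its own, which is the clumsiness the slide points to.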

  13. Our Proposals
     1. Introduce a code point for [glyph] in the table, after [glyph], i.e. at 09BA, and for [glyph] at 09D0.
     2. Introduce a new code point for Ja-fala ( ), say after ( ), i.e. at 09C9, and use this to express all kinds of Ja-fala. The existing role of hasant and ZWNJ will continue. There will be no need for the ZWJ code point in this scheme.
     Contd..
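As a quick way to inspect the positions named in these proposals, the sketch below looks up 09BA, 09D0 and 09C9 in the Unicode character database bundled with Python. The result depends on the Unicode version shipped with the interpreter, so it says nothing about the status of the proposal itself. `PROPOSED` and `status` are names chosen for this illustration.

```python
import unicodedata

# Code points named on the slide as proposed slots.
PROPOSED = [0x09BA, 0x09D0, 0x09C9]


def status(cp):
    """Return the character name, or '<unassigned>' if the database has none."""
    return unicodedata.name(chr(cp), "<unassigned>")


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp in PROPOSED:
        print(f"U+{cp:04X}: {status(cp)}")
    # For comparison, the existing hasant that the proposal keeps using.
    print(f"U+09CD: {status(0x09CD)}")
```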
