A Sanskrit Compound Processor


  1. A Sanskrit Compound Processor. Amba Kulkarni, Anil Kumar. Department of Sanskrit Studies, University of Hyderabad, Hyderabad. June 21, 2012.

  2. Sanskrit is very rich in compound formation. Examples: vedavedāṅgatatvajña; pravaramukuṭamaṇimaricīmañjarīcayacarcitacaraṇayugula; jalādivyāpakapṛthivītvābhāvapratiyogipṛthivītvavatī.

  3. Sanskrit Compounds. A compound is a single word (ekapadam). It has a single case suffix (ekavibhaktikam), with the exception of aluk compounds such as yudhiṣṭhiraḥ, where the case suffix of the first component is not deleted. It has a single accent (ekasvaraḥ). The order of components in a compound is fixed. No word can be inserted between the components of a compound. Compound formation is binary, with the exception of dvandva and bahupada bahuvrīhi. Euphonic change (sandhi) is mandatory in compound formation. Constituents of a compound may require a gender or number different from their default gender and number, e.g. pāṇipādam, pācikābhāryaḥ, etc.

  4. Syntactic classification. supāṃ supā tiṅā nāmnā dhātunātha tiṅāṃ tiṅā | subanteti vijñeyaḥ samāsaḥ ṣaḍvidhoḥ budheḥ || The six kinds are: Subanta (noun) + Subanta (noun) (rājapuruṣaḥ); Subanta (noun) + Tiṅanta (verb) (paryyabhūṣayat); Subanta (noun) + nāma (nominal base) (kumbhakāraḥ); Subanta (noun) + Dhātu (verbal root) (kaṭaprū); Tiṅanta (verb) + Subanta (noun) (kṛntavicakṣaṇā); Tiṅanta (verb) + Tiṅanta (verb) (khādata-modata).

  5. Semantic classification. Sanskrit compounds are classified semantically into four major types: Tatpuruṣaḥ (endocentric, with the head typically to the right); Bahuvrīhiḥ (exocentric); Dvandvaḥ (copulative); Avyayībhāvaḥ (endocentric, with the head typically to the left, and behaving as an indeclinable).

  6. Compound processor: (1) Segmentation (Samāsapadacchedaḥ); (2) Constituency Parsing (Sāmarthyanirdhāraṇam); (3) Type Identification (Samāsabhedanirdhāraṇam); (4) Paraphrase generation (Vigrahavākyanirmāṇam).

  10. Segmentation (Samāsapadacchedaḥ): split a compound into its constituents. tapassvādhyāyaniratam is segmented as tapas-svādhyāya-niratam. Constituency Parsing (Sāmarthyanirdhāraṇam): this module parses the segmented compound syntactically by pairing up the constituents in a certain order, two at a time. tapas-svādhyāya-niratam is parsed as <<tapas-svādhyāya>-niratam>. Type Identification (Samāsabhedanirdhāraṇam): decide the type of the compound at each node of composition, giving <<tapas-svādhyāya>Di-niratam>T7. Paraphrase generation (Vigrahavākyanirmāṇam): tapaḥ ca svādhyāyaḥ ca = tapassvādhyāyaḥ (= tat1), gloss: penance and self-study; tasmin nirataḥ = tapassvādhyāyanirataḥ, gloss: one who is constantly engaged in penance and self-study.
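
The bracketed notation used for the parsed and typed compound can be read as a binary tree. The following is a minimal sketch of that reading, not the authors' implementation: the `Node` class and `render` function are hypothetical names introduced here for illustration, and the labels `Di` and `T7` are copied from the slide without asserting anything about the full tag set.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Node:
    """One binary composition step (hypothetical structure, for illustration)."""
    left: Union["Node", str]
    right: Union["Node", str]
    tag: str = ""  # compound type label; empty until type identification runs


def render(tree: Union[Node, str]) -> str:
    """Render a tree in the slide's bracket notation: <left-right>tag."""
    if isinstance(tree, str):
        return tree
    return f"<{render(tree.left)}-{render(tree.right)}>{tree.tag}"


# Output of constituency parsing: structure only, no type labels yet.
parsed = Node(Node("tapas", "svādhyāya"), "niratam")
# Output of type identification: the same tree with a label at each node.
typed = Node(Node("tapas", "svādhyāya", "Di"), "niratam", "T7")

print(render(parsed))  # <<tapas-svādhyāya>-niratam>
print(render(typed))   # <<tapas-svādhyāya>Di-niratam>T7
```

Keeping the type label on the node rather than on the leaves reflects the slide's statement that the type is decided at each node of composition.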

  18. Compound Segmenter. The task of a segmenter is to split a given sequence of phonemes into a sequence of morphologically valid segments. Compound formation involves mandatory sandhi. Each sandhi rule is a triple (x, y, z), where y is the last letter of the first primitive, z is the first letter of the second primitive, and x is the letter sequence resulting from their euphonic combination. For analysis, we reverse these sandhi rules and produce y + z for each x. Only the sequences that are morphologically valid are selected. We follow the GENerate-CONstrain-EVALuate paradigm attributed to Optimality Theory for segmentation. Optimality Theory itself basically addresses the issue of generation.
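
As an illustration of reversing (x, y, z) triples, here is a small sketch under stated assumptions. The four vowel rules are a hand-picked sample, not the rule set used by the authors; `invert` and `split_at` are hypothetical helper names; and the code treats each Unicode character of the IAST string as one letter, which is a simplification (digraphs such as `dh` are not handled specially).

```python
import unicodedata
from collections import defaultdict


def nfc(text):
    """Normalise to precomposed characters so that 'ā' is a single code point."""
    return unicodedata.normalize("NFC", text)


# Hypothetical sample of sandhi rules as (x, y, z): y + z surfaces as x.
SANDHI_RULES = [
    ("ā", "a", "a"),
    ("ā", "ā", "a"),
    ("e", "a", "i"),
    ("o", "a", "u"),
]


def invert(rules):
    """Index the rules by surface form x, so analysis can look up every (y, z)."""
    inverse = defaultdict(list)
    for x, y, z in rules:
        inverse[nfc(x)].append((nfc(y), nfc(z)))
    return inverse


def split_at(word, pos, inverse):
    """Candidate (left, right) splits of `word` at character position `pos`.

    Returns the plain cut (no sound change) followed by one split for each
    reversed rule whose x matches the text starting at `pos`.
    """
    word = nfc(word)
    candidates = []
    if 0 < pos < len(word):
        candidates.append((word[:pos], word[pos:]))
    for x, pairs in inverse.items():
        if word.startswith(x, pos):
            rest = word[pos + len(x):]
            for y, z in pairs:
                candidates.append((word[:pos] + y, z + rest))
    return candidates


INVERSE = invert(SANDHI_RULES)
print(split_at("vedāṅga", 3, INVERSE))
# [('ved', 'āṅga'), ('veda', 'aṅga'), ('vedā', 'aṅga')]
```

Only `veda` + `aṅga` would survive a morphological check here; the other two candidates show why the reversed rules over-generate and need to be constrained.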

  21. Flow-chart representation of Compound Segmentation. The basic outline of the algorithm is: (1) Recursively break a word at every possible position, applying a sandhi rule, and generate all possible candidates for the input (17 segments). (2) Pass the constituents of all the candidates through the morphological analyser.
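
The two steps above can be sketched as a generate-then-filter loop. This is a toy under explicit assumptions, not the system described in the slides: `REVERSED_RULES` is a two-entry sample, the `LEXICON` set stands in for a real morphological analyser, and all function names are hypothetical. Exhaustive generation is exponential in the word length, so the sketch is only practical for short inputs.

```python
import unicodedata


def nfc(text):
    """Normalise to precomposed characters so that 'ā' is a single code point."""
    return unicodedata.normalize("NFC", text)


# Hypothetical reversed sandhi rules: surface x -> possible (y, z) pairs.
REVERSED_RULES = {
    nfc("ā"): [(nfc("a"), nfc("a")), (nfc("ā"), nfc("a"))],
    nfc("e"): [(nfc("a"), nfc("i"))],
}
# Toy stand-in for the morphological analyser: a set of known stems.
LEXICON = {nfc(w) for w in ("veda", "aṅga", "vedāṅga")}


def generate(word):
    """Step 1: yield every candidate segmentation of `word` as a list."""
    yield [word]  # the unsplit word is itself a candidate
    for pos in range(1, len(word)):
        splits = [(word[:pos], word[pos:])]  # plain cut, no sound change
        for x, pairs in REVERSED_RULES.items():
            if word.startswith(x, pos):
                rest = word[pos + len(x):]
                splits += [(word[:pos] + y, z + rest) for y, z in pairs]
        for left, right in splits:
            if len(right) < len(word):  # remainder must shrink: guarantees termination
                for tail in generate(right):
                    yield [left] + tail


def is_valid(segment):
    """Step 2 for one constituent: here a lexicon lookup, not a real analyser."""
    return segment in LEXICON


def segment(word):
    """Keep the multi-part candidates whose constituents are all valid."""
    return [c for c in generate(nfc(word))
            if len(c) > 1 and all(is_valid(s) for s in c)]


for candidate in segment("vedavedāṅga"):
    print("-".join(candidate))
# veda-vedāṅga
# veda-veda-aṅga
```

Both surviving candidates are morphologically valid, which is why a later ranking step is still needed to choose between them.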
