Simple Morpheme Labelling in Unsupervised Morpheme Analysis
Delphine Bernhard
Ubiquitous Knowledge Processing Lab, Darmstadt, Germany
Morpho Challenge 2007 – September 19, 2007
Main features of the method
Algorithm already ...
◮ prefix: dis arm ed
◮ suffix: sulk ing
◮ stem: grow
◮ linking element: oil – painting s
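The four morpheme categories can be pictured as labels attached to the segments of a word. The following is a minimal sketch of such a representation, not the author's data format: the class and function names are hypothetical, and the category given to each segment that the slide does not name explicitly (for example "painting" as a stem) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    PREFIX = "prefix"
    STEM = "stem"
    SUFFIX = "suffix"
    LINKING = "linking element"


@dataclass(frozen=True)
class Segment:
    form: str
    category: Category


# Words and segment boundaries come from the slide's examples.
# The label of each individual segment is an assumption.
EXAMPLES = {
    "disarmed": [
        Segment("dis", Category.PREFIX),
        Segment("arm", Category.STEM),
        Segment("ed", Category.SUFFIX),
    ],
    "sulking": [
        Segment("sulk", Category.STEM),
        Segment("ing", Category.SUFFIX),
    ],
    "grow": [Segment("grow", Category.STEM)],
    "oil-paintings": [
        Segment("oil", Category.STEM),
        Segment("-", Category.LINKING),
        Segment("painting", Category.STEM),
        Segment("s", Category.SUFFIX),
    ],
}


def surface(segments):
    """Concatenate the segment forms back into the word form."""
    return "".join(segment.form for segment in segments)


def labelled(segments):
    """Render a segmentation as 'form/category' pairs."""
    return " ".join(f"{s.form}/{s.category.value}" for s in segments)


for word, segments in EXAMPLES.items():
    print(word, "=>", labelled(segments))
```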
Input: list of word forms
Step 1: Extraction of prefixes and suffixes → list of prefixes and suffixes
Step 2: Acquisition of stems → list of stems
Step 3: Segmentation of words → potential segmentations for each word
Step 4: Selection of the best segmentation → segmented words, morphemic segments
Step 5 (optional): Application of the segments to a new data set (additional word forms)
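To make Step 3 concrete, here is a toy sketch of how potential segmentations could be enumerated once lists of prefixes, stems and suffixes are available. It is not the procedure used in the talk: the function name is hypothetical, and the pattern "zero or more prefixes, one stem, zero or more suffixes" is a simplifying assumption that ignores compounds and linking elements.

```python
def potential_segmentations(word, prefixes, stems, suffixes):
    """Return every split of `word` of the form prefix* stem suffix*.

    Assumption: a word contains exactly one stem; this toy version
    does not handle compounds or linking elements.
    """
    results = []

    def suffix_chains(rest):
        """All ways to split `rest` into zero or more known suffixes."""
        if not rest:
            return [[]]
        chains = []
        for end in range(1, len(rest) + 1):
            if rest[:end] in suffixes:
                for tail in suffix_chains(rest[end:]):
                    chains.append([rest[:end]] + tail)
        return chains

    def walk(start, prefix_chain):
        # Option 1: place the stem here, then cover the rest with suffixes.
        for end in range(start + 1, len(word) + 1):
            if word[start:end] in stems:
                for tail in suffix_chains(word[end:]):
                    results.append(prefix_chain + [word[start:end]] + tail)
        # Option 2: consume one more prefix and continue.
        for end in range(start + 1, len(word)):
            if word[start:end] in prefixes:
                walk(end, prefix_chain + [word[start:end]])

    walk(0, [])
    return results


# Hypothetical inventories, for illustration only.
candidates = potential_segmentations(
    "disarmed",
    prefixes={"dis"},
    stems={"arm", "disarm"},
    suffixes={"ed"},
)
for segmentation in candidates:
    print(" ".join(segmentation))
```

With these inventories the word receives two candidate segmentations, "disarm ed" and "dis arm ed"; choosing between them is the job of Step 4.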
◮ Method 1:
◮ Method 2:
◮ s_i = morphemic segment
◮ f(s_i) = frequency of segment s_i
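The formulas for Method 1 and Method 2 are not reproduced in this text, so the sketch below only illustrates the general idea of ranking candidate segmentations by a cost derived from the segment frequencies f(s_i). The cost used here (negative log of relative frequency, summed over segments) is a stand-in chosen for illustration, and the frequency counts are invented.

```python
import math


def segmentation_cost(segments, freq):
    """Stand-in cost: sum of -log(f(s_i) / total frequency).

    This is an illustrative assumption, not Method 1 or Method 2.
    """
    total = sum(freq.values())
    return sum(-math.log(freq[s] / total) for s in segments)


def best_segmentation(candidates, freq):
    """Pick the candidate segmentation with the lowest cost."""
    return min(candidates, key=lambda segs: segmentation_cost(segs, freq))


# Hypothetical segment frequencies f(s_i).
freq = {"dis": 50, "arm": 20, "ed": 200, "disarm": 2}

candidates = [["disarm", "ed"], ["dis", "arm", "ed"]]
best = best_segmentation(candidates, freq)
print(" ".join(best))
```

Because the rare segment "disarm" is expensive under this cost, the three-segment analysis wins here; a different cost function or different counts could reverse the choice.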
Precision (%): 72.0 76.0 63.2 78.2 61.6 59.6 49.1 73.7
Recall (%): 52.5 25.0 37.7 10.9 60.0 40.4 57.4 14.8
F-measure (%): 60.7 37.6 47.2 19.2 60.8 48.2 52.9 24.6
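The three rows of eight figures above fit together if they are read, column by column, as precision, recall and F-measure. That reading is an assumption, since the column headers are missing; the short check below recomputes the F-measure as the harmonic mean of the first two rows and compares it with the third.

```python
precision = [72.0, 76.0, 63.2, 78.2, 61.6, 59.6, 49.1, 73.7]
recall = [52.5, 25.0, 37.7, 10.9, 60.0, 40.4, 57.4, 14.8]
reported = [60.7, 37.6, 47.2, 19.2, 60.8, 48.2, 52.9, 24.6]


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


recomputed = [f_measure(p, r) for p, r in zip(precision, recall)]

# Largest difference between the recomputed and the reported values.
max_gap = max(abs(f - g) for f, g in zip(recomputed, reported))

for p, r, f, g in zip(precision, recall, recomputed, reported):
    print(f"P={p:5.1f}  R={r:5.1f}  F={f:6.2f}  reported={g:5.1f}")
```

All eight recomputed values agree with the third row to within 0.1, which is what rounding of the inputs to one decimal place would explain.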
27.8 35.6 35.0 27.8 40.2 37.8 27.8 39.0 37.2 26.7 39.8 37.3 26.8 38.1 37.0
31.2 32.7 32.3 38.8 41.8 46.1 39.0 46.8 47.3 39.2 44.2 46.8 39.4 49.1 46.2
◮ allomorphy: different forms for the same morpheme
◮ homography: same form for different morphemes
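The two notions can be illustrated with a small mapping between surface forms and morpheme labels. The English examples and the label names `PLURAL` and `3SG` are illustrative choices, not taken from the slides.

```python
from collections import defaultdict

# (surface form, morpheme label) pairs; illustrative English examples.
ANALYSES = [
    ("s", "PLURAL"),   # cat + s
    ("es", "PLURAL"),  # box + es
    ("s", "3SG"),      # walk + s (third person singular)
]

forms_of = defaultdict(set)   # morpheme label -> surface forms
labels_of = defaultdict(set)  # surface form -> morpheme labels
for form, label in ANALYSES:
    forms_of[label].add(form)
    labels_of[form].add(label)

# Allomorphy: one morpheme realised by several forms.
allomorphs = {label: forms for label, forms in forms_of.items() if len(forms) > 1}

# Homography: one form standing for several morphemes.
homographs = {form: labels for form, labels in labels_of.items() if len(labels) > 1}

print("allomorphy:", allomorphs)
print("homography:", homographs)
```

A labelling scheme that uses the segment string itself as the morpheme label would keep "s" and "es" apart and merge the two uses of "s", which is why both phenomena matter for morpheme analysis.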