
Introduction to Machine Learning NPFL 054



  1. Introduction to Machine Learning NPFL 054 http://ufal.mff.cuni.cz/course/npfl054 Barbora Hladká Martin Holub {Hladka | Holub} @ufal.mff.cuni.cz Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics NPFL054, 2015 Hladká & Holub & Lukšová Demo 1, page 1/12

  2. Demo 1 Verb Patterns Classification
     Purpose of the demo task: to show several things related to gold standard data for a supervised machine learning task, especially
     • Manual annotation and basic data analysis
     • Gold Standard data distribution
     • Inter-annotator agreement
     • Confusion matrices
     • Error analysis

  3. Demo 1 Verb Patterns Classification: Annotation experiment 2015

  4. Gold Standard – data distributions
     [Figure: two Gold Standard (GS) histograms. Cry: classes u, x, 1, 4, 7. Enlarge: classes u, 1, 2, 3, 4.]

  5. Manual annotation
     • Annotated data
       • 6 x 10 sentences with CRY
       • 6 x 10 sentences with ENLARGE
       • 4 groups A, B, C, D
       • the same data set annotated by each group
     • We analyse
       • which group is closer to the Gold Standard
       • the inter-annotator agreement between groups

  6. A, B, C, D distributions - CRY
     [Figure: four histograms (Cry A, Cry B, Cry C, Cry D) over classes u, x, 1, 4, 7.]

  7. A, B, C, D distributions - ENLARGE
     [Figure: four histograms (Enlarge A, Enlarge B, Enlarge C, Enlarge D) over classes u, 1, 2, 3, 4.]

  8. Confusion matrices - CRY
     group   agreement (%)   disagreement (%)
     A       41 (68.3)       19 (31.7)
     B       40 (66.7)       20 (33.3)
     C       40 (66.7)       20 (33.3)
     D       45 (75.0)       15 (25.0)

  9. Confusion matrices - ENLARGE
     group   agreement (%)   disagreement (%)
     A       38 (63.3)       22 (36.7)
     B       28 (46.7)       32 (53.3)
     C       28 (46.7)       32 (53.3)
     D       36 (60.0)       24 (40.0)
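
The agreement counts on the two slides above can be derived from label sequences as in the following minimal Python sketch. It assumes that each group's labels are compared item by item with the Gold Standard labels, which the slides do not state explicitly. The function names and the toy data are hypothetical and are not the course annotations.

```python
from collections import Counter


def confusion_matrix(gold, predicted, labels):
    """Return a nested dict: matrix[gold_label][predicted_label] -> count."""
    counts = Counter(zip(gold, predicted))
    return {g: {p: counts[(g, p)] for p in labels} for g in labels}


def agreement(gold, predicted):
    """Return (agreements, disagreements, agreement in percent)."""
    if len(gold) != len(predicted):
        raise ValueError("label sequences must have the same length")
    agree = sum(g == p for g, p in zip(gold, predicted))
    return agree, len(gold) - agree, 100.0 * agree / len(gold)


# Hypothetical toy data (NOT the course annotations).
labels = ["u", "x", "1", "4", "7"]
gold = ["1", "1", "4", "7", "u", "x", "1", "4"]
group = ["1", "4", "4", "7", "u", "1", "1", "x"]

matrix = confusion_matrix(gold, group, labels)
agree, disagree, pct = agreement(gold, group)
print(agree, disagree, pct)
```

The diagonal of the confusion matrix holds the agreements; every off-diagonal cell shows which pattern was confused with which. With the slide's own numbers, 41 agreements out of 60 sentences gives 100 · 41 / 60 ≈ 68.3 %.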

  10. Inter-annotator agreement (IAA)
     groups    verb      Cohen’s Kappa   Fleiss’s Kappa
     A-B       cry       0.355
     A-C       cry       0.276
     A-D       cry       0.406
     B-C       cry       0.366
     B-D       cry       0.407
     C-D       cry       0.327
     A-B-C-D   cry       –               0.353
     A-B       enlarge   0.306
     A-C       enlarge   0.413
     A-D       enlarge   0.296
     B-C       enlarge   0.222
     B-D       enlarge   0.324
     C-D       enlarge   0.365
     A-B-C-D   enlarge   –               0.319
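
For readers who want to see how such coefficients are computed, here is a minimal Python sketch of the standard textbook definitions of Cohen's kappa (two annotators) and Fleiss's kappa (several annotators). It is not the course's own code, and the slides do not say which software produced the values above. The function names and the toy data are hypothetical.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators labelling the same items."""
    n = len(a)
    # Observed agreement: share of items with identical labels.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each annotator's own label distribution.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    # Undefined (division by zero) if both annotators always use one label.
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(ratings):
    """Fleiss's kappa; ratings is a list of items, each a list with one
    label per annotator (same number of annotators for every item)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    p_bar = 0.0
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        # Share of agreeing annotator pairs for this item.
        p_bar += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar /= n_items
    # Chance agreement from the pooled label distribution.
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy data (NOT the course annotations).
ann_1 = ["1", "1", "4", "4"]
ann_2 = ["1", "4", "4", "4"]
kappa_c = cohen_kappa(ann_1, ann_2)
kappa_f = fleiss_kappa([list(pair) for pair in zip(ann_1, ann_2)])
print(round(kappa_c, 3), round(kappa_f, 3))
```

Both coefficients correct the raw agreement rate for the agreement expected by chance. They differ in how chance is estimated: Cohen's kappa uses each annotator's own label distribution, while Fleiss's kappa pools the labels of all annotators, so the two can differ even on the same two-annotator data.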

  11. A + B + C + D vs. GS – CRY
     • Number of agreements: 171 (71.3 %)
     • Number of disagreements: 69 (28.7 %)

  12. A + B + C + D vs. GS – ENLARGE
     • Number of agreements: 138 (57.5 %)
     • Number of disagreements: 102 (42.5 %)
