Parser Evaluation and the BNC




  1. Parser Evaluation and the BNC
     Jennifer Foster and Josef van Genabith
     National Centre for Language Technology, School of Computing, Dublin City University
     29th May 2008

  2. What is this work about?
     1. Creating a set of gold standard parse trees for 1,000 sentences from the BNC
     2. Using these trees as a test set to evaluate various parsers (a sketch of labelled bracketing evaluation follows below)
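
     As background for the evaluation step, the following is a minimal PARSEVAL-style sketch of labelled bracketing precision, recall and F1, a standard way of scoring parser output against Penn-style gold trees. It is an illustration only, not code from the talk; it assumes NLTK is available for reading bracketed trees and, like evalb, it ignores part-of-speech-level brackets.

         # PARSEVAL-style labelled bracketing scores for one sentence pair.
         # Illustrative sketch; assumes NLTK and skips POS-level brackets.
         from collections import Counter
         from nltk import Tree

         def labelled_brackets(tree):
             """Return a multiset of (label, start, end) spans for phrasal nodes."""
             spans = Counter()
             def walk(node, start):
                 if isinstance(node, str):            # a leaf covers one token
                     return start + 1
                 end = start
                 for child in node:
                     end = walk(child, end)
                 # Skip preterminals (POS tags): nodes whose children are all leaves.
                 if not all(isinstance(child, str) for child in node):
                     spans[(node.label(), start, end)] += 1
                 return end
             walk(tree, 0)
             return spans

         def parseval(gold, test):
             """Labelled precision, recall and F1 of test brackets against gold."""
             g, t = labelled_brackets(gold), labelled_brackets(test)
             matched = sum((g & t).values())          # multiset intersection
             p = matched / max(sum(t.values()), 1)
             r = matched / max(sum(g.values()), 1)
             f = 2 * p * r / (p + r) if p + r else 0.0
             return p, r, f

         gold = Tree.fromstring("(S (NP (DT The) (NN dog)) (VP (VBD barked)))")
         test = Tree.fromstring("(S (NP (DT The)) (VP (NN dog) (VBD barked)))")
         print(parseval(gold, test))                  # roughly (0.33, 0.33, 0.33)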

  3. Outline
     ◮ BNC Gold Standard
     ◮ Parser Evaluation
     ◮ The Parsers
     ◮ The Metrics
     ◮ Evaluation Results

  4. The British National Corpus
     The BNC is a one hundred million word balanced corpus of British English (Burnard, 2000).
     ◮ 90% of the BNC is written text
       ◮ 75% factual
       ◮ 25% fiction
     ◮ The 10% spoken component consists of
       ◮ informal dialogue
       ◮ business meetings
       ◮ speeches

  7. BNC Test Set: Choosing the sentences
     1,000 sentences in the test set
     ◮ Not chosen completely at random
     ◮ They differ from the WSJ training data: each sentence contains a verb that occurs in the BNC but not in WSJ sections 2-21 (see the selection sketch below)
       ◮ 25,874 verb lemmas occur in the BNC but not in WSJ2-21
       ◮ 14,787 occur only once in the BNC (e.g. jitter, unfade, transpersonalize, kerplonk)
       ◮ 537 occur more than 100 times (e.g. murmur, frown, damn)
     ◮ Likely to represent a difficult test for WSJ-trained parsers
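
     The selection criterion can be made concrete with a short sketch. This is not the authors' script: the WSJ lemma set and the tagged-sentence format below are hypothetical stand-ins for the real corpora.

         # Sketch: keep BNC sentences that contain at least one verb whose
         # lemma never occurs in WSJ sections 2-21. Toy, hypothetical data.
         wsj_verb_lemmas = {"say", "rise", "fall", "report"}   # stand-in for WSJ 2-21

         # Each sentence is a list of (token, POS tag, lemma) triples.
         bnc_sentences = [
             [("She", "PRP", "she"), ("frowned", "VBD", "frown"), (".", ".", ".")],
             [("Prices", "NNS", "price"), ("rose", "VBD", "rise"), (".", ".", ".")],
         ]

         def has_unseen_verb(sentence, known_verbs):
             """True if the sentence has a verb (VB* tag) whose lemma is unseen."""
             return any(pos.startswith("VB") and lemma not in known_verbs
                        for _, pos, lemma in sentence)

         candidates = [s for s in bnc_sentences
                       if has_unseen_verb(s, wsj_verb_lemmas)]
         # Only the "She frowned." sentence qualifies: "frown" is not in the WSJ set.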

  13. BNC Test Set: Some examples

      Text Type   #   Example
      Spoken      10  The seconder of formally seconded
      Poem         9   Groggily somersaulting to get airborne
      Caption      4   Community Personified
      Headline     2   Drunk priest is nicked driving to a funeral

      Average sentence length: 28 words

  15. BNC Test Set: Annotation Process
      ◮ One annotator
      ◮ Two passes through the data
      ◮ Approximately 100 hours
      ◮ As references, the annotator used
        1. the Penn Treebank bracketing guidelines (Bies et al. 1995)
        2. the Penn Treebank itself
      ◮ Functional tags and traces were not annotated

  16. BNC Test Set: Annotation Difficulties
      What happens when the references clash?
      ◮ The noun phrase "almost certain death" occurs in a BNC gold standard sentence
      ◮ According to the guidelines, it should be annotated as (NP (ADJP almost certain) death)
      ◮ A search for "almost" in the Penn Treebank yields the example (NP almost unimaginable speed)
      ◮ In such cases, the annotator chose the analysis set out in the guidelines (the two bracketings are shown below)
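
      For illustration, the two competing analyses can be compared directly as bracketed trees. The snippet below simply reads and prints them with NLTK; it is an illustration, not part of the annotation work.

          # Guideline analysis of "almost certain death" versus the flat
          # bracketing found in the Penn Treebank example.
          from nltk import Tree

          guideline = Tree.fromstring("(NP (ADJP almost certain) death)")
          treebank = Tree.fromstring("(NP almost unimaginable speed)")

          guideline.pretty_print()   # ADJP groups "almost certain" under NP
          treebank.pretty_print()    # all three words are sisters directly under NP

      Because the gold standard follows the guideline analysis, a parser is rewarded for producing the extra ADJP bracket in such phrases.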
