

  1. IN5550 – Neural Methods in Natural Language Processing
     Home Exam: Task Overview and Kick-Off
     Stephan Oepen, Lilja Øvrelid, & Erik Velldal
     University of Oslo, April 21, 2020

  2.–5. Home Exam General Idea
     ◮ Use as guiding metaphor: Preparing a scientific paper for publication.
     Second IN5550 Teaching Workshop on Neural NLP (WNNLP 2020)
     Standard Process
     (0) Problem Statement
     (1) Experimentation
     (2) Analysis
     (3) Paper Submission
     (4) Reviewing
     (5) Camera-Ready Manuscript
     (6) Presentation

  6. For Example: The ACL 2020 Conference

  7. WNNLP 2020: Call for Papers and Important Dates
     General Constraints
     ◮ Three specialized tracks: NER, Negation Scope, Sentiment Analysis.
     ◮ Long papers: up to nine pages, excluding references, in ACL 2020 style.
     ◮ Submitted papers must be anonymous: peer reviewing is double-blind.
     ◮ Replicability: submission backed by a code repository (area chairs only).
     Schedule
     By April 22   Declare choice of track (and team composition)
     April 28      Per-track mentoring sessions with Area Chairs
     Early May     Individual supervisory meetings (upon request)
     May 12        (Strict) submission deadline for scientific papers
     May 13–18     Reviewing period: each student reviews two papers
     May 20        Area Chairs make and announce acceptance decisions
     May 25        Camera-ready manuscripts due, with requested revisions
     May 27        Oral presentations and awards at the workshop

  8. The Central Authority for All Things WNNLP 2020
     https://www.uio.no/studier/emner/matnat/ifi/IN5550/v20/exam.html

  9.–11. WNNLP 2020: What Makes a Good Scientific Paper?
     Empirical (Experimental)
     ◮ Motivate architecture choice(s) and hyper-parameters;
     ◮ systematic exploration of the relevant parameter space;
     ◮ comparison to a reasonable baseline or previous work.
     Replicable (Reproducible)
     ◮ Everything relevant to run and reproduce in M$ GitHub.
     Analytical (Reflective)
     ◮ Identify and relate to previous work;
     ◮ explain the choice of baseline or points of comparison;
     ◮ meaningful, precise discussion of results;
     ◮ ‘negative’ results can be interesting too;
     ◮ look at the data: discuss some examples;
     ◮ error analysis: identify remaining challenges.

  12. WNNLP 2020: Programme Committee
     General Chair
     ◮ Stephan Oepen
     Area Chairs
     ◮ Named Entity Recognition: Erik Velldal
     ◮ Negation Scope: Stephan Oepen
     ◮ Sentiment Analysis: Lilja Øvrelid & Jeremy Barnes
     Peer Reviewers
     ◮ All students who have submitted a scientific paper

  13. Track 1: Named Entity Recognition
     ◮ NER: the task of identifying and categorizing proper names in text.
     ◮ Typical categories: persons, organizations, locations, geo-political entities, products, events, etc.
     ◮ Example from NorNE, the corpus we will be using:
       [Den internasjonale domstolen]ORG har sete i [Haag]GPE_LOC .
       [The International Court of Justice]ORG has its seat in [The Hague]GPE_LOC .

  14. Class labels
     ◮ Abstractly a sequence segmentation task,
     ◮ but in practice solved as a sequence labeling problem,
     ◮ assigning per-word labels according to some variant of the BIO scheme:
       Den/B-ORG internasjonale/I-ORG domstolen/I-ORG har/O sete/O i/O Haag/B-GPE_LOC ./O
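     To make the scheme concrete, here is a tiny encoder (a helper written for these notes, not code from the slides) that produces BIO-2 labels from pre-segmented entity spans:

         # Hypothetical helper illustrating the BIO-2 encoding described above.
         def bio_encode(tokens, spans):
             """spans: list of (start, end, entity_type), with end exclusive."""
             labels = ["O"] * len(tokens)
             for start, end, etype in spans:
                 labels[start] = "B-" + etype          # entity-initial token
                 for i in range(start + 1, end):
                     labels[i] = "I-" + etype          # entity-internal tokens
             return labels

         tokens = "Den internasjonale domstolen har sete i Haag .".split()
         print(bio_encode(tokens, [(0, 3, "ORG"), (6, 7, "GPE_LOC")]))
         # ['B-ORG', 'I-ORG', 'I-ORG', 'O', 'O', 'O', 'B-GPE_LOC', 'O']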

  15. NorNE
     ◮ First publicly available NER dataset for Norwegian; a joint effort between LTG, Schibsted, and Språkbanken (the National Library).
     ◮ Named entity annotations added to NDT for both Bokmål and Nynorsk:
     ◮ ∼300K tokens for each, of which ∼20K form part of a NE.
     ◮ Distributed in the CoNLL-U format using the BIO labeling scheme.
     Simplified version:
       1  Den             den            DET    name=B-ORG
       2  internasjonale  internasjonal  ADJ    name=I-ORG
       3  domstolen       domstol        NOUN   name=I-ORG
       4  har             ha             VERB   name=O
       5  sete            sete           NOUN   name=O
       6  i               i              ADP    name=O
       7  Haag            Haag           PROPN  name=B-GPE_LOC
       8  .               $.             PUNCT  name=O
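     A minimal reader for the simplified layout above (a sketch under that assumption; the full NorNE release uses standard ten-column CoNLL-U, where a dedicated parser such as the conllu package is the safer choice):

         # Extract tokens and BIO labels from one simplified sentence block.
         def read_simple_block(lines):
             tokens, labels = [], []
             for line in lines:
                 if not line.strip() or line.startswith("#"):
                     continue                      # skip blanks and comments
                 _, form, _, _, misc = line.split()[:5]
                 tokens.append(form)
                 labels.append(misc.split("name=", 1)[1])
             return tokens, labels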

  16. NorNE entity types (Bokmål)
     Type      Train   Dev   Test   Total
     PER        4033   607    560    5200
     ORG        2828   400    283    3511
     GPE_LOC    2132   258    257    2647
     PROD        671   162     71     904
     LOC         613   109    103     825
     GPE_ORG     388    55     50     493
     DRV         519    77     48     644
     EVT         131     9      5     145
     MISC          8     0      0       8
     https://github.com/ltgoslo/norne/

  17. Evaluating NER
     ◮ While NER can be evaluated by P, R, and F1 at the token level,
     ◮ evaluating at the entity level can be more informative.
     ◮ Several ways to do this (wording from SemEval 2013 task 9.1 in parentheses):
     ◮ Exact labeled (‘strict’): the gold annotation and the system output are identical; both the predicted boundary and the entity label are correct.
     ◮ Partial labeled (‘type’): correct label and at least a partial boundary match.
     ◮ Exact unlabeled (‘exact’): correct boundary, disregarding the label.
     ◮ Partial unlabeled (‘partial’): at least a partial boundary match, disregarding the label.
     ◮ https://github.com/davidsbatista/NER-Evaluation
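     For the entity-level ‘strict’ mode, a minimal sketch using the seqeval library (our choice for illustration; the NER-Evaluation repository linked above additionally implements the ‘type’, ‘exact’, and ‘partial’ modes):

         from seqeval.metrics import classification_report, f1_score

         gold = [["B-ORG", "I-ORG", "I-ORG", "O", "O", "O", "B-GPE_LOC", "O"]]
         pred = [["B-ORG", "I-ORG", "O",     "O", "O", "O", "B-GPE_LOC", "O"]]

         # seqeval matches whole entities: the truncated ORG span counts as
         # an error under 'strict', although two of its three tokens match.
         print(f1_score(gold, pred))              # 0.5
         print(classification_report(gold, pred))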

  18. NER model
     ◮ The current go-to model for NER: a BiLSTM with a CRF inference layer,
     ◮ possibly with a max-pooled character-level CNN feeding into the BiLSTM together with pre-trained word embeddings.
     (Image: Jie Yang & Yue Zhang 2018: NCRF++: An Open-source Neural Sequence Labeling Toolkit)
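     A minimal PyTorch sketch of this architecture, without the character-level CNN (assumptions: the third-party pytorch-crf package supplies the CRF layer, and all hyper-parameters are placeholders rather than recommendations):

         import torch.nn as nn
         from torchcrf import CRF  # assumption: pip install pytorch-crf

         class BiLSTMCRF(nn.Module):
             def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=200):
                 super().__init__()
                 self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
                 self.lstm = nn.LSTM(emb_dim, hidden_dim // 2,
                                     batch_first=True, bidirectional=True)
                 self.hidden2tag = nn.Linear(hidden_dim, num_tags)
                 self.crf = CRF(num_tags, batch_first=True)

             def forward(self, tokens, tags=None, mask=None):
                 # Per-token emission scores from the BiLSTM.
                 emissions = self.hidden2tag(self.lstm(self.embedding(tokens))[0])
                 if tags is not None:       # training: negative log-likelihood
                     return -self.crf(emissions, tags, mask=mask)
                 return self.crf.decode(emissions, mask=mask)  # Viterbi decoding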

  19. Suggested reading on neural sequence modeling
     ◮ Jie Yang, Shuailong Liang, & Yue Zhang, 2018:
       Design Challenges and Misconceptions in Neural Sequence Labeling
       (Best Paper Award at COLING 2018)
       https://aclweb.org/anthology/C18-1327
     ◮ Nils Reimers & Iryna Gurevych, 2017:
       Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks
       https://arxiv.org/pdf/1707.06799.pdf
     State-of-the-art leaderboards for NER
     ◮ https://nlpprogress.com/english/named_entity_recognition.html
     ◮ https://paperswithcode.com/task/named-entity-recognition-ner

  20. More information about the dataset
     ◮ https://github.com/ltgoslo/norne
     ◮ F. Jørgensen, T. Aasmoe, A. S. Ruud Husevåg, L. Øvrelid, & E. Velldal:
       NorNE: Annotating Named Entities for Norwegian.
       Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, France, 2020.
       https://arxiv.org/pdf/1911.12146.pdf

  21. Some suggestions to get started with experimentation
     ◮ Different label encodings: BIO-1, BIO-2, BIOES, etc.
     ◮ Different label set granularities:
       ◮ 8 entity types in NorNE by default (MISC can be ignored);
       ◮ could be reduced to 7 by collapsing GPE_LOC and GPE_ORG to GPE, or to 6 by mapping them to LOC and ORG (see the sketch after this list).
     ◮ Impact of different parts of the architecture:
       ◮ CRF vs. softmax;
       ◮ impact of including a character-level model (e.g. CNN or RNN). Tip: evaluate the effect for OOVs;
       ◮ adding several BiLSTM layers.
     ◮ Do different evaluation strategies give different relative rankings of different systems?
     ◮ Compute learning curves.
     ◮ Mixing Bokmål / Nynorsk? Machine translation?
     ◮ Impact of embedding pre-training (corpus, dimensionality, framework, etc.)
     ◮ Possibilities for transfer / multi-task learning?
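     The label-collapsing experiments can be as simple as rewriting labels before training; a hypothetical helper (not part of the exam materials):

         # Collapse NorNE's GPE_* types, as suggested above.
         def collapse(label, scheme="gpe"):
             if "-" not in label:             # 'O' carries no entity type
                 return label
             prefix, etype = label.split("-", 1)
             if scheme == "gpe":              # 8 -> 7 types: GPE_* -> GPE
                 etype = "GPE" if etype.startswith("GPE_") else etype
             elif scheme == "locorg":         # 8 -> 6: GPE_LOC -> LOC, GPE_ORG -> ORG
                 etype = {"GPE_LOC": "LOC", "GPE_ORG": "ORG"}.get(etype, etype)
             return prefix + "-" + etype

         assert collapse("B-GPE_LOC") == "B-GPE"
         assert collapse("I-GPE_ORG", scheme="locorg") == "I-ORG"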

  22.–24. Track 2: Negation Scope
     Non-Factuality (and Uncertainty) Very Common in Language
     But { this theory would } ⟨not⟩ { work } .
     I think, Watson, { a brandy and soda would do him } ⟨no⟩ { harm } .
     They were all confederates in { the same } ⟨un⟩{ known crime } .
     “Found dead ⟨without⟩ { a mark upon him } .
     { We have } ⟨never⟩ { gone out ⟨without⟩ { keeping a sharp watch }} , and ⟨no⟩ { one could have escaped our notice } .”
     Phorbol activation was positively modulated by Ca2+ influx while { TNF alpha activation was } ⟨not⟩ .
     CoNLL 2010, *SEM 2012, and EPE 2017 International Shared Tasks
     ◮ Bake-off: standardized training and test data, evaluation, schedule;
     ◮ 20+ participants; LTG systems top performers throughout the years.
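     Like NER, scope resolution is commonly cast as per-token sequence labeling; one possible encoding of the first example above (an illustrative label set, not one prescribed for the exam):

         # Cue tokens vs. in-scope tokens vs. everything else.
         tokens = ["But", "this",  "theory", "would", "not", "work",  "."]
         labels = ["O",   "SCOPE", "SCOPE",  "SCOPE", "CUE", "SCOPE", "O"]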

  25. Small Words Can Make a Large Difference
