
Linguists get the abstraction, machines get the details (Hal Daumé III)



  1. Linguists get the abstraction, machines get the details. Hal Daumé III, Computer Science / Linguistics, University of Maryland, College Park (me@hal3.name). Symbol Pushing.

  2. NLP's use of linguists, a caricature
     ➢ Linguists develop theory
     ➢ Linguists richly annotate data (e.g. treebank)
     ➢ NLP people train systems (e.g. parser)
     ➢ Parser output fed into machine translation system
     ➢ Machine translation system has no idea what the input symbols mean
     ➢ NP, VP, VBD, ... might as well be X1, X2, X3, ...

  3. Where does this model work?
     ➢ Works when entire pipeline is learned from data
     ➢ And we make no use of prior knowledge
     Where does this model not work?
     ➢ Pretty much any other time

  4. Inferring Tags from the Structure
     ➢ INPUT: The man ate a big sandwich
     ➢ OUTPUT: D N V D J N
     ➢ Baseline:
        ➢ Random guessing: 4% accuracy

  5. Sources of Knowledge
     ➢ Seeds (frequent words for each tag)
        ➢ N: membro, milhoes, obras
        ➢ D: as [the,2f] o [the,1m] os [the,2m]
        ➢ V: afector, gasta, juntar
        ➢ P: com, como, de, em
     ➢ Typological rules:
        ➢ Art ← Noun
        ➢ Prp → Noun
     ➢ Tag knowledge:
        ➢ Open class
        ➢ Closed class

  6. Preliminary Results [Bar chart; vertical axis 0 to 60; groups: No Seeds, Seeds; series: No O/C, Open/Closed]

  7. Preliminary Results: Open/Closed [Two charts, NO SEEDS and SEEDS; vertical axis 20 to 60; conditions: No Rules, Art<-N, Prp->N, Both]

  8. I'd like NLP to use more linguistics, but...
     ➢ Linguistic models are often developed without any reference to computation
     ➢ Many NLP students do not learn (or appreciate) much beyond Syntax I
     ➢ Linguistic theories seem to be good in the abstract, but (perhaps) not so much in the details
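
To make the setup on slides 4 and 5 concrete, here is a minimal sketch of how seed words and open/closed class knowledge might be plugged into a tagger, alongside the random-guessing baseline. It is not the model from the talk, which the slides do not describe. The tag inventory `TAGS`, the choice of which tags count as closed class, and every function name are assumptions made for illustration. The seed words are copied from slide 5. One inference to flag: uniform guessing over K tags is right 1/K of the time, so a 4% baseline would correspond to 25 equally likely tags, a number the slides do not state.

```python
import random

# Hypothetical tag inventory: the slides show D, N, V, J and P,
# but never list the full tagset.
TAGS = ["D", "N", "V", "J", "P"]

# Seed words per tag, copied from slide 5.
SEEDS = {
    "N": ["membro", "milhoes", "obras"],
    "D": ["as", "o", "os"],
    "V": ["afector", "gasta", "juntar"],
    "P": ["com", "como", "de", "em"],
}

# Assumption: determiners and prepositions are the closed classes.
CLOSED_CLASS = {"D", "P"}


def expected_random_accuracy(num_tags):
    """Expected accuracy of guessing uniformly among num_tags tags."""
    return 1.0 / num_tags


def tag_with_seeds(tokens, seeds, tags, rng):
    """Tag seed words by lookup; guess an open-class tag for the rest.

    Restricting guesses to open-class tags encodes the (assumed) idea
    that closed-class words are few enough to be covered by the seeds.
    """
    lookup = {word: tag for tag, words in seeds.items() for word in words}
    open_tags = [t for t in tags if t not in CLOSED_CLASS]
    return [lookup.get(tok.lower(), rng.choice(open_tags)) for tok in tokens]


def accuracy(predicted, gold):
    """Fraction of positions where the predicted tag matches the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


if __name__ == "__main__":
    rng = random.Random(0)
    # A made-up token sequence built from the seed words, for illustration.
    print(tag_with_seeds(["os", "membro", "gasta"], SEEDS, TAGS, rng))
    print(expected_random_accuracy(25))
```

A dictionary lookup is about the weakest way to use seeds. It is shown only because it makes visible what each knowledge source on slide 5 contributes before any learning happens.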
