
Syntactic-Ngrams over Time from a Very Large Corpus of English Books

Yoav Goldberg and Jon Orwant
Presented at *SEM 2013, Atlanta, GA

Many thanks to Google's parsing team: Ryan, Dipanjan, Slav, Kuzman, Keith, Terry, Michael, Fernando, Hao


  1. Functional modifiers of coordinated nouns. Example: "the boy and the girl" (arc labels: conj, conj, cc)

  2. Functional modifiers of coordinated nouns. Pattern: "___ */NN and ___ */NN" (arc labels: conj, conj, cc)

  3. Functional modifiers of coordinated nouns. Pattern: "___ */NN and ___ */NN" (arc labels: conj, conj, cc). Parallelism?

  4. Functional modifiers of coordinated nouns. Pattern: "___ */NN and ___ */NN" (arc labels: conj, conj, cc). Parallelism? Counts of modifier pairs:

         79250839  the and the
         15031401  a and a
          3820439  the and its
          2614562  the and his
          2467965  his and his
          2242856  a and the
          2133545  the and a
          2030446  the and their
          1856827  an and a
          1686133  a and an
          1020169  their and their
           892783  his and the
           750079  my and my
           714221  her and her
           658563  its and its
           475910  an and an
           467310  our and our
           459989  the and her
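Counts like those on the slide above could be gathered with a short script. The following is a minimal sketch, not the authors' code. It assumes each record is a tab-separated line of the form `head_word<TAB>ngram<TAB>total_count<TAB>year,count...`, with each token written as `word/POS/dep-label/head-index`; that layout is an assumption here, since the slides do not spell it out. The names `parse_record`, `modifier_pairs` and `FUNCTIONAL_DEPS` are hypothetical, and the toy records and their counts are invented.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    pos: str
    dep: str
    head: int  # 1-based index of the head token; 0 marks the fragment root


# Assumption: the labels treated as "functional modifiers" in this sketch.
FUNCTIONAL_DEPS = {"det", "poss"}


def parse_record(line):
    """Parse one assumed-format record into (tokens, total_count)."""
    fields = line.rstrip("\n").split("\t")
    tokens = []
    for item in fields[1].split(" "):
        # rsplit keeps a word that itself contains "/" in one piece.
        word, pos, dep, head = item.rsplit("/", 3)
        tokens.append(Token(word, pos, dep, int(head)))
    return tokens, int(fields[2])


def functional_modifiers(tokens, noun_index):
    """Words attached to the noun at 1-based noun_index by a functional label."""
    return [t.word for t in tokens
            if t.head == noun_index and t.dep in FUNCTIONAL_DEPS]


def modifier_pairs(lines):
    """Count (first modifier, second modifier) pairs over coordinated nouns."""
    pairs = Counter()
    for line in lines:
        tokens, count = parse_record(line)
        for index, token in enumerate(tokens, start=1):
            if token.dep != "conj" or token.head == 0:
                continue
            head_token = tokens[token.head - 1]
            if not (token.pos.startswith("NN") and head_token.pos.startswith("NN")):
                continue
            first = functional_modifiers(tokens, token.head)
            second = functional_modifiers(tokens, index)
            if len(first) == 1 and len(second) == 1:
                pairs[(first[0], second[0])] += count
    return pairs


# Toy records with invented counts, for illustration only.
TOY_RECORDS = [
    "boy\tthe/DT/det/2 boy/NN/nsubj/0 and/CC/cc/2 the/DT/det/5 girl/NN/conj/2\t7\t1990,3\t2000,4",
    "house\tthe/DT/det/2 house/NN/dobj/0 and/CC/cc/2 its/PRP$/poss/5 garden/NN/conj/2\t2\t1995,2",
    "boy\ta/DT/det/2 boy/NN/nsubj/0 and/CC/cc/2 a/DT/det/5 girl/NN/conj/2\t3\t1980,3",
]

pairs = modifier_pairs(TOY_RECORDS)
for (first, second), count in pairs.most_common():
    print(count, first, "and", second)
```

Summing the total count per pair mirrors the layout of the slide's list; the per-year counts are ignored in this sketch.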

  6. biarcs: three content words

  7. biarcs: three content words. Example: "conserve scarce resources" (arc labels: dobj, root, amod)

  8. biarcs: three content words. Example: "farmers conserve resources" (arc labels: root, nsubj, dobj)

  9. biarcs: three content words. Example: "conserve habits possessed" (arc labels: ccomp, rcmod, dobj)

  10. biarcs: three content words. Example: "conserve wildlife by leaving" (arc labels: ccomp, pobj, dobj, prep)

  12. biarcs: three content words. Example: "conserve oil and gas" (arc labels: conj, xcomp, dobj, cc)

  13. biarcs: three content words. Example: "describes feeling attracted" (arc labels: root, ccomp, xcomp)

  14. biarcs: three content words capture interactions between subject, verb and object

  15. biarcs: three content words capture interactions between two adjectives of a noun

  16. biarcs: three content words capture interactions between verb, adverb and subject

  17. biarcs: three content words. Vector space models (VSMs) not covered by the “arcs” dataset are probably covered by this one

  18. biarcs: three content words second-order questions

  19. adjectives of things that eat. Pattern: "___ * ate" (arc labels: nsubj, amod). Results: old, young, little, other, most, many, first, poor, whole, white, ancient, average, obese, few, hungry, primitive, native, condemned, human, large, wild, black, great, small, starving, american, neotropical, rich, entire, ordinary, pregnant, thin, lean, normal, prehistoric, overweight, elder, fat, grave, wicked, local, holy, wealthy, working, unfortunate, miserable, sick, indian, cannibalistic, indigenous, savage, persian, maori, southern, primate, female, aboriginal, skinny, austrelian, ...

  20. adjectives of things being eaten. Pattern: "ate ___ *" (arc labels: dobj, amod). Results: last, little, good, same, hearty, more, cold, whole, few, large, much, raw, small, great, hot, human, many, own, first, boiled, only, forbidden, big, other, light, simple, lobe, wild, fresh, green, roast, sweet, several, huge, delicious, quick, enormous, late, boiled, dry, white, frugal, early, next, fried, hasty, different, black, dried, red, fried, stale, canned, chinese, sour, cooked, french, vegetarian, mexican, baked, wonderful, poisoned, scrambled, roasted, enough, broiled, soft, kosher, ...
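The two second-order queries above ("adjectives of things that eat" and "adjectives of things being eaten") amount to following two arcs: from an adjective to its noun, and from that noun to the verb. Here is a rough sketch of such a query over already-parsed biarcs. The in-memory representation (a list of `(word, dep_label, head_index)` triples plus a count), the function name `adjectives_of_argument`, and all toy counts are assumptions made for illustration.

```python
from collections import Counter

# Each toy biarc: (tokens, count), where a token is (word, dep_label, head_index).
# head_index is 1-based; 0 marks the root of the fragment. Counts are invented.
TOY_BIARCS = [
    ([("hungry", "amod", 2), ("wolves", "nsubj", 3), ("ate", "ROOT", 0)], 5),
    ([("old", "amod", 2), ("men", "nsubj", 3), ("ate", "ROOT", 0)], 8),
    ([("ate", "ROOT", 0), ("raw", "amod", 3), ("fish", "dobj", 1)], 4),
    ([("hungry", "amod", 2), ("children", "nsubj", 3), ("ate", "ROOT", 0)], 2),
    ([("farmers", "nsubj", 2), ("conserve", "ROOT", 0), ("resources", "dobj", 2)], 9),
]


def adjectives_of_argument(biarcs, verb, role):
    """Count adjectives (amod) modifying the noun that fills `role` of `verb`."""
    counts = Counter()
    for tokens, count in biarcs:
        for word, dep, head in tokens:
            if dep != "amod" or head == 0:
                continue
            _, noun_dep, noun_head = tokens[head - 1]
            if noun_dep != role or noun_head == 0:
                continue
            # Second arc: the modified noun must attach to the requested verb.
            if tokens[noun_head - 1][0] == verb:
                counts[word] += count
    return counts


eaters = adjectives_of_argument(TOY_BIARCS, "ate", "nsubj")
eaten = adjectives_of_argument(TOY_BIARCS, "ate", "dobj")
print(eaters.most_common())
print(eaten.most_common())
```

Changing `role` from `nsubj` to `dobj` switches between the two slide queries without touching the traversal.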

  21. triarcs: four content words

  22. triarcs: four content words. Example: "consist of group of short fibers" (arc labels: pobj, root, prep, pobj, prep, amod)

  24. triarcs: four content words. Example: "consist principally of heavier hydrocarbons" (arc labels: prep, pobj, root, advmod, amod)

  25. triarcs: four content words. Example words: consist, vessel, crosses, spine (arc labels: advcl, root, dobj, nsubj)

  26. triarcs: four content words. Example words: social, situation, exposed, consisted (arc labels: nsubj, root, advmod, rcmod)

  27. triarcs: four content words. Example: "tiny baby and small child" (arc labels: conj, pobj, amod, amod, cc)

  28. Adjectival modifiers of coordinated nouns. Pattern: "___ */NN and ___ */NN" (arc labels: conj, conj, amod, cc, amod). Parallelism?

  29. Adjectival modifiers of coordinated nouns. Pattern: "___ */NN and ___ */NN" (arc labels: conj, conj, amod, cc, amod). Parallelism!! Counts of modifier pairs:

         347380  late and early
         318353  new and new
         143298  good and good
         123184  high and low
         119851  social and social
          87337  high and high
          83516  % and %
          82964  human and human
          78980  low and high
          74488  different different
          72617  same and same
          68260  great and great
          67055  good and bad
          62282  many and many
          61822  other and other
          61126  own and own
          58781  more and more
          57556  young and young
          57392  black and white
          54690  white and black
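One simple way to put a number on the "parallelism" remark is to ask what share of the listed counts falls on pairs whose two adjectives are identical. The sketch below does this for the rows shown on the slide only, so the result describes this top-of-list sample and not the full dataset. The function name `identical_share` is hypothetical, and the row the slide prints as "different different" is treated here as an identical pair, which is an assumption.

```python
# Rows copied from the slide: (count, first adjective, second adjective).
ROWS = [
    (347380, "late", "early"),
    (318353, "new", "new"),
    (143298, "good", "good"),
    (123184, "high", "low"),
    (119851, "social", "social"),
    (87337, "high", "high"),
    (83516, "%", "%"),
    (82964, "human", "human"),
    (78980, "low", "high"),
    (74488, "different", "different"),  # printed without "and" on the slide
    (72617, "same", "same"),
    (68260, "great", "great"),
    (67055, "good", "bad"),
    (62282, "many", "many"),
    (61822, "other", "other"),
    (61126, "own", "own"),
    (58781, "more", "more"),
    (57556, "young", "young"),
    (57392, "black", "white"),
    (54690, "white", "black"),
]


def identical_share(rows):
    """Fraction of the total count carried by pairs with identical modifiers."""
    total = sum(count for count, _, _ in rows)
    same = sum(count for count, first, second in rows if first == second)
    return same / total


share = identical_share(ROWS)
print(f"{share:.1%} of the listed count mass has identical adjectives")
```

A fuller analysis would also need a baseline, for example the share expected if the two adjectives were drawn independently.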

  30. quadarcs: five content words (but restricted to specific patterns)

  31. quadarcs: five content words (but restricted to specific patterns). Example: "parts of compilation constitute one work" (arc labels: nsubj, dobj, pobj, prep, num)

  32. quadarcs: five content words (but restricted to specific patterns). Example: "consecrated emblems distinguished by materials and workmanship" (arc labels: conj, amod, pobj, nsubjpass, prep, cc)

  33. quadarcs: five content words (but restricted to specific patterns). Pattern: a content-word root, with two chains of two content words each

  34. There are also the extended versions (with functional markers) of biarcs, triarcs and quadarcs

  35. nounargs: noun and all its modifiers (+ all functional markers)

  36. nounargs: noun and all its modifiers (+ all functional markers). Example: "a gradual decrease" (arc labels: det, amod)

  37. nounargs: noun and all its modifiers (+ all functional markers). Example: "an exponential gradual decrease" (arc labels: det, amod, amod)

  38. nounargs: noun and all its modifiers (+ all functional markers). Example: "a decrease in the dimension" (arc labels: pobj, det, prep, det)

  39. nounargs: noun and all its modifiers (+ all functional markers). Example: "a corresponding decrease in the quantity" (arc labels: det, pobj, amod, prep, det)

  40. nounargs: noun and all its modifiers (+ all functional markers) can be used for estimating dependency language models

  41. nounargs: noun and all its modifiers (+ all functional markers). Other interesting questions: PP co-occurrence patterns? Adjectival co-occurrence? Definiteness patterns?
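The nounargs slides mention estimating dependency language models. One very simple reading of that idea is a relative-frequency estimate of P(child word | head word, arc label). The sketch below illustrates it on the slide's four example phrases. The counts are invented, the head attachments are one plausible analysis (not taken from the data), and the representation and the name `estimate_child_model` are assumptions; real models would add smoothing and richer conditioning.

```python
from collections import Counter, defaultdict

# Each toy nounarg: (tokens, count); a token is (word, dep_label, head_index),
# with 1-based head indices and 0 for the fragment root. Counts are invented.
TOY_NOUNARGS = [
    ([("a", "det", 3), ("gradual", "amod", 3), ("decrease", "ROOT", 0)], 6),
    ([("an", "det", 4), ("exponential", "amod", 4), ("gradual", "amod", 4),
      ("decrease", "ROOT", 0)], 1),
    ([("a", "det", 2), ("decrease", "ROOT", 0), ("in", "prep", 2),
      ("the", "det", 5), ("dimension", "pobj", 3)], 3),
    ([("a", "det", 3), ("corresponding", "amod", 3), ("decrease", "ROOT", 0),
      ("in", "prep", 3), ("the", "det", 6), ("quantity", "pobj", 4)], 2),
]


def estimate_child_model(records):
    """Relative-frequency estimate of P(child | head word, arc label)."""
    counts = defaultdict(Counter)
    for tokens, count in records:
        for word, dep, head in tokens:
            if head == 0:
                continue  # the root has no head inside the fragment
            head_word = tokens[head - 1][0]
            counts[(head_word, dep)][word] += count
    model = {}
    for context, children in counts.items():
        total = sum(children.values())
        model[context] = {w: c / total for w, c in children.items()}
    return model


model = estimate_child_model(TOY_NOUNARGS)
for child, prob in sorted(model[("decrease", "amod")].items()):
    print(f"P({child} | decrease, amod) = {prob:.2f}")
```

Because nounargs fragments contain a noun together with all of its modifiers, counts taken from them cover every child of the head at once, which is what makes this kind of per-head estimate possible.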

  42. verbargs: verb and all its modifiers (+ all functional markers)

  43. verbargs: verb and all its modifiers (+ all functional markers). Example: "he detected a possibility" (arc labels: dobj, nsubj, det)

  44. verbargs: verb and all its modifiers (+ all functional markers). Example: "he detected and exploited the impulses" (arc labels: dobj, nsubj, conj, det, cc)

  45. verbargs: verb and all its modifiers (+ all functional markers). Example: "he detected confusion in the tone" (arc labels: pobj, nsubj, dobj, prep, det)

  46. verbargs: verb and all its modifiers (+ all functional markers). Example: "he has been detected" (arc labels: nsubjpass, aux, aux)

  47. verbargs: verb and all its modifiers (+ all functional markers). Arc labels: advmod, prep, advmod. Examples: "often unfairly dealt with", "only briefly dealt with", "only marginally dealt with", "otherwise severely dealt with", "rather hardly dealt with", "so directly dealt with", "so easily dealt with"
