Squashing Computational Linguistics. Noah A. Smith, Paul G. Allen School of Computer Science & Engineering. (PowerPoint presentation.)



SLIDE 1

Squashing

Computational Linguistics

Noah A. Smith

Paul G. Allen School of Computer Science & Engineering University of Washington Seattle, USA @nlpnoah

Research supported in part by: NSF, DARPA DEFT, DARPA CWC, Facebook, Google, Samsung, University of Washington.

SLIDE 2

data

SLIDE 3

Applications of NLP in 2017

  • Conversation, IE, MT, QA, summarization, text categorization
SLIDE 4

Applications of NLP in 2017

  • Conversation, IE, MT, QA, summarization, text categorization
  • Machine-in-the-loop tools for (human) authors

Elizabeth Clark: collaborate with an NLP model through an “exquisite corpse” storytelling game.
Chenhao Tan: revise your message with help from NLP. tremoloop.com

SLIDE 5

Applications of NLP in 2017

  • Conversation, IE, MT, QA, summarization, text categorization
  • Machine-in-the-loop tools for (human) authors
  • Analysis tools for measuring social phenomena

Lucy Lin: sensationalism in science news. bit.ly/sensational-news

… bookmark this survey!

Dallas Card: track ideas, propositions, and frames in discourse over time.

SLIDE 6

data

?

SLIDE 7

Squash

SLIDE 8

Squash Networks

  • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear

SLIDE 9

Squash Networks

  • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear

From Jack (2010), Dynamic System Modeling and Control, goo.gl/pGvJPS

*Yes, rectified linear units (relus) are only half-squash; hat-tip Martha White.
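That definition, composing parameterized differentiable functions with a squashing nonlinearity, fits in a few lines. A minimal sketch; the two-layer shape and all weights below are made up for illustration:

```python
import math

def affine(w, b):
    # a parameterized differentiable function: x -> w*x + b
    return lambda x: w * x + b

def squash(f):
    # compose f with the tanh "squashing" nonlinearity
    return lambda x: math.tanh(f(x))

# a tiny two-layer squash network: tanh(0.5 * tanh(2x - 1))
layer1 = squash(affine(2.0, -1.0))
layer2 = squash(affine(0.5, 0.0))
net = lambda x: layer2(layer1(x))
```

Every output lands in (-1, 1), because the last thing applied is the squash.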

SLIDE 10

Squash Networks

  • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear

  • Estimate parameters using Leibniz (1676)

From existentialcomics.com
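“Leibniz (1676)” is, of course, the chain rule: differentiate the composed squash functions and follow the gradient. A one-weight sketch, where the toy data, learning rate, and iteration count are all invented:

```python
import math

def predict(w, x):
    # one-unit squash network: y = tanh(w * x)
    return math.tanh(w * x)

def loss(w, data):
    # squared error over the dataset
    return sum((predict(w, x) - y) ** 2 for x, y in data)

def grad(w, data):
    # chain rule: d/dw tanh(w*x) = (1 - tanh(w*x)**2) * x
    return sum(2.0 * (predict(w, x) - y) * (1.0 - predict(w, x) ** 2) * x
               for x, y in data)

# toy data roughly consistent with y = tanh(0.55 * x)
data = [(1.0, 0.5), (2.0, 0.8), (-1.0, -0.5)]
w = 0.0
for _ in range(200):
    w -= 0.1 * grad(w, data)  # gradient descent
```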

SLIDE 11

Who wants an all-squash diet?

wow very Cucurbita much festive many dropout

SLIDE 12
SLIDE 13

Linguistic Structure Prediction

input (text)

output (structure)
SLIDE 14

Linguistic Structure Prediction

input (text)

output (structure)

sequences, trees, graphs, …

SLIDE 15

“gold” output

Linguistic Structure Prediction

input (text)

output (structure)

sequences, trees, graphs, …

SLIDE 16

“gold” output

Linguistic Structure Prediction

input (text) → input representation

output (structure)

sequences, trees, graphs, …

SLIDE 17

“gold” output

Linguistic Structure Prediction

input (text) → input representation

output (structure)

clusters, lexicons, embeddings, … sequences, trees, graphs, …

SLIDE 18

“gold” output

Linguistic Structure Prediction

input (text) → input representation

output (structure)

training objective

clusters, lexicons, embeddings, … sequences, trees, graphs, …

SLIDE 19

“gold” output

Linguistic Structure Prediction

input (text) → input representation

output (structure)

training objective

clusters, lexicons, embeddings, … sequences, trees, graphs, … probabilistic, cost-aware, …

SLIDE 20

“gold” output

Linguistic Structure Prediction

input (text) → input representation → part representations

output (structure)

training objective

clusters, lexicons, embeddings, … sequences, trees, graphs, … probabilistic, cost-aware, …

SLIDE 21

“gold” output

Linguistic Structure Prediction

input (text) → input representation → part representations

output (structure)

training objective

clusters, lexicons, embeddings, … segments/spans, arcs, graph fragments, … sequences, trees, graphs, … probabilistic, cost-aware, …

SLIDE 22

“gold” output

Linguistic Structure Prediction

input (text) → input representation → part representations

output (structure)

training objective

SLIDE 23

“gold” output

Linguistic Structure Prediction

input (text) → input representation → part representations

output (structure)

training objective

error definitions & weights · regularization · annotation conventions & theory · constraints & independence assumptions

data selection

“task”

SLIDE 24

Inductive Bias

  • What does your learning algorithm assume?
  • How will it choose among good predictive functions?

See also: No Free Lunch Theorem (Mitchell, 1980; Wolpert, 1996)

SLIDE 25

data bias

SLIDE 26

Three New Models

  • Parsing sentences into predicate-argument structures
  • Fillmore frames
  • Semantic dependency graphs
  • Language models that dynamically track entities

SLIDE 27

When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992.

Original story on Slate.com: http://goo.gl/Hp89tD

SLIDE 28

Frame-Semantic Analysis

When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992.

FrameNet: https://framenet.icsi.berkeley.edu

SLIDE 29

Frame-Semantic Analysis

When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992.

FrameNet: https://framenet.icsi.berkeley.edu

SLIDE 30

Frame-Semantic Analysis

When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992.

cognizer: Democrats; topic: why … Clinton
FrameNet: https://framenet.icsi.berkeley.edu

SLIDE 31

Frame-Semantic Analysis

When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992.

cognizer: Democrats; topic: why … Clinton; explanation: why; degree: so much; content: of Clinton; experiencer: ?; helper: Stephanopoulos … Carville; goal: to put over; time: in 1992; benefited_party: ?
FrameNet: https://framenet.icsi.berkeley.edu

landmark event: Democrats … Clinton; trajector event: they … 1992; entity: so … Clinton; degree: so; mass: resentment of Clinton; time: When … Clinton; required situation: they … to look … 1992; time: When … Clinton; cognizer agent: they; ground: much … 1992; sought entity: ?; topic: about … 1992; trajector event: the Big Lie … over; landmark period: 1992
SLIDE 32

FrameNet: https://framenet.icsi.berkeley.edu

brood, consider, contemplate, deliberate, …
appraise, assess, evaluate, …
commit to memory, learn, memorize, …
agonize, fret, fuss, lose sleep, …
translate
bracket, categorize, class, classify

SLIDE 33

Frame-Semantic Analysis

When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992.

cognizer: Democrats; topic: why … Clinton; explanation: why; degree: so much; content: of Clinton; experiencer: ?; helper: Stephanopoulos … Carville; goal: to put over; time: in 1992; benefited_party: ?
FrameNet: https://framenet.icsi.berkeley.edu

landmark event: Democrats … Clinton; trajector event: they … 1992; entity: so … Clinton; degree: so; mass: resentment of Clinton; time: When … Clinton; required situation: they … to look … 1992; time: When … Clinton; cognizer agent: they; ground: much … 1992; sought entity: ?; topic: about … 1992; trajector event: the Big Lie … over; landmark period: 1992
SLIDE 34

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

words + frame

SLIDE 35

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

biLSTM (contextualized word vectors) · words + frame

SLIDE 36

parts: segments up to length d scored by another biLSTM, with labels

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

biLSTM (contextualized word vectors) · words + frame

SLIDE 37

parts: segments up to length d scored by another biLSTM, with labels

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

biLSTM (contextualized word vectors) · words + frame

output: covering sequence of nonoverlapping segments
SLIDE 38

Segmental RNN

(Lingpeng Kong, Chris Dyer, N.A.S., ICLR 2016)

biLSTM (contextualized word vectors) · input sequence
parts: segments up to length d, scored by another biLSTM, with labels
training objective: log loss

output: covering sequence of nonoverlapping segments, recovered in O(Ldn); see Sarawagi & Cohen, 2004
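The O(Ldn) claim is the semi-Markov Viterbi recurrence. A sketch, with a stand-in `score(i, j, label)` in place of the biLSTM segment scorer (the toy scorer and labels in the usage below are invented):

```python
def segmental_viterbi(n, d, labels, score):
    """Best covering sequence of nonoverlapping labeled segments (len <= d).

    O(L * d * n): n positions, up to d segment lengths, L labels
    (Sarawagi & Cohen, 2004).
    """
    best = [float("-inf")] * (n + 1)  # best[j]: max score over [0, j)
    best[0] = 0.0
    back = [None] * (n + 1)
    for j in range(1, n + 1):
        for i in range(max(0, j - d), j):        # candidate segment [i, j)
            for lab in labels:
                s = best[i] + score(i, j, lab)
                if s > best[j]:
                    best[j], back[j] = s, (i, lab)
    segs, j = [], n                               # recover the argmax
    while j > 0:
        i, lab = back[j]
        segs.append((i, j, lab))
        j = i
    return best[n], segs[::-1]
```

With a toy scorer that likes the span [0, 2) as a `cognizer`, the recovered segmentation covers all n tokens without overlap.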

SLIDE 39

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

SLIDE 40

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

SLIDE 41

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

[candidate segment labels: cognizer, topic, ∅]

SLIDE 42

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

[candidate-segment lattice: labels cognizer / topic / ∅ over spans; target: wonder (Cogitation)]

SLIDE 43

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

[candidate-segment lattice: labels cognizer / topic / ∅ over spans; target: wonder (Cogitation)]

SLIDE 44

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

[candidate-segment lattice: labels cognizer / topic / ∅ over spans; target: wonder (Cogitation)]

SLIDE 45

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

[candidate-segment lattice: labels cognizer / topic / ∅ over spans; target: wonder (Cogitation)]

SLIDE 46

Inference via dynamic programming in O(Ldn)

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

[candidate-segment lattice: labels cognizer / topic / ∅ over spans; target: wonder (Cogitation)]

SLIDE 47

Open-SESAME

(Swabha Swayamdipta, Sam Thomson, Chris Dyer, N.A.S., arXiv:1706.09528)

biLSTM (contextualized word vectors) · words + frame
parts: segments with role labels, scored by another biLSTM
training objective: recall-oriented softmax margin (Gimpel et al., 2010)

output: labeled argument spans

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

cognizer: Democrats topic: why there is so much resentment of Clinton
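The softmax-margin objective (Gimpel et al., 2010) can be sketched over an explicit candidate list; real training sums over exponentially many segmentations via the dynamic program, and the toy scores and costs below are made up. The cost term inflates wrong outputs inside the log-sum-exp, and a recall-oriented cost charges more for dropping a gold argument than for predicting a spurious one:

```python
import math

def softmax_margin_loss(scores, costs, gold):
    # cost-augmented log loss: -score(gold) + log sum_y exp(score(y) + cost(y));
    # costs[gold] is 0, and outputs that miss gold segments cost the most
    log_z = math.log(sum(math.exp(scores[y] + costs[y]) for y in scores))
    return log_z - scores[gold]
```

Raising the cost of recall errors raises the loss whenever such outputs are in the candidate set, pushing the model away from them.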

SLIDE 48

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

[candidate-segment lattice: labels cognizer / topic / ∅ over spans; target: wonder (Cogitation)]

SLIDE 49

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

[candidate-segment lattice: labels cognizer / topic / ∅ over spans; target: wonder (Cogitation)]
syntax features?

SLIDE 50

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

yes no yes yes no
Penn Treebank (Marcus et al., 1993)

SLIDE 51

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

main task: [candidate-segment lattice: labels cognizer / topic / ∅; target: wonder (Cogitation)]
scaffold task: yes no yes yes no

SLIDE 52

Multitask Representation Learning

(Caruana, 1997)

[two copies of the structure-prediction diagram, one per task: “gold” output, input representation, input (text), part representations, output (structure), training objective]

main task: find and label semantic arguments
scaffold task: predict syntactic constituents

shared
  • training datasets need not overlap
  • output structures need not be consistent
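The multitask setup can be sketched as a shared representation feeding two task heads, with the scaffold loss down-weighted in a joint objective. Everything here (the one-weight "encoder", the squared-error heads, the weight `alpha`) is an illustrative stand-in, not the paper's parameterization:

```python
import math

def shared_repr(x, w):
    # stand-in for the shared biLSTM encoder
    return math.tanh(w * x)

def main_loss(r, y, v):
    # main task head (e.g. argument labeling)
    return (v * r - y) ** 2

def scaffold_loss(r, z, u):
    # scaffold head (e.g. constituent yes/no prediction)
    return (u * r - z) ** 2

def joint_loss(batch_main, batch_scaffold, w, v, u, alpha=0.5):
    # the two batches may come from non-overlapping datasets;
    # only the shared parameter w ties the tasks together
    lm = sum(main_loss(shared_repr(x, w), y, v) for x, y in batch_main)
    ls = sum(scaffold_loss(shared_repr(x, w), z, u) for x, z in batch_scaffold)
    return lm + alpha * ls
```

Gradients on the scaffold term flow only into the shared encoder parameters, which is how the scaffold shapes the main task's representation.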
SLIDE 53
[Bar chart: F1 on frame-semantic parsing (frames & arguments), FrameNet 1.5 test set; single models vs. ensembles; open-source systems marked. Systems: SEMAFOR 1.0 (Das et al., 2014); SEMAFOR 2.0 (Kshirsagar et al., 2015); Open-SESAME (ours); … with syntactic scaffold (ours); … with syntax features (ours); Framat (Roth, 2016); FitzGerald et al. (2015).]
SLIDE 54
[Bar chart, repeated: F1 on frame-semantic parsing (frames & arguments), FrameNet 1.5 test set; single models vs. ensembles; open-source systems marked. Systems: SEMAFOR 1.0 (Das et al., 2014); SEMAFOR 2.0 (Kshirsagar et al., 2015); Open-SESAME (ours); … with syntactic scaffold (ours); … with syntax features (ours); Framat (Roth, 2016); FitzGerald et al. (2015).]
SLIDE 55

biLSTM (contextualized word vectors) · words + frame
training objective: recall-oriented softmax margin (Gimpel et al., 2010)

output: labeled argument spans

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

cognizer: Democrats topic: why there is so much resentment of Clinton

segments get scores

Bias?

parts: segments with role labels, scored by another biLSTM

SLIDE 56

biLSTM (contextualized word vectors) · words + frame
training objective: recall-oriented softmax margin (Gimpel et al., 2010)

output: labeled argument spans

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

cognizer: Democrats topic: why there is so much resentment of Clinton

segments get scores · syntactic scaffold

Bias?

parts: segments with role labels, scored by another biLSTM

SLIDE 57

biLSTM (contextualized word vectors) · words + frame
training objective: recall-oriented softmax margin (Gimpel et al., 2010)

output: labeled argument spans

When Democrats wonder [Cogitation] why there is so much resentment of Clinton, they don’t need …

cognizer: Democrats topic: why there is so much resentment of Clinton

segments get scores · recall-oriented cost · syntactic scaffold

Bias?

parts: segments with role labels, scored by another biLSTM

SLIDE 58

Semantic Dependency Graphs

(DELPH-IN minimal recursion semantics-derived representation; “DM”)

Oepen et al. (SemEval 2014; 2015), see also http://sdp.delph-in.net

SLIDE 59

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

Democrats wonder arg1

tanh(C [h_wonder ; h_Democrats] + b) · ψ_arg1
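That arc score can be written out directly. A sketch with plain Python lists, where the vectors h_wonder and h_Democrats, the matrix C, the bias b, and the label embedding ψ_arg1 are tiny made-up stand-ins for learned parameters:

```python
import math

def arc_score(h_head, h_dep, C, b, psi_label):
    # score = tanh(C [h_head ; h_dep] + b) . psi_label
    x = h_head + h_dep                       # concatenation [h_head ; h_dep]
    hidden = [math.tanh(sum(row[k] * x[k] for k in range(len(x))) + b_i)
              for row, b_i in zip(C, b)]     # squashed affine map
    return sum(h * p for h, p in zip(hidden, psi_label))
```

With one-dimensional toy vectors the score reduces to a single tanh times the label weight, which makes the formula easy to check by hand.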

SLIDE 60

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

[DM dependency graph over the sentence; bilexical arcs between word pairs: When–Democrats, Democrats–wonder, why–is, is–resentment, so–much, much–resentment, resentment–of, of–Clinton; labels: arg2, arg1, arg1, comp_so, arg1, arg1, arg1, arg1]
SLIDE 61

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

[DM dependency graph, continued; the arcs above plus wonder–why, wonder–there (arg1), and wonder–is (arg1)]

SLIDE 62

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

Inference via AD3

(alternating directions dual decomposition; Martins et al., 2014)

[same DM dependency graph as the previous slide: arcs among When, Democrats, wonder, why, there, is, so, much, resentment, of, Clinton; labels arg1, arg2, comp_so]

SLIDE 63

Neurboparser

(Hao Peng, Sam Thomson, N.A.S., ACL 2017)

biLSTM (contextualized word vectors) · words
parts: labeled bilexical dependencies
training objective: structured hinge loss

output: labeled semantic dependency graph with constraints

When Democrats wonder why there is so much resentment of Clinton, they don’t need …
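The structured hinge objective can likewise be sketched over an explicit candidate list rather than the real combinatorial search over graphs (the numbers in the usage below are toys):

```python
def structured_hinge_loss(scores, costs, gold):
    # max(0, max_y [score(y) + cost(y)] - score(gold)):
    # cost-augmented decoding finds the most "dangerous" competing output,
    # and the loss is how far it beats the gold score, clipped at zero
    best = max(scores[y] + costs[y] for y in scores)
    return max(0.0, best - scores[gold])
```

When the gold graph outscores every competitor by more than its cost, the loss is exactly zero, so well-separated examples contribute no gradient.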

SLIDE 64

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

Three Formalisms, Three Separate Parsers

formalism 1 (DM): Democrats–wonder arg1 · formalism 2 (PAS): Democrats–wonder arg1 · formalism 3 (PSD): Democrats–wonder act

SLIDE 65

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

Shared Input Representations

(Daumé, 2007)

shared across all · formalism 1 (DM): Democrats–wonder arg1 · formalism 2 (PAS): Democrats–wonder arg1 · formalism 3 (PSD): Democrats–wonder act

SLIDE 66

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

Cross-Task Parts

Democrats–wonder arg1 · Democrats–wonder arg1 · Democrats–wonder act · shared across all

SLIDE 67

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

Both: Shared Input Representations & Cross-Task Parts

shared across all · formalism 1 (DM): Democrats–wonder arg1 · formalism 2 (PAS): Democrats–wonder arg1 · formalism 3 (PSD): Democrats–wonder act

SLIDE 68

Multitask Learning: Many Possibilities

  • Shared input representations, parts? Which parts?
  • Joint decoding?
  • Overlapping training data?
  • Scaffold tasks?
“gold” output (×3, one per formalism)

formalism 1 (DM) formalism 2 (PAS) formalism 3 (PSD)

SLIDE 69
[Bar chart: F1 averaged over three semantic dependency parsing formalisms, SemEval 2015 test set; WSJ (in-domain) and Brown (out-of-domain). Systems: Du et al., 2015; Almeida & Martins, 2015 (no syntax); Almeida & Martins, 2015 (syntax); Neurboparser (ours); … with shared input representations and cross-task parts (ours).]

SLIDE 70

[Bar chart: Neurboparser F1 on three semantic dependency parsing formalisms (DM, PAS, PSD), SemEval 2015 test set; WSJ (in-domain) and Brown (out-of-domain).]

good enough?

SLIDE 71

biLSTM (contextualized word vectors) · words
training objective: structured hinge loss

output: labeled semantic dependency graph with constraints

When Democrats wonder why there is so much resentment of Clinton, they don’t need …

Bias?

cross-formalism sharing

parts: labeled bilexical dependencies

SLIDE 72

Text

SLIDE 73

Text ≠ Sentences

larger context

SLIDE 74

Generative Language Models

history → next word

p(W | history)

SLIDE 75

Generative Language Models

history → next word

p(W | history)
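A generative language model factors p(W | history) word by word; a count-based bigram model is the smallest instance (the corpus below is a made-up toy, with the history truncated to one word):

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    # estimate p(w | prev) from counts: p(W | history) with a one-word history
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    probs = defaultdict(dict)
    for (prev, w), c in pair_counts.items():
        probs[prev][w] = c / context_counts[prev]
    return probs

toy_corpus = "the cat sat on the mat".split()
p = train_bigram(toy_corpus)
```

The neural models in this talk replace the count table with a squash network over a learned summary of the full history.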

SLIDE 76

When Democrats wonder why there is so much resentment of

Entity Language Model

(Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, N.A.S., EMNLP 2017)

SLIDE 77

Entity Language Model

(Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, N.A.S., EMNLP 2017)

When Democrats wonder why there is so much resentment of

entity 1

SLIDE 78

Entity Language Model

(Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, N.A.S., EMNLP 2017)

When Democrats wonder why there is so much resentment of Clinton,

  1. new entity with a new vector
  2. mention word will be “Clinton”

entity 1 · entity 2

SLIDE 79

Entity Language Model

(Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, N.A.S., EMNLP 2017)

When Democrats wonder why there is so much resentment of Clinton,

entity 1 entity 2

SLIDE 80

Entity Language Model

(Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, N.A.S., EMNLP 2017)

When Democrats wonder why there is so much resentment of Clinton, they

  1. coreferent of entity 1 (previously known as “Democrats”)
  2. mention word will be “they”
  3. embedding of entity 1 will be updated

entity 1 · entity 2

SLIDE 81

Entity Language Model

(Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, N.A.S., EMNLP 2017)

When Democrats wonder why there is so much resentment of Clinton, they

entity 1 entity 2

SLIDE 82

Entity Language Model

(Yangfeng Ji, Chenhao Tan, Sebastian Martschat,

Yejin Choi, N.A.S., EMNLP 2017)

When Democrats wonder why there is so much resentment of Clinton, they don’t

  1. not part of an entity mention
  2. “don’t”

entity 1 · entity 2
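The entity bookkeeping in those steps can be sketched as a tiny store of entity vectors, updated at each mention. The interpolation update and the weight `lam` are illustrative stand-ins for the model's learned update, not the EMNLP 2017 parameterization:

```python
class EntityTracker:
    """Toy dynamic entity store: one vector per entity, updated per mention."""

    def __init__(self):
        self.entities = []                       # entity id -> vector

    def new_entity(self, mention_vec):
        # step 1 on the slides: a new entity gets a fresh vector
        self.entities.append(list(mention_vec))
        return len(self.entities) - 1

    def mention(self, ent_id, mention_vec, lam=0.5):
        # a coreferent mention pulls the stored vector toward the mention,
        # so "they" updates the entity introduced as "Democrats"
        old = self.entities[ent_id]
        self.entities[ent_id] = [(1 - lam) * a + lam * m
                                 for a, m in zip(old, mention_vec)]
        return self.entities[ent_id]
```

Words outside any mention (like “don’t”) simply leave the store untouched.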

SLIDE 83

[Charts: perplexity on the CoNLL 2012 test set (5-gram LM, RNN LM, entity language model); CoNLL 2012 coreference evaluation, MUC / B3 / CEAF / CoNLL F1 (Martschat and Strube, 2015, vs. reranked with entity LM); accuracy on InScript (always new, shallow features, Modi et al., 2017, entity LM, human).]

SLIDE 84

history → next word

p(W | history)

entities

Bias?

SLIDE 85

Bias in the Future?

  • Linguistic scaffold tasks.
SLIDE 86

Bias in the Future?

  • Linguistic scaffold tasks.
  • Language is by and about people.
SLIDE 87

Bias in the Future?

  • Linguistic scaffold tasks.
  • Language is by and about people.
  • NLP is needed when texts are costly to read.
SLIDE 88

Bias in the Future?

  • Linguistic scaffold tasks.
  • Language is by and about people.
  • NLP is needed when texts are costly to read.
  • Polyglot learning.
SLIDE 89

data bias

SLIDE 90

Thank you!