Greed is Good if Randomized: New Inference for Dependency - PowerPoint PPT Presentation

Greed is Good if Randomized: New Inference for Dependency Parsing
Yuan Zhang, CSAIL, MIT
Joint work with Tao Lei, Regina Barzilay, and Tommi Jaakkola

Inference vs. Scoring
[Figure: methods placed along two axes, inference (exact vs. approximate) and scoring function (expressive vs. limited): Minimum Spanning Tree, Reranking, Dual Decomposition]
• Reranking: incorporate arbitrary features
• Dual Decomposition: search in full space

Parsing Complexity
• High-order parsing is NP-hard (McDonald et al., 2006)
• Hypothesis: parsing is easy on average
• Many NP-hard problems are easy on average
  - MAX-SAT (Resende et al., 1997)
  - Set cover (Hochbaum, 1982)
We show
• Analysis on average parsing complexity
• A simple inference algorithm based on the analysis

Our Approach
[Figure: Our Approach placed alongside Minimum Spanning Tree, Dual Decomposition, and Reranking on the inference (exact vs. approximate) and scoring function (expressive vs. limited) axes]
• Reranking: incorporate arbitrary features
• Dual Decomposition: search in full space

Core Idea
• Climb to the optimal tree in a few small greedy steps
Randomized Hill-climbing
For k = 1 to K
  1) Randomly sample a dependency tree
  2) Greedily improve the tree one edge at a time
  3) Repeat (2) until convergence
Select the tree with the highest score
That's it!
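
The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a tree is a list of head indices (0 is ROOT), the initial tree is drawn by simple rejection sampling, and the greedy step re-picks one word's head at a time. The names `is_tree`, `random_tree`, `hill_climb`, `randomized_greedy`, `GOLD`, and `toy_score` are all hypothetical, and the toy scoring function simply counts heads that match an assumed parse of the example sentence.

```python
import random

def is_tree(heads):
    """heads[m-1] is the head of word m (words are 1..n, 0 is ROOT).
    Valid iff every word reaches ROOT by following head pointers."""
    for m in range(1, len(heads) + 1):
        seen, node = set(), m
        while node != 0:
            if node in seen:          # cycle (includes self-loops)
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

def random_tree(n, rng):
    """Step 1: rejection-sample head vectors until one is a valid tree."""
    while True:
        heads = [rng.choice([h for h in range(n + 1) if h != m])
                 for m in range(1, n + 1)]
        if is_tree(heads):
            return heads

def hill_climb(heads, score):
    """Steps 2-3: change one head at a time while the score strictly improves."""
    heads = list(heads)
    n = len(heads)
    improved = True
    while improved:
        improved = False
        for m in range(1, n + 1):
            best_h, best_s = heads[m - 1], score(heads)
            for h in range(n + 1):
                if h == m:
                    continue
                cand = heads[:m - 1] + [h] + heads[m:]
                if is_tree(cand) and score(cand) > best_s:
                    best_h, best_s = h, score(cand)
            if best_h != heads[m - 1]:
                heads[m - 1] = best_h
                improved = True
    return heads

def randomized_greedy(n, score, restarts=10, seed=0):
    """Run K restarts and keep the highest-scoring converged tree."""
    rng = random.Random(seed)
    best, best_s = None, float("-inf")
    for _ in range(restarts):
        cand = hill_climb(random_tree(n, rng), score)
        if score(cand) > best_s:
            best, best_s = cand, score(cand)
    return best

# Assumed parse of "I ate an apple today": I<-ate, ate<-ROOT, an<-apple,
# apple<-ate, today<-ate (illustrative only).
GOLD = [2, 0, 4, 2, 2]

def toy_score(heads):
    """Hypothetical first-order score: number of heads that agree with GOLD."""
    return sum(h == g for h, g in zip(heads, GOLD))

print(randomized_greedy(5, toy_score, restarts=3))  # [2, 0, 4, 2, 2]
```

The `score` argument is an arbitrary callable on whole trees, which is the point of the approach: nothing in the search requires the score to decompose over arcs. The toy score happens to have a single optimum, so every restart reaches it; a realistic scoring function would not have that property.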

It Works!
[Figure: Parsing Performance on CoNLL Dataset]
• Turbo (Dual Decomposition): 88.73%
• Our Full: 89.44%

Example
"I ate an apple today"
[Figure: animation of hill-climbing on this sentence. The initial tree, with "apple" directly under ROOT, is changed one head at a time until it matches the target tree.]
Target tree (as drawn): ROOT → ate; ate → I, apple, today; apple → an

Why Greedy Has a Chance to Work
[Figure: a sequence of trees from $y^{(0)}$ to $y^{(T)}$]
Reachability: transforming any tree to any other tree
• maintaining a valid tree structure at every point
• using as few as d steps (d: number of head differences, i.e. Hamming distance)
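
The reachability claim can be checked on a small case. The sketch below assumes one particular construction: heads are fixed in order of increasing depth in the target tree, so each word's new head already has its final path to ROOT and no cycle can form. The function names and the initial tree `src` are hypothetical; the slides do not specify this ordering.

```python
def is_tree(heads):
    """heads[m-1] is the head of word m (0 is ROOT); valid iff no cycles."""
    for m in range(1, len(heads) + 1):
        seen, node = set(), m
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

def hamming(a, b):
    """d: number of words whose heads differ between the two trees."""
    return sum(x != y for x, y in zip(a, b))

def target_depth(tgt, m):
    """Distance from word m to ROOT in the target tree (tgt must be a valid tree)."""
    d = 0
    while m != 0:
        m = tgt[m - 1]
        d += 1
    return d

def transform(src, tgt):
    """Return the list of trees visited when fixing one head per step,
    processing words top-down with respect to the target tree."""
    cur = list(src)
    path = [list(cur)]
    order = sorted(range(1, len(tgt) + 1), key=lambda m: target_depth(tgt, m))
    for m in order:
        if cur[m - 1] != tgt[m - 1]:
            cur[m - 1] = tgt[m - 1]
            assert is_tree(cur)      # every intermediate structure is a tree
            path.append(list(cur))
    return path

# Words: 1=I, 2=ate, 3=an, 4=apple, 5=today. Both trees are illustrative.
src = [2, 4, 2, 0, 4]   # hypothetical initial tree with "apple" under ROOT
tgt = [2, 0, 4, 2, 2]   # assumed target: "ate" under ROOT
path = transform(src, tgt)
print(hamming(src, tgt), len(path) - 1)  # 4 4
```

The number of steps equals the Hamming distance because each step corrects exactly one head and never disturbs a head that is already correct.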

Greedy Hill-climbing
[Figure: a sequence of trees from $y^{(0)}$ to $y^{(T)}$, with $S(x, y^{(t)})$ increasing at each step]
Arbitrary features in the scoring function

Challenge: Local Optimum
[Figure: hill-climbing from $y^{(0)}$ to $y^{(T)}$ increases $S(x, y^{(t)})$; a plot of score $S$ against tree $y$ marks a global optimum and a local optimum]

Hill-climbing with Restarts
[Figure: several runs, each starting from a random initialization $y^{(0)}$ (e.g. uniform) and hill-climbing to $y^{(T)}$; the max over runs is selected. A plot of score $S$ against tree $y$ accompanies the runs.]
Overcome local optima via restarts

Learning Algorithm
• Follow common max-margin framework
  $S(x, \hat{y}) \geq S(x, y) + |\hat{y} - y| - \xi \quad \forall y \in T(x)$
  - $\hat{y}$ is the gold tree
• Adopt passive-aggressive online learning framework (Crammer et al. 2006)
• Decode with our randomized greedy algorithm
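
One way to read the constraint and the passive-aggressive step together is the sketch below. It makes several assumptions that the slides do not state: $|\hat{y} - y|$ is taken to be the Hamming distance between head vectors, the features are plain arc indicators, and the step size follows the PA-I rule $\tau = \min(C, \text{loss} / \|\Delta\phi\|^2)$. The names `arc_features`, `score`, and `pa_update` are hypothetical, and `pred` stands in for whatever tree the decoder returns.

```python
def hamming(a, b):
    """Assumed cost |y_hat - y|: number of words with different heads."""
    return sum(x != y for x, y in zip(a, b))

def arc_features(heads):
    """Hypothetical first-order features: one indicator per (head, modifier) arc."""
    return {(h, m): 1.0 for m, h in enumerate(heads, start=1)}

def score(w, heads):
    """Linear score S(x, y) = w . phi(x, y) over sparse features."""
    return sum(w.get(f, 0.0) * v for f, v in arc_features(heads).items())

def pa_update(w, gold, pred, C=1.0):
    """One PA-I-style step on a violated margin constraint; returns the step size."""
    loss = score(w, pred) + hamming(gold, pred) - score(w, gold)
    if loss <= 0:
        return 0.0                      # constraint already satisfied: stay passive
    fg, fp = arc_features(gold), arc_features(pred)
    delta = {f: fg.get(f, 0.0) - fp.get(f, 0.0) for f in set(fg) | set(fp)}
    norm2 = sum(v * v for v in delta.values())
    if norm2 == 0:
        return 0.0
    tau = min(C, loss / norm2)          # aggressive step, capped at C
    for f, v in delta.items():
        w[f] = w.get(f, 0.0) + tau * v
    return tau

w = {}
gold = [2, 0, 4, 2, 2]   # illustrative gold tree
pred = [2, 4, 2, 0, 4]   # illustrative decoder output
tau = pa_update(w, gold, pred)
print(tau, score(w, gold) - score(w, pred))  # 0.5 4.0
```

After this single step the gold tree outscores the predicted tree by exactly their Hamming distance, so the constraint holds for this pair with $\xi = 0$.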

Analysis
[Table: columns Theoretical and Empirical; rows First-order and High-order; one cell is marked "?"]

Search Space Complexity: First-order
• 10 words
• ≈ 2 billion trees
• < 512 local optima

Search Space Complexity: First-order
Theorem: For any first-order scoring function:
• there are at most $2^{n-1}$ locally optimal trees
• this upper bound is tight
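
The figures for a 10-word sentence can be reproduced with a short calculation. The sketch assumes the tree count comes from Cayley's formula applied to n words plus ROOT, with ROOT allowed more than one child; the slides do not say how the count was obtained, but this reading gives roughly 2.36 billion, in line with the "≈ 2 billion" quoted. The function names are hypothetical, and the brute-force count is included only as a sanity check on small n.

```python
from itertools import product

def num_dependency_trees(n):
    """Cayley's formula: labeled trees on n + 1 nodes (n words plus ROOT)."""
    return (n + 1) ** (n - 1)

def local_optima_bound(n):
    """Upper bound from the theorem for first-order scoring functions."""
    return 2 ** (n - 1)

def is_tree(heads):
    """heads[m-1] is the head of word m (0 is ROOT); valid iff no cycles."""
    for m in range(1, len(heads) + 1):
        seen, node = set(), m
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

def count_by_enumeration(n):
    """Brute-force count of valid head vectors; feasible only for small n."""
    return sum(is_tree(list(h)) for h in product(range(n + 1), repeat=n))

print(num_dependency_trees(10))   # 2357947691
print(local_optima_bound(10))     # 512
print(count_by_enumeration(4))    # 125
```

For 10 words the bound leaves at most 512 candidate local optima out of more than two billion trees, which is the gap the restart strategy relies on.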
