Structured Predictions: Practical Advancements and Applications - PowerPoint PPT Presentation

Structured Predictions: Practical Advancements and Applications Kai-Wei Chang University of Virginia Department of Computer Science References: http://kwchang.net/talks/sp.html Kai-Wei Chang (http://kwchang.net/talks/sp.html) 1

Supervised learning Input Output Target function y = 𝑔 ∗ ( x ) x ∈ X y ∈ Y Learned Model y = 𝑔 𝑦 An item y An item x drawn from a label drawn from an space Y instance space X Kai-Wei Chang (http://kwchang.net/talks/sp.html) 2

Q: [Chris] = [Mr. Robin] ? Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy , Chris lived in a pretty home called Cotchfield Farm . When Chris was three years old, his father wrote a poem about him . The poem was printed in a magazine for others to read. Mr. Robin then wrote a book Slide modified from Dan Roth Kai-Wei Chang (http://kwchang.net/talks/sp.html) 3

Complex Decision Structure Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy , Chris lived in a pretty home called Cotchfield Farm . When Chris was three years old, his father wrote a poem about him . The poem was printed in a magazine for others to read. Mr. Robin then wrote a book Kai-Wei Chang (http://kwchang.net/talks/sp.html) 4

Why is structure important? Hand written recognition example v What is this letter? Kai-Wei Chang (http://kwchang.net/talks/sp.html) 5

Structured Prediction Assign values to a set of interdependent output variables Task Input Output Part-of-speech They operate Pronoun Verb Noun And Noun Tagging ships and banks. Dependency They operate Root They operate ships and banks . Parsing ships and banks. Segmentation Kai-Wei Chang (http://kwchang.net/talks/sp.html) 6

Challenge: Scalability Issues Algorithm 2 is shown to perform a local-optimality guarantee. Robin is alive and well. He is the same better Berg-Kirkpatrick, ACL Bill Clinton , recently elected as the President of Can learning to search work even Methods for learning to search Consequently, LOLS can Robin is alive and well. He is the person that you read about in the book, 2010. It can also be expected to the USA , has been invited by the Russian when the reference is poor? for structured prediction typically improve upon the reference President] , [Vladimir Putin , to visit Russia . same person that you read about Winnie the Pooh. As a boy, Chris lived converge faster -- anyway, the E- President Clinton said that he looks forward to We provide a new learning to imitate a reference policy, with policy, unlike previous in the book, Winnie the Pooh. As in a pretty home called Cotchfield step changes the auxiliary strengthening ties between U SA and Russia search algorithm, LOLS, which existing theoretical guarantees algorithms. This enables us to a boy, Chris lived in a pretty Farm. When Chris was three years old, function by changing the does well relative to the demonstrating low regret develop structured contextual home called Cotchfield Farm. his father wrote a poem about him. The expected counts, so there's no reference policy, but additionally compared to that reference. This bandits, a partial information When Chris was three years old, poem was printed in a magazine for point in finding a local maximum guarantees low regret compared is unsatisfactory in many structured prediction setting with his father wrote a poem about others to read. Mr. Robin then wrote a of the auxiliary to deviations from the learned applications where the reference many potential applications. him. The poem was printed in a book function in each iteration policy. policy is suboptimal and the goal magazine for others to read. Mr. of learning is to Robin then wrote a book v Large amount of data v Complex decision structure Kai-Wei Chang (http://kwchang.net/talks/sp.html) 7

Solution Methods v Assume a graphical structure; optimize v Use within various structured predictions algorithms (e.g., CRF, Structured Perceptron, M3N, Structured SVM) [Lafferty+ 01, Collins02, Taskar04] v See our AAAI16 tutorial ( https://goo.gl/TF7cGj) v Learning to search approaches v Assume the complex decision is incrementally constructed by a sequence of decisions v E.g., LASO, dagger, Searn, transition-based methods v See our NAACL15 tutorials (http://hunch.net/~l2s) Kai-Wei Chang (http://kwchang.net/talks/sp.html) 8

Example: Dependency Parsing v Identifying relations between words I ate a cake with a fork I ate a cake with a fork Kai-Wei Chang (http://kwchang.net/talks/sp.html) 9

Graphical Model Approaches: Graph-Based Parser [McDonald+. 2005] v Consider all word pairs and assign scores v Score of a tree = sum of score of edges v Can be formulated as a MST problem v Chu-Liu-Edmonds Kai-Wei Chang (http://kwchang.net/talks/sp.html) 10

Learning to search approaches Shift-Reduce parser [Nivre03,NIPS16] v Maintain a buffer and a stack v Make predictions from left to right v Three (four) types of actions: Shift, Reduce-Left, Reduce-Right Credit: Google research blog Kai-Wei Chang (http://kwchang.net/talks/sp.html) 11

What We Care about 65 Stanford Chen+ Ours (2012) 60 Martschat+ Ours (2013) Fernandes+ 55 HOTCoref Berkeley Ours (2015) 50 Prediction accuracy Training/test/dev speed activity cooking agent woman food vegetable Learning signals Fairness (data biases) Kai-Wei Chang (http://kwchang.net/talks/sp.html) 12

Outline 65 Stanford Chen+ Ours (2012) 60 Martschat+ Ours (2013) Fernandes+ 55 HOTCoref Berkeley Ours (2015) 50 Prediction accuracy Training/test/dev speed activity cooking agent woman food vegetable Learning signals Fairness (data biases) Kai-Wei Chang (http://kwchang.net/talks/sp.html) 13

Structured prediction application: ESL Grammar Error Correction [CoNLL 13, 14] They believe that such situation must be avoided. O situation P a situation P situations O a situations Kai-Wei Chang (http://kwchang.net/talks/sp.html) 14

Structured prediction application: Algebra Word Problems [EMNLP 16] Problem: Equations: Solution: 𝑛 = 40,𝑜 = 10 Kai-Wei Chang (http://kwchang.net/talks/sp.html) 15

Structured prediction application: Co-reference Resolution Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh . As a boy , Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him . The poem was printed in a magazine for others to read. Mr. Robin then wrote a book Kai-Wei Chang (http://kwchang.net/talks/sp.html) 16

Structured prediction application: Co-reference Resolution [EMNLP 13a, ICML14, CoNLL 11,12, 15] Proposed a novel, principled, linguistically motivated model 65 Winner of the CoNLL ST 12 Stanford Chen+ Ours (2012) 60 Martschat+ Performance* Winner of the CoNLL ST 11 Ours (2013) Fernandes+ 55 HOTCoref Berkeley Ours (2015) 50 *Avg ( MUC, B 3 , CEAF ) Latent forest structure The state-of-the-artapproachusingNN&RL achieves 65.73(Clark+16) 17

Co-reference Resolution Demo http://bit.ly/illinoisCoref Kai-Wei Chang (http://kwchang.net/talks/sp.html) 18

Co-reference Resolution v Learn a pairwise similarity measure Christopher Robin is alive and well. He is (local predictor) the same person that you read about in the Example features: book, Winnie the Pooh . As a boy , Chris v same sub-string? lived in a pretty home positions in the paragraph v called Cotchfield Farm. When Chris other 30+ feature types v was three years old, v Key components: his father wrote a poem about him . The v Pairwise classification poem was printed in a magazine for others to v Clustering (jointly or not?) read. Mr. Robin then wrote a book Kai-Wei Chang (http://kwchang.net/talks/sp.html) 19

Decoupling Approach A heuristic to learn the model [Soon+ 01, Bengtson+ 08,CoNLL11] v Decouple learning and inference: Learn a pairwise similarity function Cluster based on this function Kai-Wei Chang (http://kwchang.net/talks/sp.html) 20

Decoupling Approach-Learning As a boy , Chris 1 lived in a pretty home called Cotchfield Farm. When Chris 2 was three years old, his father 3 wrote a poem about him 4 . The poem was printed in a magazine for others to read. Mr. Robin 5 then wrote a book Positive Samples Negative Samples (Chris 1 , him 4 ) (Chris 1 , his father 3 ) (Chris 2 , his father 3 ) ( Chris 2 , him 4 ) (him 4 , his father 3 ) (Chris 1 , Chris 2 ) (Chris 1 , Mr. Robin 5 ) (his father 3 , Mr. Robin 5 ) (Chris 2 , Mr. Robin 5 ) (him 4 , Mr. Robin 5 ) Kai-Wei Chang (http://kwchang.net/talks/sp.html) 21

Greedy Best-Left-Link Clustering [Bill Clinton] , recently elected as the [President of the USA], has been invited by the [Russian President], [Vladimir Putin], to visit [Russia]. [President Clinton] said that [he] looks forward to strengthening ties between [USA] and [Russia] . Kai-Wei Chang (http://kwchang.net/talks/sp.html) 22

Greedy Best-Left-Link Clustering [Bill Clinton] , recently elected as the [President of the USA] , has been invited by the [Russian President], [Vladimir Putin], to visit [Russia]. [President Clinton] said that [he] looks forward to strengthening ties between [USA] and [Russia] . Kai-Wei Chang (http://kwchang.net/talks/sp.html) 23

Structured Predictions: Practical Advancements and Applications - PowerPoint PPT Presentation

Structured Predictions: Practical Advancements and Applications Kai-Wei Chang University of Virginia Department of Computer Science References: http://kwchang.net/talks/sp.html Kai-Wei Chang (http://kwchang.net/talks/sp.html) 1 Supervised

A STRUCTURED L IFE A STRUCTURED L IFE A STRUCTURED L IFE A STRUCTURED L IFE A STRUCTURED L IFE

1 Predictions for 2020 Predictions for 2020 We will live in flying houses. 1966

Structured Prediction Introduction What is structured prediction? CS 6355: Structured Prediction

Scaling Log-Structured KV-Stores featuring Monkey and Dostoevsky SIGMOD17 / SIGMOD18 Niv Dayan

Machine Learning Fall 2017 Structured Prediction (structured perceptron, HMM, structured SVM)

Structured Electronic Design Structured Electronic Design ET 8016 5 ECTS credits 1

L101: Introduction to Structured Prediction Ryan Cotterell What is structured prediction?

Semi-structured data Data is not just text, but is not as well- Semi-structured data

Introduction to SparkSQL Structured Data Processing in Spark 1 Structured Data Processing A

Variational Inference for Tutorial Outline Structured NLP Models 1. Structured Models and Factor

Time Predictions in Uber Eats Zi Wang@Uber QCon New York 2019 June 2019 Agenda 1. ML in Uber

On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study

Greater Manchester Cricket A Structured Approach A Structured Approach Introductions John

Structured Finance Department Who we are An excellent structured financier in the local market

Structured Education Referral to structured education following diagnosis is recommended by

(XML from Chapter 20 of text) Outline Why Structured Data? Types of Structured Data

Block stochastic gradient update method Yangyang Xu and Wotao Yin IMA, University of

Emotions ...in life Biblical wisdom for our feelings ...in counselling ...in Scripture

DEFENSE INITIA DEFENSE INITIATED TED VICTIM VICTIM OUTRE OUTREACH Kate Siska Mitigation

Council of Graduate Coordinators and Staff (CGCS) Meeting January 10, 2014 Agenda Items

MDL and the complexity of natural language John Goldsmith University of Chicago/CNRS MoDyCo

Performance Measures: Stochastic Optimization & Statistical Consistency Harikrishna Narasimhan

Information Theory on Convex sets In celebration of Prof. Shunichi Amaris 80 years birthday

Model Combination in Multiclass Classification Sam Reid Advisors: Mike Mozer, Greg Grudic

Structured Predictions: Practical Advancements and Applications - PowerPoint PPT Presentation

Structured Predictions: Practical Advancements and Applications Kai-Wei Chang University of Virginia Department of Computer Science References: http://kwchang.net/talks/sp.html Kai-Wei Chang (http://kwchang.net/talks/sp.html) 1 Supervised

A STRUCTURED L IFE A STRUCTURED L IFE A STRUCTURED L IFE A STRUCTURED L IFE A STRUCTURED L IFE

1 Predictions for 2020 Predictions for 2020 We will live in flying houses. 1966

Structured Prediction Introduction What is structured prediction? CS 6355: Structured Prediction

Scaling Log-Structured KV-Stores featuring Monkey and Dostoevsky SIGMOD17 / SIGMOD18 Niv Dayan

Machine Learning Fall 2017 Structured Prediction (structured perceptron, HMM, structured SVM)

Structured Electronic Design Structured Electronic Design ET 8016 5 ECTS credits 1

L101: Introduction to Structured Prediction Ryan Cotterell What is structured prediction?

Semi-structured data Data is not just text, but is not as well- Semi-structured data

Introduction to SparkSQL Structured Data Processing in Spark 1 Structured Data Processing A

Variational Inference for Tutorial Outline Structured NLP Models 1. Structured Models and Factor

Time Predictions in Uber Eats Zi Wang@Uber QCon New York 2019 June 2019 Agenda 1. ML in Uber

On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study

Greater Manchester Cricket A Structured Approach A Structured Approach Introductions John

Structured Finance Department Who we are An excellent structured financier in the local market

Structured Education Referral to structured education following diagnosis is recommended by

(XML from Chapter 20 of text) Outline Why Structured Data? Types of Structured Data

Block stochastic gradient update method Yangyang Xu and Wotao Yin IMA, University of

Emotions ...in life Biblical wisdom for our feelings ...in counselling ...in Scripture

DEFENSE INITIA DEFENSE INITIATED TED VICTIM VICTIM OUTRE OUTREACH Kate Siska Mitigation

Council of Graduate Coordinators and Staff (CGCS) Meeting January 10, 2014 Agenda Items

MDL and the complexity of natural language John Goldsmith University of Chicago/CNRS MoDyCo

Performance Measures: Stochastic Optimization &amp; Statistical Consistency Harikrishna Narasimhan

Information Theory on Convex sets In celebration of Prof. Shunichi Amaris 80 years birthday

Model Combination in Multiclass Classification Sam Reid Advisors: Mike Mozer, Greg Grudic

Performance Measures: Stochastic Optimization & Statistical Consistency Harikrishna Narasimhan