
Structured Predictions: Practical Advancements and Applications



  1. Structured Predictions: Practical Advancements and Applications
     Kai-Wei Chang, University of Virginia, Department of Computer Science
     References: http://kwchang.net/talks/sp.html
     Kai-Wei Chang (http://kwchang.net/talks/sp.html)

  2. Supervised learning
     - Input: x ∈ X, an item x drawn from an instance space X
     - Output: y ∈ Y, an item y drawn from a label space Y
     - Target function: y = g*(x)
     - Learned model: y = g(x)

  3. Q: [Chris] = [Mr. Robin]?
     Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
     Slide modified from Dan Roth

  4. Complex Decision Structure
     Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

  5. Why is structure important?
     Handwriting recognition example
     - What is this letter?

  6. Structured Prediction
     Assign values to a set of interdependent output variables

     Task                     Input                           Output
     Part-of-speech tagging   They operate ships and banks.   Pronoun Verb Noun And Noun
     Dependency parsing       They operate ships and banks.   Tree over: Root They operate ships and banks .
     Segmentation

  7. Challenge: Scalability Issues
     - Large amount of data
     - Complex decision structure
     [Figure: overlapping excerpts from several documents]

  8. Solution Methods
     - Assume a graphical structure; optimize
     - Use within various structured prediction algorithms (e.g., CRF, Structured Perceptron, M3N, Structured SVM) [Lafferty+ 01, Collins02, Taskar04]
     - See our AAAI16 tutorial (https://goo.gl/TF7cGj)
     - Learning to search approaches
       - Assume the complex decision is incrementally constructed by a sequence of decisions
       - E.g., LASO, DAgger, Searn, transition-based methods
       - See our NAACL15 tutorials (http://hunch.net/~l2s)

  9. Example: Dependency Parsing
     - Identifying relations between words
     I ate a cake with a fork

  10. Graphical Model Approaches: Graph-Based Parser [McDonald+ 2005]
      - Consider all word pairs and assign scores
      - Score of a tree = sum of the scores of its edges
      - Can be formulated as an MST problem
      - Chu-Liu-Edmonds
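
The edge-factored scoring on this slide can be illustrated with a small sketch. It is not the Chu-Liu-Edmonds algorithm: it enumerates every head assignment by brute force, which is only feasible for tiny sentences. The score matrix, the three-word sentence, and the function names are all made up for illustration.

```python
from itertools import product

def is_tree(heads):
    """heads[m-1] is the head of word m (0 = ROOT).
    Valid if every word reaches ROOT without revisiting a node."""
    for start in range(1, len(heads) + 1):
        seen, node = set(), start
        while node != 0:
            if node in seen:          # cycle (includes self-loops)
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

def tree_score(heads, scores):
    """Score of a tree = sum of the scores of its (head, modifier) edges."""
    return sum(scores[h][m] for m, h in enumerate(heads, start=1))

def best_tree(scores):
    """Brute-force argmax over all valid trees (exponential; toy sizes only)."""
    n = len(scores) - 1
    best_heads, best = None, float("-inf")
    for heads in product(range(n + 1), repeat=n):
        if is_tree(heads):
            s = tree_score(heads, scores)
            if s > best:
                best_heads, best = heads, s
    return best_heads, best

# Hypothetical scores[h][m] for "I ate cake"; index 0 is ROOT.
scores = [
    [0, 1, 10, 1],   # ROOT -> I, ate, cake
    [0, 0, 2, 1],    # I    -> ...
    [0, 9, 0, 8],    # ate  -> I, cake
    [0, 3, 2, 0],    # cake -> I, ate
]
heads, total = best_tree(scores)
print(heads, total)  # head of each word, and the summed edge score
```

A real graph-based parser would replace `best_tree` with a maximum spanning arborescence routine such as Chu-Liu-Edmonds, and the scores would come from a learned model.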

  11. Learning to search approaches: Shift-Reduce parser [Nivre03, NIPS16]
      - Maintain a buffer and a stack
      - Make predictions from left to right
      - Three (four) types of actions: Shift, Reduce-Left, Reduce-Right
      Credit: Google research blog
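
A minimal sketch of the stack-and-buffer mechanics, assuming an arc-standard convention: Reduce-Left makes the top of the stack the head of the item below it, and Reduce-Right does the reverse. Naming conventions differ between papers, so treat this mapping as an assumption. The action sequence is supplied by hand here; in a learned parser a classifier would predict each action.

```python
def parse(words, actions):
    """Apply a sequence of shift-reduce actions and return the head of each word.
    Words are referred to by index; the word left on the stack is the root."""
    stack, buffer = [], list(range(len(words)))
    heads = {}
    for action in actions:
        if action == "shift":
            stack.append(buffer.pop(0))   # move next word onto the stack
        elif action == "reduce_left":
            dep = stack.pop(-2)           # second-from-top becomes a dependent
            heads[dep] = stack[-1]        # ... of the top of the stack
        elif action == "reduce_right":
            dep = stack.pop()             # top of the stack becomes a dependent
            heads[dep] = stack[-1]        # ... of the item below it
        else:
            raise ValueError(f"unknown action: {action}")
    return heads, stack, buffer

words = ["I", "ate", "a", "cake"]
actions = ["shift", "shift", "reduce_left",    # I <- ate
           "shift", "shift", "reduce_left",    # a <- cake
           "reduce_right"]                     # ate -> cake
heads, stack, buffer = parse(words, actions)
for dep, head in sorted(heads.items()):
    print(f"{words[dep]} <- {words[head]}")
print("root:", words[stack[0]])
```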

  12. What We Care about
      - Prediction accuracy
      - Training/test/dev speed
      - Learning signals
      - Fairness (data biases)
      [Figure: bar chart comparing Stanford, Chen+, Martschat+, Fernandes+, HOTCoref, Berkeley, and Ours (2012, 2013, 2015); axis ticks 50 to 65]
      [Figure labels: activity: cooking; agent: woman; food: vegetable]

  13. Outline
      - Prediction accuracy
      - Training/test/dev speed
      - Learning signals
      - Fairness (data biases)

  14. Structured prediction application: ESL Grammar Error Correction [CoNLL 13, 14]
      They believe that such situation must be avoided.
      - ✗ situation
      - ✓ a situation
      - ✓ situations
      - ✗ a situations

  15. Structured prediction application: Algebra Word Problems [EMNLP 16]
      Problem:
      Equations:
      Solution: n = 40, o = 10

  16. Structured prediction application: Co-reference Resolution
      Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

  17. Structured prediction application: Co-reference Resolution [EMNLP 13a, ICML14, CoNLL 11, 12, 15]
      - Proposed a novel, principled, linguistically motivated model
      - Latent forest structure
      - Winner of the CoNLL ST 11
      - Winner of the CoNLL ST 12
      - The state-of-the-art approach using NN & RL achieves 65.73 (Clark+ 16)
      [Figure: bar chart of Performance* (axis ticks 50 to 65) for Stanford, Chen+, Ours (2012), Martschat+, Ours (2013), Fernandes+, HOTCoref, Berkeley, Ours (2015). *Avg(MUC, B3, CEAF)]

  18. Co-reference Resolution Demo: http://bit.ly/illinoisCoref

  19. Co-reference Resolution
      - Learn a pairwise similarity measure (local predictor)
      - Example features:
        - same sub-string?
        - positions in the paragraph
        - other 30+ feature types
      - Key components:
        - Pairwise classification
        - Clustering (jointly or not?)
      Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

  20. Decoupling Approach
      A heuristic to learn the model [Soon+ 01, Bengtson+ 08, CoNLL11]
      - Decouple learning and inference:
        - Learn a pairwise similarity function
        - Cluster based on this function

  21. Decoupling Approach: Learning
      As a boy, [Chris]_1 lived in a pretty home called Cotchfield Farm. When [Chris]_2 was three years old, [his father]_3 wrote a poem about [him]_4. The poem was printed in a magazine for others to read. [Mr. Robin]_5 then wrote a book
      Positive samples: (Chris_1, him_4), (Chris_2, him_4), (Chris_1, Chris_2), (his father_3, Mr. Robin_5)
      Negative samples: (Chris_1, his father_3), (Chris_2, his father_3), (him_4, his father_3), (Chris_1, Mr. Robin_5), (Chris_2, Mr. Robin_5), (him_4, Mr. Robin_5)
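
The positive and negative pairs on this slide can be generated mechanically from a gold clustering, as in the sketch below. The cluster labels ("Chris" and "father") are an assumption read off the passage, and `pairwise_samples` is a hypothetical helper name, not something from the talk.

```python
from itertools import combinations

def pairwise_samples(mentions, cluster_of):
    """Split all mention pairs into positives (same gold cluster)
    and negatives (different gold clusters)."""
    positives, negatives = [], []
    for a, b in combinations(mentions, 2):
        if cluster_of[a] == cluster_of[b]:
            positives.append((a, b))
        else:
            negatives.append((a, b))
    return positives, negatives

# Mentions in document order, with an assumed gold clustering.
mentions = ["Chris_1", "Chris_2", "his father_3", "him_4", "Mr. Robin_5"]
cluster_of = {
    "Chris_1": "Chris",
    "Chris_2": "Chris",
    "him_4": "Chris",
    "his father_3": "father",
    "Mr. Robin_5": "father",
}

positives, negatives = pairwise_samples(mentions, cluster_of)
print(len(positives), "positive pairs:", positives)
print(len(negatives), "negative pairs:", negatives)
```

A pairwise classifier would then be trained on these labeled pairs, independently of the clustering step applied at inference time.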

  22. Greedy Best-Left-Link Clustering
      [Bill Clinton], recently elected as the [President of the USA], has been invited by the [Russian President], [Vladimir Putin], to visit [Russia]. [President Clinton] said that [he] looks forward to strengthening ties between [USA] and [Russia].

  23. Greedy Best-Left-Link Clustering
      [Bill Clinton], recently elected as the [President of the USA], has been invited by the [Russian President], [Vladimir Putin], to visit [Russia]. [President Clinton] said that [he] looks forward to strengthening ties between [USA] and [Russia].
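
A minimal sketch of greedy best-left-link clustering as it is commonly described: each mention, taken left to right, links to its highest-scoring preceding mention if that score exceeds a threshold, and otherwise starts a new cluster. The pairwise scores below are invented for illustration; in practice they would come from the learned pairwise similarity function.

```python
def best_left_link(n_mentions, score, threshold=0.0):
    """Return a cluster id for each mention index 0..n_mentions-1.
    score(i, j) is the pairwise similarity of an earlier mention i and mention j."""
    cluster_ids = []
    for j in range(n_mentions):
        best_i, best_s = None, threshold
        for i in range(j):                       # only look to the left
            s = score(i, j)
            if s > best_s:
                best_i, best_s = i, s
        if best_i is None:                       # no good antecedent: new cluster
            cluster_ids.append(max(cluster_ids, default=-1) + 1)
        else:                                    # join the antecedent's cluster
            cluster_ids.append(cluster_ids[best_i])
    return cluster_ids

mentions = ["Bill Clinton", "President of the USA", "Russian President",
            "Vladimir Putin", "Russia", "President Clinton", "he", "USA", "Russia"]

# Hypothetical pairwise scores keyed by (earlier index, later index).
toy_scores = {(0, 1): 0.8, (2, 3): 0.9, (0, 5): 0.9, (1, 5): 0.6,
              (5, 6): 0.7, (3, 6): 0.3, (4, 8): 1.0}

def toy_score(i, j):
    return toy_scores.get((i, j), -1.0)          # unlisted pairs score below threshold

ids = best_left_link(len(mentions), toy_score)
for mention, cid in zip(mentions, ids):
    print(cid, mention)
```

Because each mention commits to a single antecedent, one pass over the document suffices, at the cost of never revisiting an earlier linking decision.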
