Learning Where to Sample in Structured Prediction
Tianlin Shi, Jacob Steinhardt, Percy Liang (AISTATS 2015)

Introduction. Dissection: Gibbs Sampler
A small cost for each local move, but a large number of moves.
Example (Part-of-Speech Tagging)
x          I        think  now     is    the         right      time
pass 1: y  Pronoun  Verb   Adverb  Verb  Determiner  Noun       Noun
pass 2: y  Pronoun  Verb   Noun    Verb  Determiner  Adjective  Noun
Source of inefficiency: some parts are harder, while others are easier.
Example (A Better Strategy)
x          I        think  now     is    the         right      time
pass 1: y  Pronoun  Verb   Adverb  Verb  Determiner  Noun       Noun
pass 2: y                  Noun                      Adjective
A HeteroSampler ("Heterogeneous Sampler"): focus computation where it is needed.
Tianlin S., J. Steinhardt, P. Liang, HeteroSampler, AISTATS 2015

Introduction. Framework
Definition: action $A_j$ updates part $y_j$ based on $p(y_j \mid y_{-j}, x)$.
Example
x: I think now is the right time
         general  here
Input    $x_j$    $x_3$
Output   $y_j$    $y_3$
Action   $A_j$    $A_3$
$A_3$ samples $y_3$ from $p(y_3 \mid y_{-3}, x)$.

Introduction. Sampler Template
Our sampler: for a total of T rounds, do
1. Pick index j and the action $A_j$.
2. Sample $y_j$ using $A_j$.
(Figure: at round t, index j[t] is picked among ..., $y_{j-1}$, $y_j$, $y_{j+1}$, ...; applying $A_{j[t]}$ gives the configuration at round t + 1, in which $y_j$ is replaced by $y_j'$.)
Example
◮ Cyclic Gibbs sampler. Pick j round-robin.
◮ Random-scan Gibbs sampler. Pick j uniformly at random.
How to choose j?
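
The two-step template above can be sketched in a few lines of Python. This is only an illustrative sketch: the toy chain model (`unary` and `pair` log-potentials), the function names, and the seed handling are assumptions made for the example, not the authors' implementation. The two policies correspond to the cyclic and random-scan Gibbs samplers named on the slide.

```python
import math
import random

def conditional(y, j, unary, pair):
    """p(y_j = k | y_{-j}, x) for every label k in a toy chain model (assumed)."""
    scores = []
    for k in range(len(unary[j])):
        s = unary[j][k]
        if j > 0:
            s += pair[y[j - 1]][k]      # compatibility with left neighbor
        if j + 1 < len(y):
            s += pair[k][y[j + 1]]      # compatibility with right neighbor
        scores.append(s)
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]

def cyclic_policy(t, n, rng):
    return t % n                        # round-robin

def random_scan_policy(t, n, rng):
    return rng.randrange(n)             # uniformly at random

def run_sampler(y0, unary, pair, pick, T, seed=0):
    rng = random.Random(seed)
    y = list(y0)
    chosen = []
    for t in range(T):
        j = pick(t, len(y), rng)                                  # 1. pick index j
        probs = conditional(y, j, unary, pair)
        y[j] = rng.choices(range(len(probs)), weights=probs)[0]   # 2. sample y_j
        chosen.append(j)
    return y, chosen

# Toy usage: 3 positions, 2 labels, hypothetical potentials.
unary = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
pair = [[0.5, 0.0], [0.0, 0.5]]
y, chosen = run_sampler([0, 0, 0], unary, pair, cyclic_policy, T=6)
```

Swapping `cyclic_policy` for `random_scan_policy` changes only how j is picked, which is exactly the degree of freedom the question "How to choose j?" points at.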

Reinforcement Learning
Outline: Introduction, Reinforcement Learning, Meta-Features, Experiments

Reinforcement Learning
◮ State = entire history, with configurations y[i] and choices j[i]: $s_t = (y[0], \ldots, y[t], j[0], \ldots, j[t-1])$
◮ Action = pick a j[t] to sample: $a_t = A_{j[t]}$
◮ Transition = pre-trained model $p(y_j \mid y_{\neg j}, x)$
◮ Reward = improvement in log-probability: $R(s_t, a_t, s_{t+1}) = \log p(y[t+1] \mid x) - \log p(y[t] \mid x)$

Reinforcement Learning
◮ Policy $\pi$: how to pick. $\pi : \text{states} \to \text{actions}$
◮ The expected cumulative reward $R_T$: $E[R_T] = \sum_{t=0}^{T-1} E[R(s_t, a_t, s_{t+1})]$.
Remark: cumulative reward $= \log p(y[T] \mid x) - \log p(y[0] \mid x)$. (1)
Maximizing cumulative reward is equivalent to maximizing the probability of the final sample.
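
The remark follows from a telescoping sum. The short derivation below spells out the step, using only the reward definition $R(s_t, a_t, s_{t+1}) = \log p(y[t+1] \mid x) - \log p(y[t] \mid x)$ from the slides; the final observation assumes that the initial configuration $y[0]$ does not depend on the policy.

```latex
\begin{align*}
R_T &= \sum_{t=0}^{T-1} R(s_t, a_t, s_{t+1}) \\
    &= \sum_{t=0}^{T-1} \bigl[ \log p(y[t+1] \mid x) - \log p(y[t] \mid x) \bigr] \\
    &= \log p(y[T] \mid x) - \log p(y[0] \mid x).
\end{align*}
```

Every intermediate term cancels, so under that assumption the second term is a constant with respect to the policy, and maximizing $E[R_T]$ amounts to maximizing $E[\log p(y[T] \mid x)]$.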

Reinforcement Learning. Learning Algorithm
Inspired by standard RL (Q-learning [Watkins et al. 1992], SARSA [Rummery et al. 1994]),
Q(s, a) := how good it is to take action a in state s.
Example
x        I    think  now  is   the  right  time
y[0]     P    V      Adv  V    DT   N      N
Q(s, a)  0.0  0.0    2.0  0.0  0.0  2.3    0.0
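
To make the example concrete, the snippet below reads the Q values from the table and selects positions from them. The slide does not say how Q is turned into a choice of j, so both selection rules here (greedy argmax and a threshold at zero) are assumptions used only for illustration.

```python
words = ["I", "think", "now", "is", "the", "right", "time"]
q_values = [0.0, 0.0, 2.0, 0.0, 0.0, 2.3, 0.0]   # Q(s, a) row from the slide

def pick_greedy(q):
    """Index of the largest Q value (assumed selection rule)."""
    return max(range(len(q)), key=lambda j: q[j])

def pick_above_threshold(q, threshold=0.0):
    """All indices whose Q value exceeds a threshold (assumed selection rule)."""
    return [j for j, v in enumerate(q) if v > threshold]

best = pick_greedy(q_values)                      # 5, the position of "right"
worth_sampling = pick_above_threshold(q_values)   # [2, 5]: "now" and "right"
```

Under either rule, computation concentrates on "now" and "right", the two positions whose tags change between pass 1 and pass 2 in the tagging example.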

Reinforcement Learning. Applying RL
Challenge: efficiency. Q(s, a) should be cheap to compute, so it does not become the computational bottleneck.
(Figure: cost to compute Q(s, a) versus cost to sample.)

Reinforcement Learning. Applying RL: Efficiency
1. Exploit locality. Model Q(s, a) using local meta-features only.
2. Catch. Local meta-features can't predict cumulative reward.
3. Credit assignment. Isolate the contribution of a.
Fit Q(s, a) to R{taking action} − R{no action}.
(Figure: reward R with the action taken versus reward R with no action.)
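
One way to read "fit Q(s, a) to R{taking action} − R{no action}" is as a regression from local meta-features to the reward difference. The sketch below assumes a linear model trained by stochastic gradient descent on squared error; the model class, the optimizer, the feature layout, and the synthetic data are all assumptions for illustration, since the slides do not specify them.

```python
import random

def predict(w, phi):
    """Linear Q estimate from a meta-feature vector phi (assumed model class)."""
    return sum(wi * fi for wi, fi in zip(w, phi))

def fit_q(examples, num_features, epochs=200, lr=0.05, seed=0):
    """Fit weights so that predict(w, phi) tracks r_action - r_no_action.

    examples: list of (phi, r_action, r_no_action) triples.
    """
    rng = random.Random(seed)
    w = [0.0] * num_features
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for phi, r_action, r_no_action in data:
            target = r_action - r_no_action      # credit assigned to the action
            err = predict(w, phi) - target
            for i, f in enumerate(phi):
                w[i] -= lr * err * f             # squared-error gradient step
    return w

# Synthetic data (hypothetical): features are [bias, entropy, times sampled],
# and the reward difference grows with entropy and shrinks with resampling.
rng = random.Random(1)
examples = []
for _ in range(40):
    ent = rng.uniform(0.0, 1.5)
    sp = rng.randrange(4)
    gain = 2.0 * ent - 0.5 * sp
    examples.append(([1.0, ent, float(sp)], gain, 0.0))

weights = fit_q(examples, num_features=3)
```

On this noise-free toy data the fitted weights approach 0 for the bias, 2.0 for entropy, and −0.5 for the sample count; the negative weight on the count mirrors the later remark that the over-exploration feature usually has negative weight.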

Meta-Features
Outline: Introduction, Reinforcement Learning, Meta-Features, Experiments

Meta-Features. List of Meta-Features
Meta-Feature Templates
Reason about  Name         Description
Uncertainty   cond-ent     conditional entropy
Uncertainty   unigram-ent  entropy by unigram model
Staleness     sp           number of times sampled
Staleness     nb-vary      number of neighbors changed
Discord       nb-discord   discord with neighbors

Meta-Features. Uncertainty
Feature I. Entropy: the entropy of $q(y_j \mid y_{\neg j}, x)$.
Warning. Computing entropy is as expensive as sampling.
Principle (use cheap meta-features): complexity of meta-features ≪ complexity of sampling.
Very-lazy evaluation
(Figure: meta-features $\phi_{j-1}$, $\phi_j$, $\phi_{j+1}$ attached to $y_{j-1}$, $y_j$, $y_{j+1}$ at round t and, after $y_j$ is resampled to $y_j'$, at round t + 1.)
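
The warning can be seen directly in code: computing the entropy of a conditional needs the normalized probability of every label, which is the same work a Gibbs update performs before drawing a sample. The sketch below assumes the conditional is given as a list of unnormalized log-scores, one per label; the function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_from_scores(scores):
    """Entropy of the softmax of per-label log-scores.

    Normalizing touches every label once, which is the same cost as
    preparing the distribution for a Gibbs draw.
    """
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return entropy([w / z for w in weights])

uniform_entropy = entropy([0.25] * 4)    # equals log(4) up to rounding
```

This is why the principle on the slide asks for meta-features that are much cheaper than sampling, for example values reused from an earlier evaluation rather than recomputed at every round.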

Meta-Features. Uncertainty
Warning. Entropy alone can be dangerous.
Example
x     The         Duchess  was   entertaining  sp($y_3$)
y[0]  Determiner  Noun     Verb  Adjective     0
y[1]  Determiner  Noun     Verb  Verb          1
y[2]  Determiner  Noun     Verb  Adjective     2
y[3]  Determiner  Noun     Verb  Verb          3
y[4]  Determiner  Noun     Verb  Adjective     4
Feature II. Over-exploration: the number of times $y_j$ has been sampled thus far.
◮ Simplest measure of the progress in exploration.
◮ Usually has negative weight.

Meta-Features. Staleness
3. Change of Markov Blanket: the number of variables in the Markov blanket that have changed.
◮ Identify outdated variables.
◮ Reason about very-lazy evaluation.
(Figure: as neighbors 0, 1, and 2 change one after another, vary increases from 0 to 1, 2, and 3.)
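
The two counting features, sp and nb-vary, can be maintained incrementally so that each sampling step touches only the sampled position and its neighbors. The sketch below assumes a chain-structured model in which the Markov blanket of position j is {j − 1, j + 1}, and assumes that nb-vary counts change events since the position was last sampled; the class and method names are hypothetical.

```python
class MetaFeatureTracker:
    """Tracks sp (times sampled) and nb-vary (neighbor changes) per position."""

    def __init__(self, n):
        self.n = n
        self.sp = [0] * n
        self.nb_vary = [0] * n

    def neighbors(self, j):
        # Markov blanket in a chain model (assumption): left and right neighbor.
        return [i for i in (j - 1, j + 1) if 0 <= i < self.n]

    def record_sample(self, j, old_label, new_label):
        self.sp[j] += 1
        self.nb_vary[j] = 0              # y_j is now up to date with its blanket
        if new_label != old_label:
            for i in self.neighbors(j):
                self.nb_vary[i] += 1     # neighbors may now be outdated

# Usage on "The Duchess was entertaining": position 3 flips back and forth.
tracker = MetaFeatureTracker(4)
tracker.record_sample(3, "Adjective", "Verb")
tracker.record_sample(3, "Verb", "Adjective")
sp_after_flips = list(tracker.sp)              # [0, 0, 0, 2]
nb_vary_after_flips = list(tracker.nb_vary)    # [0, 0, 2, 0]
tracker.record_sample(2, "Verb", "Verb")       # resample "was", label unchanged
```

After the two flips, position 3 has a high sample count while position 2 has accumulated two neighbor changes, so a learned policy can weigh over-exploration of one position against staleness of the other.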

Experiments
Outline: Introduction, Reinforcement Learning, Meta-Features, Experiments

Experiments. Experiment Outline
Tasks
◮ Part-of-speech tagging and named-entity recognition.
◮ Handwriting recognition.
◮ Color inpainting.
◮ Scene decomposition.
Brief setting
1. "Steal" a graphical model with transitions.
2. Train the policy using RL on a training set.
3. Evaluate the policy on a test set.

Experiments. Speedup on NER
(Figure: F1 score against the average number of transitions (actions), for HeteroSampler and cyclic Gibbs.)

Experiments. 2-5X Speedup across tasks
(Figure, six panels: (a) NER (factor size 2), (b) NER (factor size 4), (c) POS (factor size 4), (d) OCR (factor size 4), (e) Color Inpainting, (f) Scene Decomposition. Each panel plots F1 score, accuracy, or log probability against the average number of transitions (actions), for HeteroSampler and cyclic Gibbs.)

