
Incremental Sampling Without Replacement for Sequence Models
Kensen Shi, David Bieber, Charles Sutton (Google Research)

Example Motivation
Program synthesis: generate a program that satisfies a given specification.
Program specification:
● I/O examples
● Symbolic constraints
● Natural language
● Pseudocode
[Diagram: the specification feeds a neural program generator, which emits a candidate program. If the candidate meets the spec, it is a satisfactory solution (yes); otherwise (no) another candidate is generated.]
Sample candidate programs from the neural generator, conditioned on the spec:
● Incrementally: stopping as soon as a satisfactory program is found
● Without replacement: duplicate candidate programs are not useful

Motivation, More Generally
Neural search in a discrete output space for a solution that satisfies constraints.
Sample candidate solutions from the neural generator, conditioned on the spec:
● Incrementally: stopping as soon as a satisfactory solution is found
● Without replacement: duplicate candidate solutions are not useful
Examples of search problems:
● Program synthesis
● Traveling Salesman Problem: find a tour with cost at most X
● Other combinatorial optimization problems
● SAT and SMT: find assignments to variables to satisfy all constraints

Benefits of Incremental Sampling
Incremental sampling enables more flexibility in stopping conditions. With incremental sampling, one can draw distinct samples until…
● … a satisfactory solution is found
● … a time limit has passed
● … enough variety is obtained
● … an estimate has converged
● … a target fraction of the search space is explored
● … any arbitrary stopping criterion is met
Contrast with beam search…

Existing methods of drawing samples
Beam search and variants
● Produces a batch of distinct outputs
● Not incremental
  ○ One does not know upfront how large a batch should be
  ○ If one batch is insufficient, the next batch may have duplicates
Naive Monte Carlo I.I.D. sampling
● This is sampling with replacement since samples are independent
Rejection sampling
● Like Monte Carlo I.I.D. sampling, but duplicate samples are discarded
● Potentially inefficient if the output distribution is very peaked, as one would expect from a well-trained neural model
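
To make the inefficiency of rejection sampling concrete, here is a minimal sketch (not from the slides) that counts how many i.i.d. draws it takes to collect k distinct outcomes from a peaked categorical distribution. The function name rejection_sample_distinct and the toy distribution are assumptions chosen for illustration.

```python
import random

def rejection_sample_distinct(probs, k, rng):
    """Draw i.i.d. indices from `probs`, discarding duplicates, until k distinct ones are seen.

    Returns the distinct indices in order of first appearance and the total
    number of draws, including the discarded duplicates.
    """
    if k > sum(1 for p in probs if p > 0):
        raise ValueError("k exceeds the number of outcomes with nonzero probability")
    seen = []
    num_draws = 0
    while len(seen) < k:
        index = rng.choices(range(len(probs)), weights=probs)[0]
        num_draws += 1
        if index not in seen:  # duplicate draws are rejected
            seen.append(index)
    return seen, num_draws

# A peaked toy distribution: one outcome holds 97% of the mass.
peaked = [0.97, 0.01, 0.01, 0.01]
distinct, num_draws = rejection_sample_distinct(peaked, k=3, rng=random.Random(0))
print(distinct, num_draws)
```

Once the dominant outcome has been seen, at most 0.03 of the mass is unseen, so each further draw produces something new with probability at most 0.03. Most of the work is spent on discarded duplicates.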

Our Contributions
● Approaching the sampling problem by manipulating the random choices made by the program that generates the samples
● UniqueRandomizer, a data structure for sampling distinct outputs of a randomized program
  ○ Incremental
  ○ Samples without replacement
  ○ Time and memory efficient
  ○ Can be extended to support batching
● Describing discrete randomized programs, the broad class of programs that UniqueRandomizer can sample from
● A statistical estimator that applies to samples drawn without replacement
  ○ See paper for details

What can we sample from?
Discrete randomized programs:
● All randomness comes from a choice function that chooses a random index given a discrete probability distribution
● Cannot draw random floats
  ○ But, Uniform(0, 1) < 0.3 can be written as choice_fn([0.3, 0.7]) == 0
● Can accept inputs, e.g., a trained model and problem instance
● Can use control flow including conditionals, loops, and recursion
● This broad class of programs includes sequence models!

    def draw_sample(model, h, choice_fn):
        tokens = []
        token = BOS
        for i in range(MAX_LEN):
            probs, h = model(token, h)
            token = choice_fn(probs)
            tokens.append(token)
            if token == EOS:
                break
        return tokens

A simple randomized program that draws a sample from a recurrent sequence model. It uses choice_fn to make random decisions.
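
The draw_sample program on this slide leaves BOS, EOS, MAX_LEN, and the model undefined. The sketch below fills them in with hypothetical stand-ins (toy_model, the token ids, and random_choice_fn are all assumptions, not part of the talk) so the program can be run with ordinary sampling with replacement. It also shows the slide's rewriting of Uniform(0, 1) < p in terms of choice_fn.

```python
import random

BOS, EOS = 0, 1   # hypothetical token ids
MAX_LEN = 10      # hypothetical cap on sequence length

def toy_model(token, h):
    """Stand-in for a trained recurrent model: returns (probs, new_state).

    The distribution is over three tokens [BOS, EOS, 2]; BOS is never emitted.
    """
    h = h + 1  # the "hidden state" here is just a step counter
    return [0.0, 0.3, 0.7], h

def random_choice_fn(probs, rng=random):
    """Ordinary sampling with replacement: pick an index with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs)[0]

def draw_sample(model, h, choice_fn):
    tokens = []
    token = BOS
    for _ in range(MAX_LEN):
        probs, h = model(token, h)
        token = choice_fn(probs)   # the only source of randomness
        tokens.append(token)
        if token == EOS:
            break
    return tokens

def bernoulli(p, choice_fn):
    """Express `Uniform(0, 1) < p` through choice_fn instead of a random float."""
    return choice_fn([p, 1.0 - p]) == 0

print(draw_sample(toy_model, 0, random_choice_fn))
print(bernoulli(0.3, random_choice_fn))
```

Because draw_sample only ever touches randomness through choice_fn, swapping in a different choice function changes the sampling scheme without changing the program.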

UniqueRandomizer: Overview
UniqueRandomizer is our solution to incremental sampling without replacement.
● Maintains a trie of unsampled probability masses corresponding to states in the randomized program
Provides 3 functions:
● Initialization: creates the data structure
● choice_fn: provides choices while accounting for previous samples
● process_termination: updates the trie to reflect the most recent sample

    def sample_wor(draw_sample, model, h, k):
        samples = []
        ur = UniqueRandomizer()
        for i in range(k):
            s = draw_sample(model, h, ur.choice_fn)
            samples.append(s)
            ur.process_termination()
        return samples

Using UniqueRandomizer to draw samples without replacement from the draw_sample function.

UniqueRandomizer: Algorithm Summary
Trie structure:
● Each node represents a state of the randomized program, between random choices.
● Each node stores the unsampled probability mass at that state.
● Each edge represents one possible result of one random choice.
While sampling, maintain a current node that walks down the trie as random choices are made.
In choice_fn:
● Use the probability distribution induced by the current node’s children to choose a random index to return.
● Update the current node to the corresponding child.
In process_termination, subtract the current node’s probability mass from all of its ancestors. Reset the current node back to the trie root.
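
The summary above can be sketched in a few lines of Python. This is a minimal illustration written from the slide's description, not the authors' implementation: the _Node class, the exhausted helper, the rng parameter, and the two_flips demo program are all assumptions. It also assumes the randomized program is deterministic given its choices, so that each trie node always sees the same distribution.

```python
import random

class _Node:
    def __init__(self, mass, parent=None):
        self.mass = mass        # unsampled probability mass at this state
        self.parent = parent
        self.children = None    # created lazily on the first visit

class UniqueRandomizer:
    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.root = _Node(1.0)
        self.current = self.root

    def exhausted(self):
        """True once (numerically) all probability mass has been sampled."""
        return self.root.mass <= 1e-12

    def choice_fn(self, probs):
        node = self.current
        if node.children is None:
            # First visit: split this node's mass among the possible results.
            node.children = [_Node(node.mass * p, node) for p in probs]
        # Choose proportionally to the children's remaining (unsampled) mass.
        # Clamping guards against tiny negative floating-point residue.
        weights = [max(child.mass, 0.0) for child in node.children]
        index = self.rng.choices(range(len(weights)), weights=weights)[0]
        self.current = node.children[index]
        return index

    def process_termination(self):
        # Remove the sampled leaf's mass from the leaf and every ancestor.
        sampled = self.current.mass
        node = self.current
        while node is not None:
            node.mass -= sampled
            node = node.parent
        self.current = self.root

def two_flips(choice_fn):
    """A tiny randomized program with four possible outputs."""
    return [choice_fn([0.5, 0.5]), choice_fn([0.5, 0.5])]

ur = UniqueRandomizer(random.Random(0))
samples = []
while not ur.exhausted():   # stop once the whole output space is explored
    samples.append(two_flips(ur.choice_fn))
    ur.process_termination()
print(samples)   # the four distinct sequences, in a random order
```

The sketch uses plain floating-point subtraction, so an exhausted subtree can retain a tiny nonzero residue with less convenient probabilities than 0.5. A more careful version would need to handle that case explicitly.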

UniqueRandomizer: Example

    def draw_sample(choice_fn):
        sequence = []
        length = choice_fn([0.5, 0.4, 0.1])
        for i in range(length):
            sequence.append(choice_fn([0.75, 0.25]))
        return sequence

A randomized program that produces binary sequences of length 0 to 2. Note: probability distributions are hardcoded for the sake of example, but in practice they could be computed by a model.
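
As a cross-check on the example program, the sketch below enumerates every output it can produce together with its probability, by replaying the program with forced choices. The enumerate_outcomes helper and the _NeedChoice exception are hypothetical names introduced for this illustration; they are not part of UniqueRandomizer.

```python
def draw_sample(choice_fn):
    sequence = []
    length = choice_fn([0.5, 0.4, 0.1])
    for i in range(length):
        sequence.append(choice_fn([0.75, 0.25]))
    return sequence

class _NeedChoice(Exception):
    """Raised when the program asks for a choice beyond the forced prefix."""
    def __init__(self, num_options):
        super().__init__(num_options)
        self.num_options = num_options

def enumerate_outcomes(program):
    """Return a list of (output, probability) pairs, one per path of choices."""
    outcomes = []
    pending = [[]]   # prefixes of forced choices still to explore
    while pending:
        forced = pending.pop()
        position = 0
        prob = 1.0

        def choice_fn(probs):
            nonlocal position, prob
            if position == len(forced):
                raise _NeedChoice(len(probs))   # branch on every option here
            index = forced[position]
            position += 1
            prob *= probs[index]
            return index

        try:
            result = program(choice_fn)
        except _NeedChoice as need:
            pending.extend(forced + [i] for i in range(need.num_options))
        else:
            outcomes.append((result, prob))
    return outcomes

outcomes = enumerate_outcomes(draw_sample)
for result, prob in sorted(outcomes, key=lambda pair: -pair[1]):
    print(result, round(prob, 5))
```

The program has seven distinct outputs whose probabilities sum to 1. The values for the length-1 sequences, 0.4 × 0.75 = 0.3 and 0.4 × 0.25 = 0.1, are the masses that appear on the second level of the trie in the walkthrough.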

UniqueRandomizer: Example (first sample)
The frames step through the draw_sample program from the previous slide. The code is identical on every frame, so only the program state and the trie are listed.
● Start: sequence = [], length = ?, i = ?. The trie contains only the root, with unsampled mass 1.0.
● The first choice_fn call gives the root three children with masses 0.5, 0.4, and 0.1.
● Choose length using the distribution [0.5, 0.4, 0.1]. Suppose we choose length = 1 (with probability 0.4).
● Now length = 1, and the current node is the child with mass 0.4.
● Loop iteration i = 0: the second choice_fn call gives the current node two children with masses 0.3 and 0.1.
● The choice returns 0, so sequence = [0] and the current node is the leaf with mass 0.3.

UniqueRandomizer: Example (termination)
● State: sequence = [0], length = 1, i = 1. The randomized program terminated.
● In process_termination, we subtract the leaf’s probability mass (0.3) from all of its ancestors, since the path has been sampled.
● Updated trie: root 0.7; its children 0.5, 0.1, and 0.1; under the middle child, leaves 0.0 and 0.1.

UniqueRandomizer: Example (second sample)
● Run draw_sample again to draw the next sample, without replacement. The trie is preserved from the previous run: root 0.7; its children 0.5, 0.1, and 0.1; under the middle child, leaves 0.0 and 0.1.
● The program state starts over: sequence = [], length = ?, i = ?.
● Choose length using the unnormalized distribution [0.5, 0.1, 0.1], which normalizes to approximately [0.71, 0.14, 0.14].
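
The numbers in this walkthrough can be reproduced with a few lines of arithmetic. The sketch below is only a check of the slide's figures; the variable names are chosen for illustration.

```python
# Trie masses along the sampled path before termination (from the slides).
root, middle, leaf = 1.0, 0.4, 0.3

# process_termination: remove the sampled leaf's mass from the leaf and its ancestors.
sampled = leaf
root, middle, leaf = root - sampled, middle - sampled, leaf - sampled

# Unsampled masses of the root's children for the next length choice.
children = [0.5, middle, 0.1]
total = sum(children)
normalized = [round(mass / total, 2) for mass in children]
print(normalized)  # [0.71, 0.14, 0.14]
```

Length 1 was originally chosen with probability 0.4, but after [0] has been sampled only 0.1 of its mass remains, so it is now chosen with probability about 0.14.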
