Beam Search
Shahrzad Kiani and Zihao Chen
CSC2547 Presentation
Greedy Search: always follow the single top-scored sequence (seq2seq)
Beam Search: maintain the top K scored sequences (this paper)
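The difference between the two search strategies can be illustrated with a minimal sketch. The toy table `TOY_LOG_PROBS` and the function name `beam_search` are hypothetical and only serve the illustration; the scores are log-probabilities that depend on the previous token alone, which is a simplifying assumption and not the model from the paper. Greedy search is the special case K = 1.

```python
import math

# Hypothetical toy model: next-token log-probabilities depend only on the previous token.
TOY_LOG_PROBS = {
    "<s>": {"a": math.log(0.6), "b": math.log(0.4)},
    "a": {"x": math.log(0.5), "y": math.log(0.5)},
    "b": {"x": math.log(0.9), "y": math.log(0.1)},
}


def beam_search(log_probs, start, steps, k):
    """Keep the k highest-scoring partial sequences at every step."""
    beam = [([start], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(steps):
        # Extend every sequence on the beam by every possible next token.
        candidates = [
            (tokens + [tok], score + lp)
            for tokens, score in beam
            for tok, lp in log_probs[tokens[-1]].items()
        ]
        # Sort by score, best first, and keep only the top k.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:k]
    return beam


greedy = beam_search(TOY_LOG_PROBS, "<s>", steps=2, k=1)[0]
best_of_two = beam_search(TOY_LOG_PROBS, "<s>", steps=2, k=2)[0]
```

In this toy setup greedy search commits to "a" at the first step because it has the higher immediate score, while a beam of size 2 keeps "b" alive and ends with a higher-scoring full sequence.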
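Seq2Seq Train and Test Issues
gold sequence: $y_{1:t} = [y_1, \dots, y_t]$
predicted sequence: $\hat{y}_{1:t} = [\hat{y}_1, \dots, \hat{y}_t]$
Word level
$p(y_t \mid y_{1:t-1}) = \mathrm{softmax}(\mathrm{decoder}(y_{1:t-1}))$
$p(y_t \mid \hat{y}_{1:t-1}) = \mathrm{softmax}(\mathrm{decoder}(\hat{y}_{1:t-1}))$
Sentence level
$p(\hat{y}_{1:T} = y_{1:T}) = \prod_{t=1}^{T} p(\hat{y}_t = y_t \mid y_{1:t-1})$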
1. Exposure Bias
2. Loss-Evaluation Mismatch
Training Loss
$\mathrm{NLL} = -\ln \prod_{t=1}^{T} p(\hat{y}_t = y_t \mid y_{1:t-1}) = -\sum_{t=1}^{T} \ln p(\hat{y}_t = y_t \mid y_{1:t-1})$
Testing Evaluation
sequence $y_{1:T} = \mathrm{decoder}(\cdot)$
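The word-level training loss, the negative log-likelihood (NLL) of the gold sequence, can be checked numerically with a short sketch. The function name `sequence_nll` and the per-step probabilities are hypothetical; each entry stands for the probability the model assigns to the gold word given the gold history.

```python
import math


def sequence_nll(step_probs):
    """Negative log-likelihood of a gold sequence.

    step_probs[t] is the probability the model assigns to the gold word
    at step t, conditioned on the gold history (teacher forcing).
    """
    return -sum(math.log(p) for p in step_probs)


# Hypothetical per-step probabilities for a three-word gold sequence.
probs = [0.5, 0.8, 0.25]
nll_from_sum = sequence_nll(probs)
# The same quantity written as the negative log of the product of the steps.
nll_from_product = -math.log(math.prod(probs))
```

Both forms agree up to floating-point error. The sum form is the one usually computed in practice, since the product of many small probabilities underflows. Note that this loss only ever sees gold histories and scores individual words, which is where the two issues named on this slide come from.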
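Constrained Beam Search Optimization (ConBSO)
$\mathrm{score}(\hat{y}_{1:t}) = -\infty$ when $\hat{y}_{1:t}$ violates a hard constraint
$\hat{y}_{1:t}^{(K)}$: the K-th ranked sequence on the beam
When $1 + \mathrm{score}(\hat{y}_{1:t}^{(K)}) - \mathrm{score}(y_t) > 0$:
$\mathcal{L}(\theta) = \sum_t \Delta(\hat{y}_{1:t}^{(K)}) \, [1 + \mathrm{score}(\hat{y}_{1:t}^{(K)}) - \mathrm{score}(y_t)]$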
Margin Violation
$\Delta(\hat{y}_{1:t}^{(K)})$
Goals:
the gold sequence should be top K
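The margin-violation loss can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `bso_margin_loss` is hypothetical, the scores are given as plain numbers instead of coming from a decoder, and the weight on each mistake is a 0/1 scalar. It does not include the beam bookkeeping or the backpropagation of the actual training procedure.

```python
def bso_margin_loss(gold_scores, kth_scores, deltas):
    """Sum of weighted margin violations over time steps.

    gold_scores[t]: score of the gold sequence at step t
    kth_scores[t]:  score of the K-th ranked beam sequence at step t
    deltas[t]:      0/1 scalar weighting the mistake at step t
    """
    loss = 0.0
    for gold, kth, delta in zip(gold_scores, kth_scores, deltas):
        margin = 1.0 + kth - gold
        if margin > 0:  # violation: gold does not beat the K-th sequence by a margin of 1
            loss += delta * margin
    return loss


# Hypothetical scores for a three-step sequence.
loss = bso_margin_loss(
    gold_scores=[2.0, 1.0, 0.5],
    kth_scores=[0.5, 0.5, 1.5],
    deltas=[1, 1, 1],
)
```

With these numbers the first step satisfies the margin and contributes nothing, while the second and third steps contribute 0.5 and 2.0. The loss is zero only when the gold sequence outscores the K-th beam entry by at least 1 at every step, which is what keeps it on the beam.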
$\mathrm{score}(\hat{y}_{1:t}^{(K)})$ and $\mathrm{score}(y_t)$: $O(T)$
$\hat{y}_{1:t}^{(K)}$: each incorrect sequence is an extension of the partial gold sequence
Only maintain two sequences, $O(2T) = O(T)$
Settings
$\Delta(\hat{y}_{1:t}^{(K)})$ scalar: 0/1
Features
fish cat eat -> cat eat fish
[Image credit: Sequence-to-Sequence Learning as Beam Search Optimization, Wiseman et al., EMNLP'16]
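For the word-ordering example (fish cat eat -> cat eat fish), a hard constraint can be enforced by assigning a score of negative infinity to any continuation that breaks it. The sketch below assumes the constraint is that the output must be a permutation of the input words; that reading, the function name `constrained_scores`, and the raw scores are all assumptions made for illustration.

```python
import math
from collections import Counter


def constrained_scores(prefix, source_words, raw_scores):
    """Set the score of any next word that breaks the constraint to -inf.

    Constraint assumed here: the output must be a permutation of
    source_words, so a word may only be emitted while a copy is unused.
    """
    # Counter subtraction keeps only words with a positive remaining count.
    remaining = Counter(source_words) - Counter(prefix)
    return {
        word: (score if remaining[word] > 0 else -math.inf)
        for word, score in raw_scores.items()
    }


# Hypothetical raw scores for the next word after the prefix ["cat"].
scores = constrained_scores(
    prefix=["cat"],
    source_words=["fish", "cat", "eat"],
    raw_scores={"cat": 3.0, "eat": 2.0, "fish": 1.0, "dog": 2.5},
)
best = max(scores, key=scores.get)
```

Here "cat" is already used and "dog" is not in the input, so both are scored negative infinity and can never enter the beam, even though their raw scores are high. The best allowed continuation is "eat".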
Alleviate the issues of seq2seq
O(T) BPTT with hard constraint
A variant of seq2seq with beam search training scheme
2016 Conference on Empirical Methods in Natural Language Processing. 2016.Sbs
Sampling Sequences Without Replacement." International Conference on Machine Learning. 2019.
inc/895968107420746/
information processing systems. 2014.
prediction." Proceedings of the 22nd international conference on Machine learning. ACM, 2005.
bounds
AAAI Conference on Artificial Intelligence. 2018.