CS11-747 Neural Networks for NLP
Advanced Search Algorithms
Graham Neubig https://phontron.com/class/nn4nlp2020/
(Some Slides by Daniel Clothiaux)
The Generation Problem
We have a model of P(Y|X), how do we use it to generate a sentence?
according to the probability distribution.
highest probability.
→ Search
the probability distribution → Sampling
not boring → Sampling? Search?
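The contrast between sampling and argmax-style search can be sketched on a toy model. This is a minimal illustration, not the lecture's code: `TOY_MODEL`, `generate`, `greedy_pick`, and `make_sampler` are hypothetical names, and the probabilities are made up. Note that picking the most probable token at each step only approximates the highest-probability sentence.

```python
import random

# Hypothetical toy model: P(next token | previous token), for illustration only.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "a": {"cat": 0.4, "dog": 0.6},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def generate(pick, max_len=10):
    """Generate tokens one at a time, using `pick` to choose each next token."""
    tokens, prev = [], "<s>"
    for _ in range(max_len):
        prev = pick(TOY_MODEL[prev])
        if prev == "</s>":
            break
        tokens.append(prev)
    return tokens


def greedy_pick(dist):
    """Argmax-style choice: take the single most probable next token."""
    return max(dist, key=dist.get)


def make_sampler(rng):
    """Sampling: draw the next token in proportion to its probability."""
    def pick(dist):
        words = list(dist)
        return rng.choices(words, weights=[dist[w] for w in words], k=1)[0]
    return pick


greedy_output = generate(greedy_pick)
sampled_outputs = [generate(make_sampler(random.Random(seed))) for seed in range(5)]
print("greedy :", greedy_output)
print("sampled:", sampled_outputs)
```

The greedy output is always the same, while the sampled outputs vary with the random seed.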
work needed.
→ impossible! we don't know the reference
→ simple, but not necessarily tied to accuracy
→ which output looks like it has the lowest error?
$\hat{Y} = \operatorname{argmin}_{\tilde{Y}} \mathrm{error}(Y, \tilde{Y})$
$\hat{Y} = \operatorname{argmax}_{\tilde{Y}} P(\tilde{Y} \mid X)$
$\hat{Y} = \operatorname{argmin}_{\tilde{Y}} \sum_{Y'} P(Y' \mid X)\, \mathrm{error}(Y', \tilde{Y})$
(example from Neubig (2015))
criterion does not optimize accuracy
Next word     P(next word)
Pittsburgh    0.4
New York      0.3
New Jersey    0.25
Other         0.05
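One possible reading of the table above, sketched in code: if the model generates token by token, the first token "New" carries 0.3 + 0.25 = 0.55 of the probability mass, so a greedy search commits to "New" and ends with "New York" (0.3), missing "Pittsburgh" (0.4). This reading is an assumption, as is treating "Other" as a single one-token output; `OUTPUTS`, `next_token_distribution`, and `greedy_decode` are hypothetical names.

```python
from collections import defaultdict

# Probabilities of complete outputs, taken from the table above.
# Treating "Other" as a single one-token output is an assumption.
OUTPUTS = {
    ("Pittsburgh",): 0.4,
    ("New", "York"): 0.3,
    ("New", "Jersey"): 0.25,
    ("Other",): 0.05,
}


def next_token_distribution(prefix):
    """P(next token | prefix), obtained by marginalizing over complete outputs."""
    mass = defaultdict(float)
    for tokens, p in OUTPUTS.items():
        if tokens[:len(prefix)] == prefix:
            nxt = tokens[len(prefix)] if len(tokens) > len(prefix) else "</s>"
            mass[nxt] += p
    total = sum(mass.values())
    return {tok: p / total for tok, p in mass.items()}


def greedy_decode():
    """Pick the most probable next token at every step until end of sentence."""
    prefix = ()
    while True:
        dist = next_token_distribution(prefix)
        best = max(dist, key=dist.get)
        if best == "</s>":
            return prefix
        prefix += (best,)


greedy = greedy_decode()
exact = max(OUTPUTS, key=OUTPUTS.get)
print("greedy:", greedy, OUTPUTS[greedy])
print("exact :", exact, OUTPUTS[exact])
```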
probability/score, maintain multiple paths
expanded set
every time step
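A minimal beam search sketch along these lines: expand every hypothesis on the beam, then keep only the best few from the expanded set at every time step. The toy model and all names (`TOY_MODEL`, `beam_search`) are hypothetical, and scores are log probabilities by assumption.

```python
import math

# Hypothetical toy model: P(next token | previous token), for illustration only.
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "x": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}


def beam_search(beam_size, max_len=10):
    """Keep the `beam_size` best hypotheses from the expanded set at every step."""
    beam = [(0.0, ["<s>"])]  # (log probability, tokens)
    finished = []
    for _ in range(max_len):
        # Expand every hypothesis on the beam by every possible next token.
        expanded = []
        for score, tokens in beam:
            for tok, p in TOY_MODEL[tokens[-1]].items():
                expanded.append((score + math.log(p), tokens + [tok]))
        # Keep only the top `beam_size` hypotheses from the expanded set.
        expanded.sort(key=lambda h: h[0], reverse=True)
        beam = []
        for hyp in expanded[:beam_size]:
            (finished if hyp[1][-1] == "</s>" else beam).append(hyp)
        if not beam:
            break
    return max(finished, key=lambda h: h[0])


greedy_result = beam_search(beam_size=1)  # beam size 1 is greedy search
beam_result = beam_search(beam_size=2)
print(greedy_result)
print(beam_result)
```

In this toy model a beam of size 1 commits to "a" and finishes with probability 0.3, while a beam of size 2 keeps "b" alive and finds the output with probability 0.36.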
where score is within a threshold α of best score $s_1$: $s_n + \alpha > s_1$
up until probability mass α
5-100 for histogram?)
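The pruning criteria above can be sketched as three small functions. The function names follow common usage (histogram, score threshold, and probability mass pruning) and are assumptions, since the slide text is truncated; the `(label, log probability)` hypothesis format and the example values are hypothetical.

```python
import math


def histogram_prune(hyps, k):
    """Keep exactly the k best hypotheses."""
    return sorted(hyps, key=lambda h: h[1], reverse=True)[:k]


def threshold_prune(hyps, alpha):
    """Keep every hypothesis n whose score satisfies s_n + alpha > s_1."""
    best = max(score for _, score in hyps)
    return [h for h in hyps if h[1] + alpha > best]


def probability_mass_prune(hyps, alpha):
    """Keep the best hypotheses until their total probability reaches alpha."""
    kept, mass = [], 0.0
    for hyp in sorted(hyps, key=lambda h: h[1], reverse=True):
        kept.append(hyp)
        mass += math.exp(hyp[1])
        if mass >= alpha:
            break
    return kept


# Hypothetical hypotheses as (label, log probability) pairs.
hyps = [("A", math.log(0.5)), ("B", math.log(0.3)),
        ("C", math.log(0.15)), ("D", math.log(0.05))]

print([h[0] for h in histogram_prune(hyps, k=2)])
print([h[0] for h in threshold_prune(hyps, alpha=1.0)])
print([h[0] for h in probability_mass_prune(hyps, alpha=0.9)])
```

Histogram pruning fixes the amount of work per step, whereas the other two let the number of surviving hypotheses adapt to how peaked the distribution is.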
easy some hard
thing first, then hard thing later
I saw the escarpment
watashi mita
dangai? zeppeki? kyushamen? iwa?
watashi ga mita dangai (the escarpment I saw)
watashi wa dangai wo mita (I saw the escarpment)
unprocessed words, and search for maximum of sum f(n) = g(n) + h(n)
a neural approximation
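A small best-first sketch of searching by f(n) = g(n) + h(n), where g(n) is the log probability so far and h(n) is an optimistic estimate of what remains. Everything here is an assumption for illustration: the toy model, the names `heuristic` and `a_star`, and the hand-written heuristic, which stands in for whatever estimate (for example a learned one) a real system would use.

```python
import heapq
import math

# Hypothetical toy model: P(next token | previous token), for illustration only.
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "x": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}


def heuristic(token):
    """h(n): log probability of the best single next step.

    This is an upper bound on the true remaining score, because every
    later step can only add a log probability <= 0.
    """
    if token == "</s>":
        return 0.0
    return max(math.log(p) for p in TOY_MODEL[token].values())


def a_star():
    """Expand the hypothesis with the highest f(n) = g(n) + h(n) first."""
    # heapq is a min-heap, so store -f(n) to pop the highest f(n) first.
    heap = [(-heuristic("<s>"), 0.0, ["<s>"])]
    expansions = 0
    while heap:
        _, g, tokens = heapq.heappop(heap)
        if tokens[-1] == "</s>":
            return g, tokens, expansions
        expansions += 1
        for tok, p in TOY_MODEL[tokens[-1]].items():
            g_next = g + math.log(p)
            f_next = g_next + heuristic(tok)
            heapq.heappush(heap, (-f_next, g_next, tokens + [tok]))
    raise ValueError("no complete hypothesis found")


best_score, best_tokens, num_expansions = a_star()
print(best_tokens, math.exp(best_score), num_expansions)
```

Because the heuristic never underestimates the remaining score, the first complete hypothesis popped from the heap is the highest-scoring one in this toy setting.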
(Koehn and Knowles 2017)
worse BLEU score!
risk (best!, covered previously)
(OK)
search errors, but the kind of errors you want (meh)
high-probability outputs
p=0.3   I don't know
p=0.2   My name is Graham
p=0.18  My name is Graham Neubig
p=0.17  My name is Neubig
...
Higher in aggregate: the three "My name is ..." outputs (0.2 + 0.18 + 0.17 = 0.55)
minimizes risk
$\hat{Y} = \operatorname{argmin}_{\tilde{Y}} \sum_{Y'} P(Y' \mid X)\, \mathrm{error}(Y', \tilde{Y})$
$E_{i,j} = \mathrm{error}(Y_i, Y_j)$
$p_i = P(Y_i \mid X)$
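A minimal sketch of minimum Bayes risk selection over a small candidate list, using the four example outputs and probabilities listed earlier. Two things are assumptions: the sum over all Y' is approximated by these four candidates only, and the error function (`jaccard_error`, one minus word-level Jaccard overlap) is a hypothetical stand-in for a real metric.

```python
def jaccard_error(a, b):
    """Hypothetical error function: 1 - word-level Jaccard overlap."""
    wa, wb = set(a.split()), set(b.split())
    return 1.0 - len(wa & wb) / len(wa | wb)


# Candidate outputs Y_i with p_i = P(Y_i | X), from the example in the text.
candidates = {
    "I don't know": 0.3,
    "My name is Graham": 0.2,
    "My name is Graham Neubig": 0.18,
    "My name is Neubig": 0.17,
}


def risk(y_tilde):
    """Expected error of y_tilde: the sum over i of p_i * error(Y_i, y_tilde)."""
    return sum(p * jaccard_error(y, y_tilde) for y, p in candidates.items())


map_output = max(candidates, key=candidates.get)  # highest probability
mbr_output = min(candidates, key=risk)            # lowest expected error

print("MAP:", map_output)
print("MBR:", mbr_output)
```

Under these assumptions the most probable single output is "I don't know", but the lowest-risk output comes from the "My name is ..." group, because those candidates agree with each other.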
(Li et al., 2016)
models, language model
(Kool et al., 2019)
do it without replacement?
elements = sampling from a categorical distribution without replacement
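Sampling without replacement from a categorical distribution can be sketched with the Gumbel-top-k trick described by Kool et al. (2019): add independent Gumbel noise to each log probability and keep the k largest. The function name `gumbel_top_k` and the example probabilities are hypothetical, and this covers only a flat categorical distribution, not the full sequence-level procedure.

```python
import math
import random


def gumbel_top_k(log_probs, k, rng):
    """Sample k distinct indices without replacement via Gumbel perturbation."""
    perturbed = []
    for i, lp in enumerate(log_probs):
        u = rng.random()
        while u == 0.0:  # avoid log(0)
            u = rng.random()
        gumbel = -math.log(-math.log(u))  # standard Gumbel noise
        perturbed.append((lp + gumbel, i))
    # The indices of the k largest perturbed scores form the sample.
    perturbed.sort(reverse=True)
    return [i for _, i in perturbed[:k]]


probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical categorical distribution
log_probs = [math.log(p) for p in probs]
rng = random.Random(0)
sample = gumbel_top_k(log_probs, k=2, rng=rng)
print(sample)
```

The first index returned is distributed according to the original probabilities, and later ones follow the distribution renormalized over what is left.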
be of variable length
results in gradually decreasing probability
sentences
(Eriguchi et al. 2016)
between sentences
sample from top K instead of enumerate
but sample from remaining hypotheses (Holtzman et al. 2020)
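A rough sketch of both ideas: restrict the next-token distribution to the top K entries, or to the smallest set of top entries reaching a probability mass (the nucleus sampling of Holtzman et al. 2020), then sample from what remains. The function names, the parameter name `top_p`, and the example distribution are assumptions for illustration.

```python
import random


def top_k_filter(dist, k):
    """Keep the k most probable tokens and renormalize."""
    kept = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def nucleus_filter(dist, top_p):
    """Keep the smallest set of top tokens whose mass reaches top_p, then renormalize."""
    kept, mass = [], 0.0
    for tok, p in sorted(dist.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in kept}


def sample(dist, rng):
    """Draw one token in proportion to its (renormalized) probability."""
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


# Hypothetical next-token distribution.
dist = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
rng = random.Random(0)

top_k_dist = top_k_filter(dist, k=2)
nucleus_dist = nucleus_filter(dist, top_p=0.9)
print(top_k_dist, sample(top_k_dist, rng))
print(nucleus_dist, sample(nucleus_dist, rng))
```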
questionable, we could do diverse beam search instead.
consider variance from this in reporting (in addition to variance in training and data selection)
better model you might get worse results, because the search algorithm can't find the outputs your model likes
(Wiseman et al., 2016)
(Goyal et al., 2017)
actions (say, the next word)
BLEU) for MT models
trained with TD
the potential next actions, Q
[Figure: actor-critic diagram; labels: reward, Actor, Critic]
similar to REINFORCE style algorithms