Emma Strubell
Algorithms for NLP
CS 11-711 · Fall 2020
Lecture 8: Viterbi, discriminative sequence labeling, NER
Announcements
Project 1 is due tomorrow! You may submit up to 3 days late (out of a budget of 5 total for the semester).
Hidden Markov models (HMMs)
[Figure: HMM state diagram with three states (VB = state 1, MD = state 2, NN = state 3), a transition probability a_ij between every pair of states including self-loops (a11, a12, a13, a21, a22, a23, a31, a32, a33), and one emission distribution per state (B1, B2, B3) over the vocabulary, e.g. B3 = P("aardvark" | NN), ..., P("will" | NN), ..., P("the" | NN), ..., P("back" | NN), ..., P("zebra" | NN).]
An HMM is specified by the following components:
$Q = q_1 q_2 \ldots q_N$: a set of N states.
$A = a_{11} \ldots a_{ij} \ldots a_{NN}$: a transition probability matrix A, each $a_{ij}$ representing the probability of moving from state i to state j, such that $\sum_{j=1}^{N} a_{ij} = 1 \;\; \forall i$.
$O = o_1 o_2 \ldots o_T$: a sequence of T observations, each one drawn from a vocabulary $V = v_1, v_2, \ldots, v_V$.
$B = b_i(o_t)$: a sequence of observation likelihoods, also called emission probabilities, each expressing the probability of an observation $o_t$ being generated from a state $q_i$.
$\pi = \pi_1, \pi_2, \ldots, \pi_N$: an initial probability distribution over states. $\pi_i$ is the probability that the Markov chain will start in state i. Some states j may have $\pi_j = 0$, meaning that they cannot be initial states. Also, $\sum_{i=1}^{N} \pi_i = 1$.
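A minimal sketch of how these components might be stored and sanity-checked, assuming plain Python lists. The two-state model, its state names, vocabulary, and every number below are hypothetical and chosen only so that the sum-to-one constraints hold; `sums_to_one` and `is_valid_hmm` are illustrative helper names, not part of the lecture.

```python
import math

# Hypothetical toy HMM: all names and numbers are illustrative only.
states = ["HOT", "COLD"]          # Q, with N = 2
vocab = ["1", "2", "3"]           # V, the observation vocabulary

# pi[i]: probability that the chain starts in state i
pi = [0.8, 0.2]

# A[i][j]: probability of moving from state i to state j
A = [[0.6, 0.4],
     [0.5, 0.5]]

# B[i][k]: probability that state i emits vocabulary item k
B = [[0.2, 0.4, 0.4],
     [0.5, 0.4, 0.1]]


def sums_to_one(dist):
    """True if the values are non-negative and sum to 1 (up to rounding)."""
    return all(p >= 0.0 for p in dist) and math.isclose(sum(dist), 1.0, abs_tol=1e-9)


def is_valid_hmm(pi, A, B):
    """Check shapes and the sum-to-one constraints on pi and each row of A and B."""
    n = len(pi)
    return (
        len(A) == n and len(B) == n
        and all(len(row) == n for row in A)
        and sums_to_one(pi)
        and all(sums_to_one(row) for row in A)
        and all(sums_to_one(row) for row in B)
    )


print(is_valid_hmm(pi, A, B))
```

Storing A and B row-wise by source state keeps each row a proper distribution, which is what the constraints on $a_{ij}$ and $\pi_i$ require.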
Three algorithms for HMMs: Forward (likelihood); Viterbi (decoding); Forward-backward, also known as Baum-Welch (learning).
Decoding: given a sequence of observations O = o1, o2, …, on, find the most probable sequence of states Q = q1, q2, …, qn. For POS tagging, the observations are the words $w_1^n$ and the states are the tags $t_1^n$:
$$\hat{t}_1^n = \operatorname*{argmax}_{t_1^n} P(t_1^n \mid w_1^n)$$
Applying Bayes' rule and dropping the denominator $P(w_1^n)$, which is the same for every tag sequence:
$$\hat{t}_1^n = \operatorname*{argmax}_{t_1^n} P(t_1^n \mid w_1^n) = \operatorname*{argmax}_{t_1^n} \frac{P(w_1^n \mid t_1^n)\, P(t_1^n)}{P(w_1^n)} = \operatorname*{argmax}_{t_1^n} P(w_1^n \mid t_1^n)\, P(t_1^n)$$
Simplifying assumptions:
$$P(w_1^n \mid t_1^n) \approx \prod_{i=1}^{n} P(w_i \mid t_i)$$
$$P(t_1^n) \approx \prod_{i=1}^{n} P(t_i \mid t_{i-1})$$
Combining the two simplifying assumptions:
$$\hat{t}_1^n = \operatorname*{argmax}_{t_1^n} P(t_1^n \mid w_1^n) \approx \operatorname*{argmax}_{t_1^n} \prod_{i=1}^{n} \underbrace{P(w_i \mid t_i)}_{\text{emission, } B}\; \underbrace{P(t_i \mid t_{i-1})}_{\text{transition, } A}$$
How many possible choices?
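One way to make the question concrete: with N tags and n words there are N^n candidate tag sequences, so scoring each one separately is exponential in sentence length. The sketch below does exactly that brute-force search on a hypothetical two-tag model. The tags, words, probabilities, and the helper names `score` and `brute_force_decode` are all made up for illustration.

```python
from itertools import product

# Hypothetical toy model: tags, words, and numbers are illustrative only.
tags = ["N", "V"]
start = {"N": 0.7, "V": 0.3}                      # P(t_1 | <s>)
trans = {"N": {"N": 0.4, "V": 0.6},               # trans[prev][cur] = P(cur | prev)
         "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"fish": 0.6, "swim": 0.4},          # emit[tag][word] = P(word | tag)
        "V": {"fish": 0.3, "swim": 0.7}}


def score(words, seq):
    """Product over i of P(w_i | t_i) * P(t_i | t_{i-1}), with t_0 = <s>."""
    p = 1.0
    prev = None
    for w, t in zip(words, seq):
        p *= (start[t] if prev is None else trans[prev][t]) * emit[t][w]
        prev = t
    return p


def brute_force_decode(words):
    """Enumerate all len(tags) ** len(words) sequences and keep the best one."""
    candidates = list(product(tags, repeat=len(words)))
    best = max(candidates, key=lambda seq: score(words, seq))
    return best, score(words, best), len(candidates)


best, prob, n_candidates = brute_force_decode(["fish", "swim"])
print(best, prob, n_candidates)
```

For a five-word sentence with seven candidate tags per word, the same enumeration would visit 7^5 = 16,807 sequences, which is why a dynamic-programming decoder is preferred.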
Slide credit: Noah Smith
[Figure: Viterbi lattice for "Janet will back the bill", with one column of candidate tags (NNP, MD, VB, JJ, NN, RB, DT) per word.]
Each cell of the lattice is computed as
$$v_t(j) = \max_{i=1}^{N} v_{t-1}(i)\, a_{ij}\, b_j(o_t)$$
$v_{t-1}(i)$: previous Viterbi path probability
$a_{ij}$: transition probability
$b_j(o_t)$: state observation likelihood
function VITERBI(observations of len T, state-graph of len N) returns best-path, path-prob
    create a path probability matrix viterbi[N,T]
    for each state s from 1 to N do                        ; initialization step
        viterbi[s,1] ← π_s * b_s(o_1)
        backpointer[s,1] ← 0
    for each time step t from 2 to T do                    ; recursion step
        for each state s from 1 to N do
            viterbi[s,t] ← max_{s'=1..N} viterbi[s',t−1] * a_{s',s} * b_s(o_t)
            backpointer[s,t] ← argmax_{s'=1..N} viterbi[s',t−1] * a_{s',s} * b_s(o_t)
    bestpathprob ← max_{s=1..N} viterbi[s,T]               ; termination step
    bestpathpointer ← argmax_{s=1..N} viterbi[s,T]         ; termination step
    bestpath ← the path starting at state bestpathpointer, that follows backpointer[] to states back in time
    return bestpath, bestpathprob
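The pseudocode can be sketched in Python roughly as follows. This is a minimal version under a few assumptions: indices are 0-based rather than 1-based, observations are given as integer vocabulary indices, the sequence is non-empty, and raw probabilities are kept to stay close to the pseudocode (for long sequences one would usually switch to log probabilities to avoid underflow). The toy two-state model in the usage lines is hypothetical.

```python
def viterbi(observations, pi, A, B):
    """Return (best_path, best_path_prob) for a non-empty list of observation indices.

    pi[s]    : initial probability of state s
    A[s2][s] : transition probability from state s2 to state s
    B[s][o]  : probability that state s emits observation index o
    """
    N, T = len(pi), len(observations)
    v = [[0.0] * T for _ in range(N)]            # path probability matrix viterbi[N,T]
    backpointer = [[0] * T for _ in range(N)]

    # Initialization step
    for s in range(N):
        v[s][0] = pi[s] * B[s][observations[0]]

    # Recursion step
    for t in range(1, T):
        for s in range(N):
            # b_s(o_t) does not depend on the previous state, so the argmax
            # only needs v[s2][t-1] * A[s2][s].
            best_prev = max(range(N), key=lambda s2: v[s2][t - 1] * A[s2][s])
            v[s][t] = v[best_prev][t - 1] * A[best_prev][s] * B[s][observations[t]]
            backpointer[s][t] = best_prev

    # Termination step
    last = max(range(N), key=lambda s: v[s][T - 1])
    best_prob = v[last][T - 1]

    # Follow backpointers back in time, then reverse into chronological order
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(backpointer[path[-1]][t])
    path.reverse()
    return path, best_prob


# Hypothetical toy model: states 0 = HOT, 1 = COLD; vocabulary indices 0, 1, 2.
pi = [0.8, 0.2]
A = [[0.6, 0.4], [0.5, 0.5]]
B = [[0.2, 0.4, 0.4], [0.5, 0.4, 0.1]]
path, prob = viterbi([2, 0, 2], pi, A, B)
print(path, prob)
```

The two nested loops over states inside the loop over time steps give O(N^2 T) time, compared with the N^T sequences a brute-force search would score.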
[Figure: Viterbi lattice for "Janet will back the bill", with one column of candidate tags (NNP, MD, VB, JJ, NN, RB, DT) per word.]
$$v_t(j) = \max_{i=1}^{N} v_{t-1}(i)\, a_{ij}\, b_j(o_t)$$

        NNP      MD       VB       JJ       NN       RB       DT
<s>     0.2767   0.0006   0.0031   0.0453   0.0449   0.0510   0.2026
NNP     0.3777   0.0110   0.0009   0.0084   0.0584   0.0090   0.0025
MD      0.0008   0.0002   0.7968   0.0005   0.0008   0.1698   0.0041
VB      0.0322   0.0005   0.0050   0.0837   0.0615   0.0514   0.2231
JJ      0.0366   0.0004   0.0001   0.0733   0.4509   0.0036   0.0036
NN      0.0096   0.0176   0.0014   0.0086   0.1216   0.0177   0.0068
RB      0.0068   0.0102   0.1011   0.1012   0.0120   0.0728   0.0479
DT      0.1147   0.0021   0.0002   0.2157   0.4744   0.0102   0.0017
Figure 8.7 The A transition probabilities P(t_i | t_{i−1}) computed from the WSJ corpus without …

        Janet      will       back       the        bill
NNP     0.000032   0          0          0.000048   0
MD      0          0.308431   0          0          0
VB      0          0.000028   0.000672   0          0.000028
JJ      0          0          0.000340   0          0
NN      0          0.000200   0.000223   0          0.002337
RB      0          0          0.010446   0          0
DT      0          0          0          0.506099   0
Figure 8.8 Observation likelihoods B computed from the WSJ corpus without smoothing, …
14
[Figure: the first columns of the Viterbi trellis for "Janet will back the bill", with time t on the horizontal axis, states q1 = NNP, q2 = MD, q3 = VB, q4 = JJ, q5 = NN, q6 = RB, q7 = DT on the vertical axis, and backtrace pointers between columns]

π (transitions out of start):
P(NNP|start) = .28
P(MD|start) = .0006
P(VB|start) = .0031
P(JJ|start) = .045

Column 1 (Janet):
v1(1) = .28 * .000032 = .000009
v1(2) = .0006 * 0 = 0
v1(3) = .0031 * 0 = 0
v1(4) = .045 * 0 = 0
v1(5), v1(6), v1(7)

Column 2 (will):
v2(2) = max * .308 = 2.772e-8
v2(3) = max * .000028 = 2.5e-13
v2(5) = max * .0002 = .0000000001
v2(1), v2(4), v2(6), v2(7)
Arcs into MD: v1(1) * P(MD|NNP), v1(2) * P(MD|MD) = 0, v1(3) * P(MD|VB) = 0, v1(4) * P(MD|JJ) = 0

Column 3 (back):
v3(3) = max * .00067
v3(4) = max * .00034
v3(5) = max * .000223
v3(6) = max * .0104
Arcs shown: * P(RB|NN), * P(NN|NN)
function VITERBI(observations of len T, state-graph of len N) returns best-path, path-prob

  create a path probability matrix viterbi[N,T]
  for each state s from 1 to N do                      ; initialization step
      viterbi[s,1] ← π_s * b_s(o_1)
      backpointer[s,1] ← 0
  for each time step t from 2 to T do                  ; recursion step
      for each state s from 1 to N do
          viterbi[s,t] ← max_{s'=1..N} viterbi[s',t-1] * a_{s',s} * b_s(o_t)
          backpointer[s,t] ← argmax_{s'=1..N} viterbi[s',t-1] * a_{s',s} * b_s(o_t)
  bestpathprob ← max_{s=1..N} viterbi[s,T]             ; termination step
  bestpathpointer ← argmax_{s=1..N} viterbi[s,T]       ; termination step
  bestpath ← the path starting at state bestpathpointer, that follows backpointer[] to states back in time
  return bestpath, bestpathprob
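The pseudocode above can be sketched in Python as follows. This is only an illustrative sketch: the names TAGS, PI, A, B and viterbi are choices made here, the numbers are the unsmoothed values from Figures 8.7 and 8.8, and words missing from B are assumed to have emission probability 0.

```python
# Sketch of the Viterbi pseudocode; tables copied from Figures 8.7 and 8.8.
TAGS = ["NNP", "MD", "VB", "JJ", "NN", "RB", "DT"]

# Initial distribution pi: the <s> row of the transition table.
PI = [0.2767, 0.0006, 0.0031, 0.0453, 0.0449, 0.0510, 0.2026]

# A[i][j] = P(TAGS[j] | TAGS[i]); each row is the previous tag.
A = [
    [0.3777, 0.0110, 0.0009, 0.0084, 0.0584, 0.0090, 0.0025],  # NNP
    [0.0008, 0.0002, 0.7968, 0.0005, 0.0008, 0.1698, 0.0041],  # MD
    [0.0322, 0.0005, 0.0050, 0.0837, 0.0615, 0.0514, 0.2231],  # VB
    [0.0366, 0.0004, 0.0001, 0.0733, 0.4509, 0.0036, 0.0036],  # JJ
    [0.0096, 0.0176, 0.0014, 0.0086, 0.1216, 0.0177, 0.0068],  # NN
    [0.0068, 0.0102, 0.1011, 0.1012, 0.0120, 0.0728, 0.0479],  # RB
    [0.1147, 0.0021, 0.0002, 0.2157, 0.4744, 0.0102, 0.0017],  # DT
]

# B[word][tag] = P(word | tag); entries not listed are 0.
B = {
    "Janet": {"NNP": 0.000032},
    "will": {"MD": 0.308431, "VB": 0.000028, "NN": 0.000200},
    "back": {"VB": 0.000672, "JJ": 0.000340, "NN": 0.000223, "RB": 0.010446},
    "the": {"NNP": 0.000048, "DT": 0.506099},
    "bill": {"VB": 0.000028, "NN": 0.002337},
}


def emit(s, word):
    """b_s(o_t): emission probability of `word` from state index s."""
    return B.get(word, {}).get(TAGS[s], 0.0)


def viterbi(words):
    """Return (best tag sequence, its joint probability) for `words`."""
    n = len(TAGS)
    # Initialization step.
    v = [[PI[s] * emit(s, words[0]) for s in range(n)]]
    back = [[0] * n]
    # Recursion step: for each state keep the best predecessor.
    for t in range(1, len(words)):
        col, ptr = [], []
        for s in range(n):
            best_prev = max(range(n), key=lambda p: v[t - 1][p] * A[p][s])
            col.append(v[t - 1][best_prev] * A[best_prev][s] * emit(s, words[t]))
            ptr.append(best_prev)
        v.append(col)
        back.append(ptr)
    # Termination step, then follow the backpointers back in time.
    last = max(range(n), key=lambda s: v[-1][s])
    path = [last]
    for t in range(len(words) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return [TAGS[s] for s in path], v[-1][last]


best_path, best_prob = viterbi("Janet will back the bill".split())
print(best_path, best_prob)
```

With these tables the best path comes out as NNP MD VB DT NN. A practical tagger would work with log probabilities to avoid underflow on longer sentences; raw products are kept here to mirror the pseudocode.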
15
Computational complexity in N and T?
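One way to answer this question is to count the work done by the Viterbi pseudocode. This is a sketch of the standard argument rather than something stated on the slides: each of the T time steps fills N cells, and each cell takes a max over N predecessor states.

```latex
\underbrace{N}_{\text{initialization}}
+ \underbrace{(T-1)\cdot N \cdot N}_{\text{recursion}}
+ \underbrace{N}_{\text{termination}}
+ \underbrace{T}_{\text{backtrace}}
= O(N^2 T) \ \text{time},
\qquad
O(NT) \ \text{space for the viterbi and backpointer tables.}
```

For comparison, scoring every possible tag sequence separately would mean enumerating N^T paths.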
16
[Figure: lattice of candidate tags (NNP, MD, VB, JJ, NN, RB, DT) for each word of "Janet will back the bill"]
17
Forward
Viterbi
Forward-backward; Baum-Welch
18
[Figure: lattice of candidate tags (NNP, MD, VB, JJ, NN, RB, DT) for each word of "Janet will back the bill"]

$$\alpha_t(j) = \sum_{i=1}^{N} \alpha_{t-1}(i)\, a_{ij}\, b_j(o_t)$$
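The forward recursion above can be sketched in a few lines of Python. The two-state HMM below is hypothetical and its numbers are made up for illustration; the brute-force function is included only to check that the recursion computes the same observation likelihood as summing over every state path.

```python
from itertools import product

# Hypothetical 2-state HMM; all numbers below are made up for illustration.
PI = [0.6, 0.4]                 # initial distribution
A = [[0.7, 0.3], [0.4, 0.6]]    # A[i][j] = P(state j | state i)
B = [[0.5, 0.5], [0.1, 0.9]]    # B[j][o] = P(observation o | state j)


def forward(obs):
    """P(obs) via alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(o_t)."""
    n = len(PI)
    alpha = [PI[j] * B[j][obs[0]] for j in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def brute_force(obs):
    """Same quantity, summing the joint probability over every state path."""
    total = 0.0
    for path in product(range(len(PI)), repeat=len(obs)):
        p = PI[path[0]] * B[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        total += p
    return total


obs = [0, 1, 1, 0]
print(forward(obs), brute_force(obs))
```

The only difference from Viterbi is that the max over predecessor states is replaced by a sum, so no backpointers are needed.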
19
like logistic regression to predict rather than count.
20
assigned to the previous word as a feature (along with any other useful features)!

$$P(t_i \mid w_i, t_{i-1}) = \frac{\exp\left(\theta \cdot f(t_i, w_{i-j}^{i+j}, t_{i-k}^{i-1})\right)}{\sum_{t' \in Y} \exp\left(\theta \cdot f(t', w_{i-j}^{i+j}, t_{i-k}^{i-1})\right)}$$

$$\hat{T} = \operatorname{argmax}_{T} \prod_{i} P(t_i \mid w_{i-j}^{i+j}, t_{i-k}^{i-1})$$

[Figure: tagging "<s> Janet will back the bill" from left to right. Janet = NNP and will = MD are already tagged; the tag of "back" (VB?) is predicted from the words w_{i-2}, w_{i-1}, w_i, w_{i+1} and the previous tags t_{i-2}, t_{i-1}]
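The local distribution over the current tag can be sketched as a softmax over feature scores. Everything below is an assumption made for illustration: the small tag set, the feature names produced by features, the "</s>" padding token, and the weights in theta are invented here, not taken from the slides or from a trained model.

```python
import math

TAGS = ["NNP", "MD", "VB", "NN"]  # small illustrative tag set


def features(tag, words, i, prev_tag):
    """Hypothetical binary features f(t_i, w_{i-1..i+1}, t_{i-1}), by name."""
    prev_word = words[i - 1] if i > 0 else "<s>"
    next_word = words[i + 1] if i + 1 < len(words) else "</s>"
    return [
        f"w={words[i]}&t={tag}",
        f"w-1={prev_word}&t={tag}",
        f"w+1={next_word}&t={tag}",
        f"t-1={prev_tag}&t={tag}",
    ]


def tag_distribution(theta, words, i, prev_tag):
    """P(t_i | w, t_{i-1}): softmax of theta . f over the candidate tags."""
    scores = {t: sum(theta.get(f, 0.0) for f in features(t, words, i, prev_tag))
              for t in TAGS}
    z = sum(math.exp(s) for s in scores.values())  # local normalizer
    return {t: math.exp(s) / z for t, s in scores.items()}


# Made-up weights; a trained model would learn these from tagged data.
theta = {"w=back&t=VB": 1.0, "t-1=MD&t=VB": 2.0, "w=back&t=NN": 1.5}

words = "Janet will back the bill".split()
dist = tag_distribution(theta, words, 2, prev_tag="MD")
print(max(dist, key=dist.get), dist)
```

Note that the normalizer z sums only over the candidate tags at position i, given the previous tag; this is the local normalization that the label bias discussion refers to.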
Lexical: f_{i±{0,1,2,3}}, (m_{i−2,i−1}), (m_{i−1,i}), (m_{i−1,i+1}), (m_{i,i+1}), (m_{i+1,i+2}), (m_{i−2,i−1,i}), (m_{i−1,i,i+1}), (m_{i,i+1,i+2}), (m_{i−2,i−1,i+1}), (m_{i−1,i+1,i+2})
POS: p_{i−{3,2,1}}, a_{i+{0,1,2,3}}, (p_{i−2,i−1}), (a_{i+1,i+2}), (p_{i−1}, a_{i+1}), (p_{i−2}, p_{i−1}, a_i), (p_{i−2}, p_{i−1}, a_{i+1}), (p_{i−1}, a_i, a_{i+1}), (p_{i−1}, a_{i+1}, a_{i+2})
Affix: c_{:1}, c_{:2}, c_{:3}, c_{n:}, c_{n−1:}, c_{n−2:}, c_{n−3:}
Binary: initial uppercase, all uppercase/lowercase, contains 1/2+ capital(s) not at the beginning, contains a (period/number/hyphen)
Table from: Choi and Palmer. Fast and Robust Part-of-Speech Tagging using Dynamic Model Selection. ACL 2012.
21
w_i contains a particular prefix (from all prefixes of length ≤ 4)
w_i contains a particular suffix (from all suffixes of length ≤ 4)
w_i contains a number
w_i contains an upper-case letter
w_i contains a hyphen
w_i is all upper case
w_i’s word shape
w_i’s short word shape
w_i is upper case and has a digit and a dash (like CFC-12)
w_i is upper case and followed within 3 words by Co., Inc., etc.
22
prefix(w_i) = L          suffix(w_i) = tane
prefix(w_i) = L’         suffix(w_i) = ane
prefix(w_i) = L’O        suffix(w_i) = ne
prefix(w_i) = L’Oc       suffix(w_i) = e
word-shape(w_i) = X’Xxxxxxxx
short-word-shape(w_i) = X’Xx
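The feature values listed above can be reproduced with a short Python sketch. The function names word_shape, short_word_shape and affixes are chosen here for illustration, the mapping of digits to "d" is an assumption about the word-shape convention, and the example uses an ASCII apostrophe in place of the typographic one.

```python
import re


def word_shape(word):
    """Map uppercase to X, lowercase to x, digits to d; keep other characters."""
    out = []
    for ch in word:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        elif ch.isdigit():
            out.append("d")
        else:
            out.append(ch)
    return "".join(out)


def short_word_shape(word):
    """Word shape with each run of identical symbols collapsed to one."""
    return re.sub(r"(.)\1+", r"\1", word_shape(word))


def affixes(word, max_len=4):
    """Prefixes of length 1..max_len and suffixes of length max_len..1."""
    k = min(max_len, len(word))
    prefixes = [word[:n] for n in range(1, k + 1)]
    suffixes = [word[-n:] for n in range(k, 0, -1)]
    return prefixes, suffixes


word = "L'Occitane"  # ASCII apostrophe used here for simplicity
print(affixes(word), word_shape(word), short_word_shape(word))
```

Each returned string would typically be turned into one binary feature, for example "prefix=L'O", paired with the candidate tag.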
23
■ Greediness: Transitions leaving a given state compete only against each other, rather than against all other transitions in the model.
■ Leads to locally optimal decisions that are not globally optimal.
■ All locally normalized models (vs. globally normalized) suffer from this problem.

[Figure: tagging example over the words "the", "man", "the", "boat"; the tags DT, DT and NN are fixed, two positions are still undetermined (??), with candidate tags NN?/ADJ? and NN?/VB?]

Are HMMs subject to the label bias problem?
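A tiny numerical illustration of why local normalization causes this, under made-up scores: when the transitions leaving a state are normalized only against each other, a state with a single possible successor passes on all of its probability mass no matter how badly the observation fits. The function name local_probs and all scores are hypothetical.

```python
import math


def local_probs(scores):
    """Locally normalize the scores of the transitions leaving one state."""
    z = sum(math.exp(s) for s in scores.values())
    return {nxt: math.exp(s) / z for nxt, s in scores.items()}


# Hypothetical state with a single outgoing transition: however low the
# score the observation gives it, local normalization turns it into 1.0.
good_fit = local_probs({"NN": 5.0})
bad_fit = local_probs({"NN": -5.0})
print(good_fit, bad_fit)

# A state with two successors still has to split its mass between them.
split = local_probs({"NN": 0.0, "VB": 1.0})
print(split)
```

A globally normalized model scores whole sequences before normalizing, so the low score in the second case would still count against that path.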
24
is proportional to a product of scores across pairs (or cliques) of variables (suitable for structured prediction, not just sequence labeling).
, usually we mean a linear-chain CRF where pairs of variables are labels for adjacent tokens.
labeling in NLP: feature functions, normalization over possible segments.
25
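[Figure: diagram relating the models. Naive Bayes → (SEQUENCE) → HMMs → (General GRAPHS) → Generative directed models; Logistic Regression → (SEQUENCE) → Linear-chain CRFs → (General GRAPHS) → General CRFs. Each model in the first chain is linked to its counterpart in the second chain by CONDITIONAL.]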
Image from: Sutton and McCallum. An Introduction to Conditional Random Fields for Relational Learning. In Introduction to Statistical Relational Learning. 2012.
P(y | w) = exp(Ψ(w, y)) P
y0∈Y(w)
exp(Ψ(w, y0))
<latexit sha1_base64="Hl4SzGDfq/v1/mRkJxB+iyWEgs4=">AH9nicnVTPb9s2FayZG69X0137IVbEFha3cDqCmyXAkW3wy7tPMBpOoSOQFGUTYeUBIpybBD8V3obdt3fstv+kx1HUrJrOU4CTICl5/e+7+P3HgnGBaOlHAz+2dv/5ODw086Dh93Pv/iy68eHT1+V+aVwOQM5ywX72NUEkYzciapZOR9IQjiMSPn8dVPtn4+J6KkeTaSy4KMOZpkNKUYSZOKjg6uidwiqSOgovM/ASQCQmHC0iJW1Cg6HvAgA5TcC1DYPuyU0YTAXCaug7hMO6QtDQA72qBXoX/XZi96SpgY2i4ReFyBcAmncSKfoy1JeZBdIGR3xbrRBrUQVfRbq4H/M4qP29kBudwZ2Lj2PpD8L7KpOxhHfzo2eKfs0QCY103GkZtrPI2m9IlZM0ZpVnzFaip3UG2bI73R4qj2SzesXVtrM32p6NOZ7tdOr+xf6zdo5rvG0v6FyvrjgMki8KHckokAhAnuQSpfTvkdbKNgMZ5VSWposepOZkIDnFiKnftd4p2+vfI2r9jmq359sWhyX1z/tgtL30qAfM2mBzcdBm9KzwR357I9RbDQtTur9n0IjAmKtrU7XfpV7Jrf1wp/rmagtoB6BbmZgM6leKywjbl+KG/k+4EF36NeibgI1atcYthxsT8Sle3p7LA0t0LcI9axS9Oh4cDpwD7gZhE1w7DXPMDraFzDJcVJjFDZXkRDgo5VkhIihnRXViVpED4Ck3IhQkzxEk5Vu5a1ODEZBKQ5sL8MglcdpOhEC/LJY8N0jZSbtdsclftopLpj2NFs6KSJMP1QmnFgMyBvWNBQgXBki1NgLCgxivAU2TGLM1NbDZ6Y5kpYXMi241gPlZl6lZvWYq5bpMXdaNdKEhGrnHOcqS7xRMEadsmZAUVUza/UtX8a59ZM5LcpmdGtJRiTMBZ3QDFGUgntq502n6mE7t2FPxOzQYK8Ma5/LYhAMhfGSX25aLNhE/iNvdf0XUiarZEmbLelnAHTjJ1LXpBM6fr4sbwkMJ6IvCpahm/wnVEjgFKzDTWetGk1wpzScPtM3gzePT8Nvz9/tuL41evm/P6wHvifev5Xuj94L3yfvG3pmHD/4+Pdw73C/s+h86PzR+bOG7u81nK+91tP56z9G0Lzs</latexit>26
Ψ(w, y) =
M+1
X
m=1
θ · f(w, ym, ym−1, m)
<latexit sha1_base64="6QgzZybjFg9uG/YPephMOPdGJY8=">AHcnicnVTdbts2FZTr+68nzbdXvDLghirV5gdQXWmwJFt4vdNPMAp+kQOgJFUTYdUhJIyolBcO+5J9gL7AFGUrJrOW4CTICp43PO953vHFJMSkalGg7/vrd3v/PFg+7DL3tf3Nt48e7z/5ItKYHKC1aIjwmShNGcnCqGPlYCoJ4wshZcvmLi58tiJC0yMdqWZIJR9OcZhQjZV3x/v1/eodwhpRWJo4ucvAGQCSmHF3HWjmHAaO+NwDkNAVXzgx7hzfTYCYQ1qO+z/C5PhA28NCsYqHZBf8sHfYxMBG0OLUhTXANo1jTV9E5mL3CXSJot64O3ZNmtFqumPkQn/xyw+cW8P5PKwM7Si1j156Gr6mk8GRh+Wy4T0NkXOTxHpu+kWsnFbEyhlao2TFV6gmcgvUtTk2Gy2Oa710Q9qVkzY3F5q+mJtBrfTS/XV6w2a+61w6WPfiaP1xgOS67EM1IwoBiNCgcxBndQG+2agYxyqTt4ghSezKQmHE9J/G7KQ9GtxB6vSOa7Vn2xJHkvbPBmC8Xp8BGxtsFkctBFHjvgTvr0R+sTA0obu7hn0PAdMuL6yQfdemhXbWg73pO9fRMYl1BMwzQicJzNrhmXM3aK5ZR8AHsaPD4bHQ/+Am0bUGAdB84zi/T0B0wJXnOQKMyTleTQs1UQjoShmxPRgJUmJ8CWaknNr5ogTOdH+SjLg0HpSkBXC/nIFvHcToRGXcskTm+lmK7djzrkrdl6p7PVE07ysFMlxXSirGFAFcPcbSKkgWLGlNRAW1GoFeIbsTit7C9qd2igzI2xBVLsRzCdaZr56S1LCTRt8XTfag4Lk5AoXnKM8/UHDHKlinJUMWUO03Zyt41r0G6oKVsRremZETBQtApzRFjJFPQLW23fc0U9GsP/krsBgny3qr+vSQCqUJYJfWHbeyGTeFzd6eY2zJpvs60Zrst7QXYZtxcipLk2tRfBCskgclUFXZEnwD74VaApTZbajzSRtWZ9hTGm2fyZvGh5fH0U/HL/94dfD2XNeHwbPgu+DfhAFPwdvg9+CUXAa4M5JR3VM568H/3afdp93m8O9d6/BfBe0nu7gPxLUizA=</latexit>P(y | w) = exp(Ψ(w, y)) P
y0∈Y(w)
exp(Ψ(w, y0))
<latexit sha1_base64="Hl4SzGDfq/v1/mRkJxB+iyWEgs4=">AH9nicnVTPb9s2FayZG69X0137IVbEFha3cDqCmyXAkW3wy7tPMBpOoSOQFGUTYeUBIpybBD8V3obdt3fstv+kx1HUrJrOU4CTICl5/e+7+P3HgnGBaOlHAz+2dv/5ODw086Dh93Pv/iy68eHT1+V+aVwOQM5ywX72NUEkYzciapZOR9IQjiMSPn8dVPtn4+J6KkeTaSy4KMOZpkNKUYSZOKjg6uidwiqSOgovM/ASQCQmHC0iJW1Cg6HvAgA5TcC1DYPuyU0YTAXCaug7hMO6QtDQA72qBXoX/XZi96SpgY2i4ReFyBcAmncSKfoy1JeZBdIGR3xbrRBrUQVfRbq4H/M4qP29kBudwZ2Lj2PpD8L7KpOxhHfzo2eKfs0QCY103GkZtrPI2m9IlZM0ZpVnzFaip3UG2bI73R4qj2SzesXVtrM32p6NOZ7tdOr+xf6zdo5rvG0v6FyvrjgMki8KHckokAhAnuQSpfTvkdbKNgMZ5VSWposepOZkIDnFiKnftd4p2+vfI2r9jmq359sWhyX1z/tgtL30qAfM2mBzcdBm9KzwR357I9RbDQtTur9n0IjAmKtrU7XfpV7Jrf1wp/rmagtoB6BbmZgM6leKywjbl+KG/k+4EF36NeibgI1atcYthxsT8Sle3p7LA0t0LcI9axS9Oh4cDpwD7gZhE1w7DXPMDraFzDJcVJjFDZXkRDgo5VkhIihnRXViVpED4Ck3IhQkzxEk5Vu5a1ODEZBKQ5sL8MglcdpOhEC/LJY8N0jZSbtdsclftopLpj2NFs6KSJMP1QmnFgMyBvWNBQgXBki1NgLCgxivAU2TGLM1NbDZ6Y5kpYXMi241gPlZl6lZvWYq5bpMXdaNdKEhGrnHOcqS7xRMEadsmZAUVUza/UtX8a59ZM5LcpmdGtJRiTMBZ3QDFGUgntq502n6mE7t2FPxOzQYK8Ma5/LYhAMhfGSX25aLNhE/iNvdf0XUiarZEmbLelnAHTjJ1LXpBM6fr4sbwkMJ6IvCpahm/wnVEjgFKzDTWetGk1wpzScPtM3gzePT8Nvz9/tuL41evm/P6wHvifev5Xuj94L3yfvG3pmHD/4+Pdw73C/s+h86PzR+bOG7u81nK+91tP56z9G0Lzs</latexit>26
y’ is an entire sequence
Ψ(w, y) =
M+1
X
m=1
θ · f(w, ym, ym−1, m)
<latexit sha1_base64="6QgzZybjFg9uG/YPephMOPdGJY8=">AHcnicnVTdbts2FZTr+68nzbdXvDLghirV5gdQXWmwJFt4vdNPMAp+kQOgJFUTYdUhJIyolBcO+5J9gL7AFGUrJrOW4CTICp43PO953vHFJMSkalGg7/vrd3v/PFg+7DL3tf3Nt48e7z/5ItKYHKC1aIjwmShNGcnCqGPlYCoJ4wshZcvmLi58tiJC0yMdqWZIJR9OcZhQjZV3x/v1/eodwhpRWJo4ucvAGQCSmHF3HWjmHAaO+NwDkNAVXzgx7hzfTYCYQ1qO+z/C5PhA28NCsYqHZBf8sHfYxMBG0OLUhTXANo1jTV9E5mL3CXSJot64O3ZNmtFqumPkQn/xyw+cW8P5PKwM7Si1j156Gr6mk8GRh+Wy4T0NkXOTxHpu+kWsnFbEyhlao2TFV6gmcgvUtTk2Gy2Oa710Q9qVkzY3F5q+mJtBrfTS/XV6w2a+61w6WPfiaP1xgOS67EM1IwoBiNCgcxBndQG+2agYxyqTt4ghSezKQmHE9J/G7KQ9GtxB6vSOa7Vn2xJHkvbPBmC8Xp8BGxtsFkctBFHjvgTvr0R+sTA0obu7hn0PAdMuL6yQfdemhXbWg73pO9fRMYl1BMwzQicJzNrhmXM3aK5ZR8AHsaPD4bHQ/+Am0bUGAdB84zi/T0B0wJXnOQKMyTleTQs1UQjoShmxPRgJUmJ8CWaknNr5ogTOdH+SjLg0HpSkBXC/nIFvHcToRGXcskTm+lmK7djzrkrdl6p7PVE07ysFMlxXSirGFAFcPcbSKkgWLGlNRAW1GoFeIbsTit7C9qd2igzI2xBVLsRzCdaZr56S1LCTRt8XTfag4Lk5AoXnKM8/UHDHKlinJUMWUO03Zyt41r0G6oKVsRremZETBQtApzRFjJFPQLW23fc0U9GsP/krsBgny3qr+vSQCqUJYJfWHbeyGTeFzd6eY2zJpvs60Zrst7QXYZtxcipLk2tRfBCskgclUFXZEnwD74VaApTZbajzSRtWZ9hTGm2fyZvGh5fH0U/HL/94dfD2XNeHwbPgu+DfhAFPwdvg9+CUXAa4M5JR3VM568H/3afdp93m8O9d6/BfBe0nu7gPxLUizA=</latexit>P(y | w) = exp(Ψ(w, y)) P
y0∈Y(w)
exp(Ψ(w, y0))
26
Logistic regression, plus weights over pairs of labels:
$$P(y \mid w) = \frac{\exp(\Psi(w, y))}{\sum_{y' \in \mathcal{Y}(w)} \exp(\Psi(w, y'))}$$
where y' is an entire sequence, and
$$\Psi(w, y) = \sum_{m=1}^{M+1} \theta \cdot f(w, y_m, y_{m-1}, m)$$
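The two formulas above can be checked numerically on a toy example. The following is a minimal sketch, not the lecture's implementation: the label set, the START/STOP padding symbols, the two indicator feature templates ("trans" and "emit"), and the weights in `theta` are all assumptions made for illustration. It computes Ψ(w, y) as a sum of local scores, normalizes by brute-force enumeration over every label sequence y', and compares that normalizer against a forward-style dynamic program.

```python
import itertools
import math

LABELS = ["B", "I", "O"]        # hypothetical label set
START, STOP = "<s>", "</s>"     # hypothetical padding symbols for y_0 and y_{M+1}


def local_score(theta, words, y_m, y_prev, m):
    """theta . f(w, y_m, y_{m-1}, m), using two assumed indicator templates."""
    score = theta.get(("trans", y_prev, y_m), 0.0)
    if m <= len(words):         # position M+1 is the transition into STOP; no word there
        score += theta.get(("emit", words[m - 1], y_m), 0.0)
    return score


def psi(theta, words, labels):
    """Psi(w, y) = sum over m = 1..M+1 of the local scores."""
    padded = [START] + list(labels) + [STOP]
    return sum(local_score(theta, words, padded[m], padded[m - 1], m)
               for m in range(1, len(padded)))


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_partition_brute(theta, words):
    """Log of the denominator: enumerate every whole sequence y'."""
    return logsumexp([psi(theta, words, y)
                      for y in itertools.product(LABELS, repeat=len(words))])


def log_partition_forward(theta, words):
    """Same quantity via dynamic programming over positions (assumes M >= 1)."""
    alpha = {y: local_score(theta, words, y, START, 1) for y in LABELS}
    for m in range(2, len(words) + 1):
        alpha = {y: logsumexp([alpha[yp] + local_score(theta, words, y, yp, m)
                               for yp in LABELS])
                 for y in LABELS}
    return logsumexp([alpha[yp] + local_score(theta, words, STOP, yp, len(words) + 1)
                      for yp in LABELS])


def prob(theta, words, labels):
    """P(y | w) = exp(Psi(w, y)) / sum_{y'} exp(Psi(w, y'))."""
    return math.exp(psi(theta, words, labels) - log_partition_brute(theta, words))


if __name__ == "__main__":
    words = ["United", "Airlines", "said"]
    theta = {                   # made-up weights, for illustration only
        ("emit", "United", "B"): 2.0,
        ("emit", "Airlines", "I"): 2.0,
        ("emit", "said", "O"): 2.0,
        ("trans", "O", "I"): -5.0,
        ("trans", START, "I"): -5.0,
    }
    print(prob(theta, words, ("B", "I", "O")))
    print(log_partition_brute(theta, words), log_partition_forward(theta, words))
```

The brute-force sum grows exponentially with sentence length, while the forward pass is linear in length and quadratic in the number of labels; this works because each local score only looks at adjacent label pairs.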
27
28
Citing high fuel prices, [ORG United Airlines] said [TIME Friday] it has increased fares by [MONEY $6] per round trip on flights to some cities also served by lower-cost carriers. [ORG American Airlines], a unit of [ORG AMR Corp.], immediately matched the move, spokesman [PER Tim Wagner] said. [ORG United], a unit of [ORG UAL Corp.], said the increase took effect [TIME Thursday] and applies to most routes where it competes against discount carriers, such as [LOC Chicago] to [LOC Dallas] and [LOC Denver] to [LOC San Francisco].
29
Type                  Tag  Sample Categories             Example sentences
People                PER  people, characters            Turing is a giant of computer science.
Organization          ORG  companies, sports teams       The IPCC warned about the cyclone.
Location              LOC  regions, mountains, seas      The Mt. Sanitas loop is in Sunshine Canyon.
Geo-Political Entity  GPE  countries, states, provinces  Palo Alto is raising the fees for parking.
Facility              FAC  bridges, buildings, airports  Consider the Golden Gate Bridge.
Vehicles              VEH  planes, trains, automobiles   It was a classic Ford Falcon.
Figure 18.1 A list of generic named entity types with the kinds of entities they refer to.
30
Example from PubTator: https://www.ncbi.nlm.nih.gov/research/pubtator/?view=publication&pmid=32939514
Entity types in the figure legend: chemical, disease, species, gene
31
Name           Possible Categories
Washington     Person, Location, Political Entity, Organization, Vehicle
Downing St.    Location, Organization
IRA            Person, Organization, Monetary Instrument
Louis Vuitton  Person, Organization, Commercial Product
PER
[ORG Washington] went up 2 games to 1 in the four-game series.
Blair arrived in [LOC Washington] for what may well be his last state visit.
In June, [GPE Washington] passed a primary seatbelt law.
The [VEH Washington] had proved to be a leaky ship, every passage I made...
32
[ORG American Airlines], a unit of [ORG AMR Corp.], immediately matched the move, spokesman [PER Tim Wagner] said.

Words        IOB Label  IO Label
American     B-ORG      I-ORG
Airlines     I-ORG      I-ORG
,            O          O
a            O          O
unit         O          O
of           O          O
AMR          B-ORG      I-ORG
Corp.        I-ORG      I-ORG
,            O          O
immediately  O          O
matched      O          O

IOB is also written BIO. Also, IOBES / BILOU:

Cu       B-Material
foils    L-Material
were     O
placed   U-Operation
inside   O
a        O
quartz   B-Apparatus
tube     I-Apparatus
furnace  L-Apparatus
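The tagging schemes above differ only in how a labeled span is spelled out token by token. The following is a minimal sketch of that conversion; the function name `spans_to_tags` and the `(start, end, type)` span format with an exclusive end index are assumptions chosen for illustration, not something defined in the slides.

```python
def spans_to_tags(n_tokens, spans, scheme="BIO"):
    """Turn entity spans into per-token tags.

    spans: iterable of (start, end, entity_type), with end exclusive.
    scheme: "IO", "BIO" (a.k.a. IOB), or "BILOU" (a.k.a. IOBES).
    """
    tags = ["O"] * n_tokens
    for start, end, etype in spans:
        for i in range(start, end):
            if scheme == "IO":
                prefix = "I"
            elif scheme == "BIO":
                prefix = "B" if i == start else "I"
            elif scheme == "BILOU":
                if end - start == 1:
                    prefix = "U"            # unit-length entity
                elif i == start:
                    prefix = "B"            # beginning
                elif i == end - 1:
                    prefix = "L"            # last token
                else:
                    prefix = "I"            # inside
            else:
                raise ValueError(f"unknown scheme: {scheme}")
            tags[i] = f"{prefix}-{etype}"
    return tags


if __name__ == "__main__":
    tokens = "Cu foils were placed inside a quartz tube furnace".split()
    spans = [(0, 2, "Material"), (3, 4, "Operation"), (6, 9, "Apparatus")]
    for token, tag in zip(tokens, spans_to_tags(len(tokens), spans, "BILOU")):
        print(token, tag)
```

Note that IO cannot mark where one entity ends and an adjacent entity of the same type begins, which is the information the B prefix adds.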
33
identity of w_i, identity of neighboring words
embeddings for w_i, embeddings for neighboring words
part of speech of w_i, part of speech of neighboring words
base-phrase syntactic chunk label of w_i and neighboring words
presence of w_i in a gazetteer
w_i contains a particular prefix (from all prefixes of length ≤ 4)
w_i contains a particular suffix (from all suffixes of length ≤ 4)
w_i is all upper case
word shape of w_i, word shape of neighboring words
short word shape of w_i, short word shape of neighboring words
presence of hyphen
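Several of the features listed above are simple string functions of the token. The following is a minimal sketch of a feature extractor covering the surface-form features only (no embeddings, part of speech, or chunk labels). The word-shape convention used here (uppercase to `X`, lowercase to `x`, digit to `d`, other characters kept, and runs collapsed for the short shape) is one common definition and is an assumption, as are the function names, the feature-name strings, and the `gazetteer` argument.

```python
def word_shape(word):
    """Map uppercase -> X, lowercase -> x, digit -> d; keep other characters."""
    out = []
    for ch in word:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        elif ch.isdigit():
            out.append("d")
        else:
            out.append(ch)
    return "".join(out)


def short_word_shape(word):
    """Word shape with consecutive repeated symbols collapsed to one."""
    shape = word_shape(word)
    return "".join(c for i, c in enumerate(shape) if i == 0 or c != shape[i - 1])


def token_features(words, i, gazetteer=frozenset()):
    """Sparse indicator features for words[i]; gazetteer is a hypothetical set of names."""
    w = words[i]
    feats = {
        f"word={w}",
        f"shape={word_shape(w)}",
        f"short_shape={short_word_shape(w)}",
    }
    if i > 0:
        feats.add(f"prev_word={words[i - 1]}")
        feats.add(f"prev_shape={word_shape(words[i - 1])}")
    if i + 1 < len(words):
        feats.add(f"next_word={words[i + 1]}")
        feats.add(f"next_shape={word_shape(words[i + 1])}")
    for k in range(1, min(4, len(w)) + 1):   # all prefixes/suffixes of length <= 4
        feats.add(f"prefix={w[:k]}")
        feats.add(f"suffix={w[-k:]}")
    if w.isupper():
        feats.add("all_upper")
    if "-" in w:
        feats.add("has_hyphen")
    if w in gazetteer:
        feats.add("in_gazetteer")
    return feats


if __name__ == "__main__":
    print(word_shape("DC10-30"), short_word_shape("DC10-30"))
    print(sorted(token_features(["United", "Airlines", "said"], 0)))
```

Each returned string plays the role of one indicator dimension of f; pairing it with a candidate label gives the kind of feature that a weight in θ would score.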
34
multi-token entities, not single words. American Airlines said Friday …
B-ORG O O O
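The slide fragment above appears to contrast scoring whole entities with scoring individual tokens. The following is a minimal sketch under that reading: it decodes BIO tags into spans and computes entity-level precision, recall, and F1 by exact span match. The gold tagging used in the example (`B-ORG I-ORG O B-TIME`) is an assumption, since the slide shows only the single tag row; the function names and the lenient treatment of a stray `I-` tag as starting a new span are also illustrative choices.

```python
def bio_to_spans(tags):
    """Decode BIO tags into a set of (start, end, type) spans, end exclusive."""
    spans = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):      # sentinel "O" closes a final span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        # A stray I- tag with no open span is treated as starting one (lenient choice).
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def entity_prf(gold_tags, pred_tags):
    """Entity-level precision, recall, F1 with exact span-and-type matching."""
    gold, pred = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def token_accuracy(gold_tags, pred_tags):
    return sum(g == p for g, p in zip(gold_tags, pred_tags)) / len(gold_tags)


if __name__ == "__main__":
    tokens = ["American", "Airlines", "said", "Friday"]
    gold = ["B-ORG", "I-ORG", "O", "B-TIME"]          # assumed gold annotation
    pred = ["B-ORG", "O", "O", "O"]                   # tag row from the slide
    print("token accuracy:", token_accuracy(gold, pred))
    print("entity P/R/F1:", entity_prf(gold, pred))
```

Under these assumptions half of the tokens are tagged correctly, yet no predicted span matches a gold entity exactly, so the entity-level scores are all zero.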
35
combine them.
Image from: Lample et al. Neural Architectures for Named Entity Recognition. NAACL 2016.
36