Natural Language Processing CSCI 4152/6509 — Lecture 12 Classifier Evaluation
Instructor: Vlado Keselj Time and date: 09:35–10:25, 31-Jan-2020 Location: Dunn 135
CSCI 4152/6509, Vlado Keselj Lecture 12 1 / 29
Previous Lecture
◮ IR
◮ Precision, Recall, F-measure
◮ Text classification as a text mining problem
◮ Types of text classification
precision ($\frac{a}{a+b}$), recall ($\frac{a}{a+c}$), fallout ($\frac{b}{b+d}$),
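The ratios above can be computed directly from four contingency counts. The following is a minimal sketch, assuming the usual layout in which a counts relevant retrieved items, b non-relevant retrieved items, c relevant items that were missed, and d non-relevant items that were correctly left out. The function name and the choice to return 0.0 for an empty denominator are illustrative assumptions, not part of the slides.

```python
def contingency_measures(a, b, c, d):
    """Return (precision, recall, fallout) from contingency counts.

    Assumed layout (not stated in the surviving slide text):
      a = relevant and retrieved        b = not relevant but retrieved
      c = relevant but not retrieved    d = not relevant and not retrieved
    """
    # Returning 0.0 for an empty denominator is a convention chosen here.
    precision = a / (a + b) if (a + b) else 0.0
    recall = a / (a + c) if (a + c) else 0.0
    fallout = b / (b + d) if (b + d) else 0.0
    return precision, recall, fallout


# Made-up counts, purely for illustration.
p, r, f = contingency_measures(a=30, b=10, c=20, d=940)
print(p, r, f)
```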
◮ Underfitting and Overfitting
[Figure: training data → training → classifier; classifier + testing data → evaluation]
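The flow from training data to a classifier, and from testing data to an evaluation, can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names, the 20% test fraction, the fixed seed, and the toy "majority label" classifier are all hypothetical and not taken from the lecture.

```python
import random
from collections import Counter


def holdout_split(items, test_fraction=0.2, seed=0):
    """Shuffle the labelled items and split them into (training, testing)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def train_majority(train_set):
    """Toy training step: always predict the most frequent training label."""
    majority = Counter(label for _, label in train_set).most_common(1)[0][0]
    return lambda x: majority


def accuracy(classifier, test_set):
    """Fraction of testing items whose predicted label matches the true one."""
    correct = sum(1 for x, label in test_set if classifier(x) == label)
    return correct / len(test_set)


# Made-up data: 30 items, two thirds labelled "pos".
data = [(i, "pos" if i % 3 else "neg") for i in range(30)]
train_set, test_set = holdout_split(data)
classifier = train_majority(train_set)
print(accuracy(classifier, test_set))
```

The key point is that the testing items are never seen during training, so the reported score is not inflated by memorized examples.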
[Figure: n-fold cross-validation. Classifier i (i = 1, ..., n) is trained on n−1 of the folds and evaluated on the one remaining fold.]
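The n-fold scheme in the figure can be sketched as a loop that holds out one fold at a time. This is a minimal sketch, not the lecture's implementation: the function names, the round-robin fold assignment, and the toy majority-label classifier are assumptions made for illustration.

```python
from collections import Counter


def make_folds(items, n):
    """Assign items to n folds round-robin (an arbitrary choice made here)."""
    return [items[i::n] for i in range(n)]


def cross_validate(items, n, train, evaluate):
    """Train n classifiers, each evaluated on the fold it did not see."""
    folds = make_folds(list(items), n)
    scores = []
    for i in range(n):
        test_fold = folds[i]
        train_items = [x for j, fold in enumerate(folds) if j != i for x in fold]
        classifier = train(train_items)
        scores.append(evaluate(classifier, test_fold))
    return scores


def train_majority(train_set):
    """Toy training step: always predict the most frequent training label."""
    majority = Counter(label for _, label in train_set).most_common(1)[0][0]
    return lambda x: majority


def accuracy(classifier, test_set):
    """Fraction of held-out items classified correctly."""
    return sum(1 for x, y in test_set if classifier(x) == y) / len(test_set)


# Made-up data: 20 items, three quarters labelled "pos".
data = [(i, "pos" if i % 4 else "neg") for i in range(20)]
scores = cross_validate(data, 5, train_majority, accuracy)
print(scores, sum(scores) / len(scores))
```

Averaging the per-fold scores gives a single estimate in which every item has been used for testing exactly once.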
◮ cosine similarity,
◮ Euclidean similarity, or
◮ some other type of vector similarity
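The first two options in the list can be illustrated with plain Python lists as vectors. Cosine similarity and Euclidean distance are standard. The slide does not say how a Euclidean "similarity" is obtained from the distance, so the 1/(1+d) conversion below is only one possible choice and is an assumption, as are the function names.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either vector is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def euclidean_similarity(u, v):
    """Assumed conversion: 1 / (1 + distance), so identical vectors score 1.0."""
    return 1.0 / (1.0 + euclidean_distance(u, v))


doc_a = [1.0, 2.0, 0.0]
doc_b = [2.0, 4.0, 0.0]
print(cosine_similarity(doc_a, doc_b), euclidean_similarity(doc_a, doc_b))
```

Cosine similarity ignores vector length, so doc_a and doc_b count as maximally similar, while the Euclidean measure still penalizes the difference in magnitude.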
$\frac{f_1(g)+f_2(g)}{2}$
$\frac{f_1(n)+f_2(n)}{2}$
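Only the average of the two frequencies survives from this slide, so the full formula is not recoverable from the text. One common use of such an average is as the normalizer in a relative-difference dissimilarity between two frequency profiles. The sketch below assumes that form; the function name, the dict-based profiles, and the squared-sum shape are assumptions and may not match the slide.

```python
def relative_difference(profile1, profile2):
    """Sum over n-grams of ((f1 - f2) / ((f1 + f2) / 2)) squared.

    Profiles are dicts mapping an n-gram to its (positive) frequency;
    an n-gram missing from a profile is treated as frequency 0.
    This overall form is an assumption, not recovered from the slide.
    """
    total = 0.0
    for gram in set(profile1) | set(profile2):
        f1 = profile1.get(gram, 0.0)
        f2 = profile2.get(gram, 0.0)
        mean = (f1 + f2) / 2
        if mean == 0:
            continue  # both frequencies are zero, nothing to compare
        total += ((f1 - f2) / mean) ** 2
    return total


# Made-up profiles, purely for illustration.
p1 = {"the": 0.30, "he_": 0.20}
p2 = {"the": 0.30, "he_": 0.10, "and": 0.05}
print(relative_difference(p1, p2))
```

Dividing by the average makes each term a relative difference, so rare and frequent n-grams contribute on the same scale.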
◮ done by merging all texts in each class into one
◮ another option: centroid of profiles of
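The first option, merging all texts in a class into one and building a single profile from the result, can be sketched as follows. Everything beyond the merging step is an assumption made for illustration: the use of character n-grams, normalized frequencies, truncation to the most frequent entries, the space used as a separator, and all function names.

```python
from collections import Counter


def char_ngrams(text, n):
    """All overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def profile(text, n=3, size=5):
    """Map the `size` most frequent n-grams to their normalized frequencies."""
    counts = Counter(char_ngrams(text, n))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.most_common(size)}


def class_profile(texts, n=3, size=5):
    """Merge all texts of one class into a single text, then profile it."""
    merged = " ".join(texts)  # the separator is an arbitrary choice here
    return profile(merged, n, size)


# Made-up class with two tiny documents.
print(class_profile(["abab", "abab"], n=2, size=2))
```

A new document can then be profiled the same way and assigned to the class whose profile it most resembles under a chosen similarity measure.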