Introduction to Machine Learning: Classification and The Noisy Channel Model

  1. Introduction to Machine Learning: Classification and The Noisy Channel Model. CMSC 473/673, UMBC. Some slides adapted from 3SLP.

  2. Outline: Classification; Why incorporate uncertainty; Classification with Bayes Rule; Example: Email Classifier; Evaluation

  3. Probabilistic Classification. Directly model the posterior, $p(Y \mid X) = h(X; Y)$: a discriminatively trained classifier. Model the posterior with Bayes rule, $p(Y \mid X) \propto p(X \mid Y) \, p(Y)$: a generatively trained classifier.

  5. Classification. Candidate labels: POLITICS, TERRORISM, SPORTS, TECH, HEALTH, FINANCE. Document: "Three people have been fatally shot, and five people, including a mayor, were seriously wounded as a result of a Shining Path attack today against a community in Junin department, central Peruvian mountain region. …"

  7. Classification. Candidate labels: POLITICS, TERRORISM, SPORTS, TECH, HEALTH, FINANCE. Document: "Electronic alerts have been used to assist the authorities in moments of chaos and potential danger: after the Boston bombing in 2013, when the Boston suspects were still at large, and last month in Los Angeles, during an active shooter scare at the airport. …" Source: http://www.nytimes.com/2016/09/20/nyregion/cellphone-alerts-used-in-search-of-manhattan-bombing-suspect.html

  9. Classify with Uncertainty: use probabilities.

  10. Classify with Uncertainty: use probabilities*. (*There are non-probabilistic ways to handle uncertainty… but probabilities sure are handy!)

  11. Classification. Label probabilities: POLITICS .05, TERRORISM .48, SPORTS .0001, TECH .39, HEALTH .0001, FINANCE .0002. Document: "Electronic alerts have been used to assist the authorities in moments of chaos and potential danger: after the Boston bombing in 2013, when the Boston suspects were still at large, and last month in Los Angeles, during an active shooter scare at the airport. …" Source: http://www.nytimes.com/2016/09/20/nyregion/cellphone-alerts-used-in-search-of-manhattan-bombing-suspect.html
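
The label scores on this slide can be read as a distribution over classes, with the predicted class being the most probable one. A minimal Python sketch using the slide's numbers follows; the `predict` helper and the `ranked` list are hypothetical names introduced only for illustration.

```python
# Label probabilities copied from the slide (as shown there, they do not
# sum exactly to 1).
scores = {
    "POLITICS": 0.05,
    "TERRORISM": 0.48,
    "SPORTS": 0.0001,
    "TECH": 0.39,
    "HEALTH": 0.0001,
    "FINANCE": 0.0002,
}


def predict(scores):
    """Return the most probable label together with its probability."""
    label = max(scores, key=scores.get)
    return label, scores[label]


label, prob = predict(scores)

# Ranking all labels keeps the uncertainty visible: the runner-up is close.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(label, prob)
print(ranked[:2])
```

Keeping the full ranking rather than only the top label shows how close the second-best class is, which is the information a hard decision throws away.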

  12. Text Classification. Tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; …

  13. Text Classification. Tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; … Input: a document; a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$. Output: a predicted class $c \in C$.

  14. Text Classification. Tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; … Input: a document (linguistic blob); a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$. Output: a predicted class $c \in C$.

  15. Text Classification: Hand-coded Rules? Tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; … Rules based on combinations of words or other features, e.g. spam: black-list-address OR ("dollars" AND "have been selected"). Accuracy can be high if rules are carefully refined by an expert. Building and maintaining these rules is expensive. Can humans faithfully assign uncertainty?
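
The spam rule on this slide can be written directly as a boolean expression. The sketch below is one possible reading of it in Python; the function name `is_spam`, the example addresses, and the case-insensitive substring matching are assumptions, not part of the slides.

```python
def is_spam(sender, body, blacklist):
    """Hand-coded rule from the slide:
    black-listed address OR ("dollars" AND "have been selected")."""
    text = body.lower()
    return sender.lower() in blacklist or (
        "dollars" in text and "have been selected" in text
    )


# Hypothetical black-listed address, stored in lowercase.
blacklist = {"promo@example.com"}

blocked_sender = is_spam("promo@example.com", "Hello", blacklist)
phrase_match = is_spam(
    "friend@example.org",
    "You have been selected to win a million dollars!",
    blacklist,
)
no_match = is_spam("friend@example.org", "Lunch costs ten dollars", blacklist)

print(blocked_sender, phrase_match, no_match)  # True True False
```

The rule returns only a hard yes or no, which illustrates the slide's closing question: nothing in a rule like this says how confident the decision is.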

  16. Text Classification: Supervised Machine Learning. Tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; … Input: a document $d$; a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$; a training set of $m$ hand-labeled documents $(d_1, c_1), \ldots, (d_m, c_m)$. Output: a learned classifier $\gamma$ that maps documents to classes.

  17. Text Classification: Supervised Machine Learning. Tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; … Input: a document $d$; a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$; a training set of $m$ hand-labeled documents $(d_1, c_1), \ldots, (d_m, c_m)$. Output: a learned classifier $\gamma$ that maps documents to classes. Classifier types: Naïve Bayes, logistic regression, support-vector machines, k-Nearest Neighbors, …
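
The input/output contract above (a hand-labeled training set in, a classifier that maps documents to classes out) can be sketched with the simplest of the listed methods, a nearest-neighbor classifier with k = 1. This is only an illustrative sketch: the toy training set, the whitespace tokenizer, and the choice of Jaccard word overlap as the similarity are all assumptions.

```python
def tokenize(doc):
    """Lowercase a document and split it into a set of words."""
    return set(doc.lower().split())


def train(training_set):
    """Take (document, class) pairs and return a classifier gamma
    that maps a new document to the class of its nearest neighbor."""
    examples = [(tokenize(d), c) for d, c in training_set]

    def gamma(doc):
        words = tokenize(doc)

        def jaccard(other):
            union = words | other
            return len(words & other) / len(union) if union else 0.0

        # Pick the training example with the highest word overlap.
        _, best_class = max(examples, key=lambda ex: jaccard(ex[0]))
        return best_class

    return gamma


# Hypothetical hand-labeled training set (d_1, c_1), ..., (d_m, c_m).
training_set = [
    ("win free dollars now", "spam"),
    ("you have been selected for a prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lecture notes on bayes rule", "ham"),
]

gamma = train(training_set)
first = gamma("free prize dollars")
second = gamma("notes for monday lecture")
print(first, second)
```

Unlike the hand-coded rule, nothing here is specific to spam: swapping in a different labeled training set yields a classifier for a different task.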

  19. Multi-class Classification: given input $x$, predict discrete label $y$. Multi-label Classification.

  20. Multi-class Classification: given input $x$, predict discrete label $y$. If $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task. Multi-label Classification.

  21. Multi-class Classification: given input $x$, predict discrete label $y$. If $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task. If $y \in \{0, 1, \ldots, K-1\}$ (for finite $K$), then a multi-class classification task. Q: What are some examples of multi-class classification? Multi-label Classification.

  22. Multi-class Classification: given input $x$, predict discrete label $y$. Single output: if $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task; if $y \in \{0, 1, \ldots, K-1\}$ (for finite $K$), then a multi-class classification task. Multi-output: if multiple $y_l$ are predicted, then a multi-label classification task. Multi-label Classification.

  23. Multi-class Classification: given input $x$, predict discrete label $y$. Single output: if $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task; if $y \in \{0, 1, \ldots, K-1\}$ (for finite $K$), then a multi-class classification task. Multi-output: if multiple $y_l$ are predicted, then a multi-label classification task. Multi-label Classification: given input $x$, predict multiple discrete labels $y = (y_1, \ldots, y_L)$.

  24. Multi-class Classification: given input $x$, predict discrete label $y$. Single output: if $y \in \{0, 1\}$ (or $y \in \{\text{True}, \text{False}\}$), then a binary classification task; if $y \in \{0, 1, \ldots, K-1\}$ (for finite $K$), then a multi-class classification task. Multi-output: if multiple $y_l$ are predicted, then a multi-label classification task; each $y_l$ could be binary or multi-class. Multi-label Classification: given input $x$, predict multiple discrete labels $y = (y_1, \ldots, y_L)$.
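
The taxonomy on these slides depends on two things: how many outputs are predicted, and how many values each output can take. A small Python sketch of that decision follows; the function `task_type` and the example tasks are hypothetical and only restate the slide's definitions.

```python
def task_type(classes_per_output):
    """Name the task given the number of possible values of each output.

    classes_per_output has one entry per predicted label; each entry is
    the number of values that label can take (2 means binary).
    """
    if not classes_per_output or any(k < 2 for k in classes_per_output):
        raise ValueError("need at least one output with at least 2 classes")
    if len(classes_per_output) > 1:
        # Several labels predicted at once; each may be binary or multi-class.
        return "multi-label"
    return "binary" if classes_per_output[0] == 2 else "multi-class"


spam_detection = task_type([2])      # one label in {0, 1}
news_topic = task_type([6])          # one label among six categories
topic_and_spam = task_type([6, 2])   # two labels predicted together

print(spam_detection, news_topic, topic_and_spam)
```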

  26. Probabilistic Text Classification. Tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; … $p(Y \mid X) = \frac{p(X \mid Y) \, p(Y)}{p(X)}$, where $Y$ is the class and $X$ is the observed data.

  27. Probabilistic Text Classification. Tasks: assigning subject categories, topics, or genres; spam detection; authorship identification; age/gender identification; language identification; sentiment analysis; … $p(Y \mid X) = \frac{p(X \mid Y) \, p(Y)}{p(X)}$, where $Y$ is the class, $X$ is the observed data, $p(Y)$ is the prior probability of the class, $p(X \mid Y)$ is the class-based likelihood (language model), and $p(X)$ is the observation likelihood (averaged over all classes).
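
A small worked example may help connect the three annotated terms of Bayes rule. The numbers below are made up for illustration (a prior over two classes and the likelihood of one observed message under each class); the observation likelihood is computed by averaging the class-based likelihoods over all classes, weighted by the prior.

```python
# Hypothetical prior probability of each class.
prior = {"spam": 0.3, "ham": 0.7}

# Hypothetical class-based likelihood of the observed data under each class.
likelihood = {"spam": 0.2, "ham": 0.01}


def posterior(prior, likelihood):
    """Apply Bayes rule: posterior = likelihood * prior / evidence."""
    # Observation likelihood: sum over classes of likelihood * prior.
    evidence = sum(likelihood[c] * prior[c] for c in prior)
    return {c: likelihood[c] * prior[c] / evidence for c in prior}


post = posterior(prior, likelihood)
print(post)
```

Here the evidence is 0.3 * 0.2 + 0.7 * 0.01 = 0.067, so the posterior for spam is 0.06 / 0.067, roughly 0.9, even though spam had the smaller prior.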

  29. Classification with Bayes Rule: $\arg\max_Y \, p(Y \mid X)$

  30. Classification with Bayes Rule: $\arg\max_Y \frac{p(X \mid Y) \, p(Y)}{p(X)}$

  31. Classification with Bayes Rule: $\arg\max_Y \frac{p(X \mid Y) \, p(Y)}{p(X)}$, where $p(X)$ is constant with respect to $Y$

  32. Classification with Bayes Rule: $\arg\max_Y \, p(X \mid Y) \, p(Y)$

  33. Classification with Bayes Rule: $\arg\max_Y \left[ \log p(X \mid Y) + \log p(Y) \right]$
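
The final decision rule needs only the class-based likelihood and the prior, both in log space. The sketch below assumes, purely for illustration, that the class-based likelihood is a unigram language model (each word is scored independently given the class); the vocabulary, the probabilities, and the names `log_score` and `classify` are hypothetical.

```python
import math

# Hypothetical log prior for each class.
log_prior = {"spam": math.log(0.3), "ham": math.log(0.7)}

# Hypothetical per-class unigram language models over a tiny vocabulary.
word_probs = {
    "spam": {"dollars": 0.5, "selected": 0.3, "meeting": 0.2},
    "ham": {"dollars": 0.1, "selected": 0.1, "meeting": 0.8},
}


def log_score(words, label):
    """log p(X | Y) + log p(Y), with p(X | Y) a product of word probabilities.

    Words outside the toy vocabulary raise KeyError; a real model would
    need smoothing or an unknown-word entry.
    """
    return log_prior[label] + sum(math.log(word_probs[label][w]) for w in words)


def classify(words):
    """Return the class with the highest log score (the argmax over classes)."""
    return max(log_prior, key=lambda label: log_score(words, label))


first = classify(["dollars", "selected"])
second = classify(["meeting"])
print(first, second)

# Why logs: a product of many small probabilities underflows to zero,
# while the equivalent sum of logs stays representable.
product = math.prod([0.1] * 400)
log_sum = sum(math.log(0.1) for _ in range(400))
print(product, log_sum)
```

Because the logarithm is monotonic, taking logs does not change which class wins; it only replaces an underflow-prone product with a sum.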
