
CS325 Artificial Intelligence: Natural Language Processing I (Ch. 22)
Dr. Cengiz Günay, Emory Univ., Spring 2013

AI in Natural Language Processing (NLP): What's NLP?


  1. Remember Bag of Words? P(Hello) = 2/5, P(I) = 1/5 = P(Will) = P(Say). Words are independent? Called unigram or 1-gram: $P(w_1, w_2, \ldots, w_n) = \prod_i P(w_i)$
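
A minimal sketch of the bag-of-words idea above, assuming the toy corpus "Hello Hello I will say" (which reproduces P(Hello) = 2/5 from the slide): count each word and score a sentence as the product of independent word probabilities.

    from collections import Counter

    # Toy corpus, assumed for illustration; it reproduces P(Hello) = 2/5.
    corpus = "Hello Hello I will say".split()
    counts = Counter(corpus)
    total = sum(counts.values())

    def p_unigram(word):
        """Unigram (bag-of-words) probability: relative frequency in the corpus."""
        return counts[word] / total

    def p_sentence(words):
        """Independence assumption: P(w_1, ..., w_n) = prod_i P(w_i)."""
        p = 1.0
        for w in words:
            p *= p_unigram(w)
        return p

    print(p_unigram("Hello"))                      # 0.4 (= 2/5)
    print(p_sentence("I will say Hello".split()))  # same value for any word order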

  2. Can we get more from Bayes? Distinguish between "I will say hello" and "I hello say will": P("hello" | "I will say") > P("will" | "I hello say"). Words dependent on previous words: called N-gram. $P(w_1, w_2, \ldots, w_n) = P(w_{1:n}) = \prod_i P(w_i \mid w_{1:(i-1)})$
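
Spelled out for the four-word example above, and before any Markov assumption, the chain rule reads:

    $P(\text{I will say hello}) = P(\text{I})\,P(\text{will} \mid \text{I})\,P(\text{say} \mid \text{I will})\,P(\text{hello} \mid \text{I will say})$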

  3. Must Remember All Words That Came Before? P("1752" | "Thomas Bayes . . .") = ? Markov assumption: only remember the last N words: N-gram. $P(w_{1:k}) = \prod_{i=1}^{k} P(w_i \mid w_{(i-N):(i-1)})$
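
A minimal sketch of the bigram (N = 2) case, assuming a small made-up training text; it gives a nonzero score to "i will say hello" and a zero score to "i hello say will", in the spirit of the comparison above.

    from collections import Counter

    # Toy training text, assumed for illustration only.
    tokens = "i will say hello . i will say goodbye . you will say hello .".split()

    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    def p_bigram(word, prev):
        """Markov (N = 2) estimate: P(word | prev) = count(prev, word) / count(prev)."""
        return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

    def p_sentence(words):
        """First word scored by its unigram frequency, the rest by bigram probabilities."""
        p = unigrams[words[0]] / sum(unigrams.values())
        for prev, word in zip(words, words[1:]):
            p *= p_bigram(word, prev)
        return p

    print(p_sentence("i will say hello".split()))  # > 0
    print(p_sentence("i hello say will".split()))  # 0.0: contains unseen bigrams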

  4. Let's Read Shakespeare... In Unigram. Unigram = 1-gram.

  5. Shakespeare In Bigram. N = 2: bigram.

  6. Shakespeare In Trigram. N = 3: trigram.

  7. Shakespeare In 4-gram.
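
The Shakespeare samples on these slides come from n-gram models trained on his text. A minimal bigram-sampling sketch of how such text can be generated (the training snippet and function names here are stand-ins, not the course's code):

    import random
    from collections import Counter, defaultdict

    def train_bigrams(tokens):
        """Map each word to a Counter of the words observed to follow it."""
        following = defaultdict(Counter)
        for prev, word in zip(tokens, tokens[1:]):
            following[prev][word] += 1
        return following

    def generate(following, start, length=10):
        """Sample a sequence by repeatedly drawing the next word given the previous one."""
        out = [start]
        while len(out) < length:
            counts = following.get(out[-1])
            if not counts:
                break
            words = list(counts)
            weights = [counts[w] for w in words]
            out.append(random.choices(words, weights=weights)[0])
        return " ".join(out)

    # Stand-in corpus; the slides use Shakespeare's text instead.
    tokens = "to be or not to be that is the question".split()
    print(generate(train_bigrams(tokens), start="to"))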

  8. Shakespeare N-gram Quiz. Find: 1 real quote, 3x unigram picks, 3x bigram picks, 3x trigram picks.

  9. Bigram Probability Question. P(^ woe is me | ^) = ? Given that ^ is the symbol marking the start of a sentence, P(woe_i | ^_{i-1}) = .0002, P(is_i | woe_{i-1}) = .07, and P(me_i | is_{i-1}) = .0005, the answer is $.0002 \times .07 \times .0005 = 7 \times 10^{-9}$.
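
A quick check of that arithmetic as a chain of the three bigram probabilities given on the slide:

    # P(woe | ^) * P(is | woe) * P(me | is)
    p = 0.0002 * 0.07 * 0.0005
    print(p)  # 7e-09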

  10. Other Tricks. Stationarity assumption: context doesn't change over time. Smoothing: remember Laplace smoothing? Hidden variables: e.g., identify what a "noun" is. Use abstractions: group "New York City", or just look at letters.
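
A minimal sketch of the Laplace (add-one) smoothing mentioned above, with a toy vocabulary assumed for illustration: unseen words keep a small amount of probability mass instead of getting zero.

    from collections import Counter

    # Toy counts and vocabulary, assumed for illustration of add-one (Laplace) smoothing.
    counts = Counter({"hello": 2, "i": 1, "will": 1, "say": 1})
    vocab = ["hello", "i", "will", "say", "goodbye"]  # "goodbye" was never observed
    total = sum(counts.values())

    def p_mle(word):
        """Unsmoothed estimate: unseen words get probability zero."""
        return counts[word] / total

    def p_laplace(word, k=1):
        """Add-k smoothing: pretend every vocabulary word was seen k extra times."""
        return (counts[word] + k) / (total + k * len(vocab))

    print(p_mle("goodbye"))      # 0.0
    print(p_laplace("goodbye"))  # 0.1: unseen words keep some probability mass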

  11. Smaller Than Words? What if we cannot distinguish words? English: "choosespain.com": "Choose Spain" or "Chooses Pain"? Segmentation: dividing text into words. Use Bayes again: $s^* = \max P(w_{1:n}) = \max \prod_i P(w_i \mid w_{1:(i-1)})$. Or use the Markov assumption (e.g., unigram): $s^* = \max \prod_i P(w_i)$

  12. Segmentation Complexity. $s^* = \max \prod_i P(w_i)$. What's the complexity of segmenting "nowisthetime"? Choices: (1) $n-1$, (2) $(n-1)^2$, (3) $(n-1)!$, (4) $2^{n-1}$, (5) $(n-1)^n$. Solution: place $n-1$ potential division points between the characters; each segmentation chooses whether each division exists or not, so there are $2^{n-1}$ possibilities.
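
A small brute-force check of that count: enumerate every segmentation by deciding, for each of the n - 1 gaps, whether to split there.

    from itertools import product

    def all_segmentations(text):
        """Enumerate every segmentation: each of the n - 1 gaps is either a split or not."""
        n = len(text)
        for splits in product([False, True], repeat=n - 1):
            words, start = [], 0
            for i, split_here in enumerate(splits, start=1):
                if split_here:
                    words.append(text[start:i])
                    start = i
            words.append(text[start:])
            yield words

    segs = list(all_segmentations("nowisthetime"))
    print(len(segs), 2 ** (len("nowisthetime") - 1))  # 2048 2048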

  13. Reducing Segmentation Complexity. Exploit independence: "nowisthetime"? Divide into the first word, $f$, and recurse on the rest, $r$: $s^* = \max_{s = f + r} P(f) \cdot s^*(r)$. Gives 99% accuracy and an easy implementation!
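
A minimal sketch of that recursion (in the spirit of Norvig's word-segmentation example, not necessarily the course's code), assuming a toy unigram probability table; memoization keeps the recursion cheap.

    from functools import lru_cache

    # Toy unigram probabilities, assumed for illustration; a real segmenter would
    # estimate these from a large corpus (with smoothing for unseen words).
    P = {"now": 0.01, "is": 0.02, "the": 0.05, "time": 0.01, "no": 0.005}

    def p_word(w):
        return P.get(w, 1e-10)  # tiny probability for unknown words

    @lru_cache(maxsize=None)
    def segment(text):
        """Best segmentation: try every first word f, recurse on the rest r."""
        if not text:
            return 1.0, []
        best_p, best_words = 0.0, [text]
        for i in range(1, len(text) + 1):
            f, r = text[:i], text[i:]
            p_rest, words_rest = segment(r)
            p = p_word(f) * p_rest
            if p > best_p:
                best_p, best_words = p, [f] + words_rest
        return best_p, best_words

    print(segment("nowisthetime")[1])  # ['now', 'is', 'the', 'time']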

  14. Segmentation Problems. How can we improve? (1) More data, (2) Markov, (3) Smoothing. Need to get the context.

  15. Segmentation Problems (2). How can we improve? (1) More data, (2) Markov, (3) Smoothing. Need to know more words.

  16. What Else Can We Do with Letters? Language identification?

  17. Bigram Recognition with Letters.

  18. Trigram Recognition with Letters. 99% accuracy from trigrams!
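
A minimal sketch of character-trigram language identification in this spirit, with tiny stand-in training strings (real systems train on large corpora per language and usually work in log space):

    from collections import Counter

    def char_trigrams(text):
        return [text[i:i + 3] for i in range(len(text) - 2)]

    def train(text):
        """Character-trigram counts for one language (toy training text, assumed)."""
        return Counter(char_trigrams(text))

    def score(model, text, k=1):
        """Add-one-smoothed product of trigram probabilities, as a simple language score."""
        total = sum(model.values())
        vocab = len(model) + 1
        p = 1.0
        for tri in char_trigrams(text):
            p *= (model[tri] + k) / (total + k * vocab)
        return p

    # Tiny stand-in training texts; real systems train on much more data per language.
    models = {
        "en": train("the quick brown fox jumps over the lazy dog and the cat"),
        "de": train("der schnelle braune fuchs springt ueber den faulen hund und die katze"),
    }

    text = "the dog and the cat"
    print(max(models, key=lambda lang: score(models[lang], text)))  # likely 'en'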

  19. Can We Identify Categories Too? Text classification.

  20. Text Classification. What algorithms can we use? Naive Bayes: spam vs. ham. k-Nearest Neighbor: similar words. Support Vector Machines: supervised learning.
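
A minimal Naive Bayes spam-vs-ham sketch of the first option, assuming a toy labeled training set and add-one smoothing:

    from collections import Counter
    import math

    # Toy labeled messages, assumed for illustration only.
    train = [
        ("win money now", "spam"),
        ("free money offer", "spam"),
        ("meeting at noon", "ham"),
        ("lunch tomorrow at noon", "ham"),
    ]

    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in train:
        class_counts[label] += 1
        word_counts[label].update(text.split())

    vocab = set(w for c in word_counts.values() for w in c)

    def log_posterior(text, label, k=1):
        """log P(label) + sum_w log P(w | label), with add-one smoothing."""
        lp = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for w in text.split():
            lp += math.log((word_counts[label][w] + k) / (total + k * len(vocab)))
        return lp

    def classify(text):
        return max(class_counts, key=lambda label: log_posterior(text, label))

    print(classify("free money"))     # spam
    print(classify("lunch at noon"))  # ham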
