Natural Language Processing with Python
CS372: Spring, 2015 Lecture 12 Categorizing and Tagging Words
Jong C. Park Department of Computer Science Korea Advanced Institute of Science and Technology
Using a Tagger
Tagged Corpora
Mapping Words to Properties Using Python Dictionaries
Automatic Tagging
N-Gram Tagging
Transformation-based Tagging
How to Determine the Category of a Word
2015-04-09 CS372: NLP with Python 2
dictionary data type
Other names for a dictionary are map, hashmap, hash, and associative array.
The mapping is from a “word” to some structured object.
pos is defined as an empty dictionary.
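A minimal sketch of this step: create the empty dictionary pos, add a few entries, and look one up. The words and tags are illustrative assumptions and may differ from those on the original slide.

```python
# Start with an empty dictionary, then add word -> part-of-speech entries.
pos = {}
pos['colorless'] = 'ADJ'   # illustrative word and tag
pos['ideas'] = 'N'
pos['sleep'] = 'V'
pos['furiously'] = 'ADV'

# Look up the value stored under a key.
tag = pos['ideas']
print(tag)
```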
We can employ the keys to retrieve values. Question:
If the dictionary is not big, we can simply inspect its contents by evaluating the variable pos.
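One way to inspect a dictionary, sketched with an assumed four-entry pos. Printing works for small dictionaries. For larger ones, listing the keys or testing membership is usually more practical.

```python
# Assumed sample dictionary for illustration.
pos = {'colorless': 'ADJ', 'ideas': 'N', 'sleep': 'V', 'furiously': 'ADV'}

# For a small dictionary, printing it shows every key-value pair.
print(pos)

# For a larger one, list the keys explicitly (sorted for readability).
keys = sorted(pos)
print(keys)

# Check membership before a lookup to avoid a KeyError.
has_green = 'green' in pos

# Iterate over the key-value pairs in sorted key order.
for word, tag in sorted(pos.items()):
    print(word + ":", tag)
```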
When we look something up in a dictionary, we get only one value for each key.
However, there is a way of storing multiple values in one entry.
We may use a list value, e.g., pos['sleep'] = ['N',
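A short sketch of the idea. The example on the slide is cut off, so the second tag in the list below is an assumption chosen for illustration.

```python
pos = {'sleep': 'V'}

# Assigning to an existing key overwrites the old value;
# it does not add a second value.
pos['sleep'] = 'N'
single = pos['sleep']

# To keep several tags for one word, store a list as the value.
# The contents of this list are assumed, not taken from the slide.
pos['sleep'] = ['N', 'V']
print(pos['sleep'])
```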
We can use the same key-value pair format to create a dictionary.
Dictionary keys must be immutable types, such as strings and tuples.
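The following sketch builds a dictionary directly from key-value pairs and shows what happens when a mutable list is used as a key. The words, the tuple key, and the 'VP' label are assumed examples.

```python
# Two equivalent ways to build a dictionary from key-value pairs.
pos = {'colorless': 'ADJ', 'ideas': 'N'}
pos_alt = dict(colorless='ADJ', ideas='N')

# Tuples are immutable, so they can serve as keys (an assumed word-pair key).
pair_label = {('sleep', 'furiously'): 'VP'}

# Lists are mutable, so using one as a key raises TypeError.
try:
    bad = {['ideas', 'blogs']: 'N'}
    error = None
except TypeError as exc:
    error = exc
print(error)
```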
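If we try to access a key that is not in a dictionary, we get an error.
Since Python 2.5, a special kind of dictionary called a defaultdict has been available.
When we access a non-existent entry in a defaultdict, it is automatically added to the dictionary.
The default value is created by a type such as int, float, str, list, dict, or tuple.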
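A minimal sketch contrasting a plain dict with collections.defaultdict. The words and counts are assumed values for illustration.

```python
from collections import defaultdict

# A plain dict raises KeyError for a missing key.
plain = {}
try:
    plain['ideas']
    raised = False
except KeyError:
    raised = True

# defaultdict(int) supplies int() == 0 for a missing key and stores it.
frequency = defaultdict(int)
frequency['colorless'] = 4
missing_count = frequency['ideas']   # 'ideas' is now a key with value 0

# defaultdict(list) supplies a fresh empty list for a missing key.
pos = defaultdict(list)
pos['sleep'].append('N')
pos['sleep'].append('V')
empty = pos['ideas']                 # 'ideas' is now a key with value []
```

Note that merely reading a missing key inserts it, so a membership test with `in` is the safer choice when the dictionary should not grow.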
We can replace low frequency words with a special “out of vocabulary” token.
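One possible way to do this with a default dictionary, sketched on a tiny made-up word list rather than a real corpus. The cutoff n and the token name 'UNK' are assumptions.

```python
from collections import Counter, defaultdict

# Tiny assumed word list standing in for a real corpus.
words = "the cat sat on the mat and the dog sat on the log".split()

# Keep the n most frequent words; n is an arbitrary choice for this sketch.
n = 3
vocab = [word for word, _ in Counter(words).most_common(n)]

# Words in the vocabulary map to themselves; everything else maps to 'UNK'.
mapping = defaultdict(lambda: 'UNK')
for word in vocab:
    mapping[word] = word

replaced = [mapping[word] for word in words]
print(replaced)
```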
itemgetter(n) returns a function that can be called on a sequence object to obtain its element at index n (counting from zero).
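A short sketch of operator.itemgetter, first on a single pair and then as a sort key for ordering dictionary items by value. The tags and counts are assumed values.

```python
from operator import itemgetter

# itemgetter(n) builds a function that picks out index n.
pair = ('NN', 22226)            # assumed (tag, count) pair
first = itemgetter(0)(pair)
second = itemgetter(1)(pair)

# Common use: sort (key, value) pairs by value, largest first.
counts = {'N': 5, 'V': 3, 'ADJ': 8}   # assumed counts
by_count = sorted(counts.items(), key=itemgetter(1), reverse=True)
tags = [tag for tag, _ in by_count]
print(tags)
```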
>>> from nltk.corpus import brown
>>> brown_tagged_sents = brown.tagged_sents(categories='news')
>>> brown_sents = brown.sents(categories='news')
Pros?