Text Classification and Sentiment Analysis, Alejandro Moreo, AFIRM



SLIDE 1

Text Classification and Sentiment Analysis

Alejandro Moreo

AFIRM

16th January 2019

Alejandro Moreo Text Classification and Sentiment Analysis

SLIDE 2

Overview

The Toolkit: scikit-learn
The Environment: Jupyter
Guided Exercise: topic classification
Hands-on Activities: sentiment classification
Concluding Remarks

SLIDE 3

The Toolkit

SLIDE 4

The Toolkit

SLIDE 5

Plan of the Hands-on activities

We will explore scikit-learn’s tools for text analysis and text mining, which implement the most important methods described in the lectures.

Guided exercise: text classification by topic

Loading datasets: 20 Newsgroups
Data preprocessing: n-gram extraction, stop-word removal, and stemming with NLTK
Corpus representation: tf-idf vector representation
Learning a classifier: Support Vector Machines
Testing and evaluation of results
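The steps of the guided exercise can be sketched as a single scikit-learn pipeline. This is a minimal sketch, not the notebook used in the session: the toy corpus and its two topic labels are hypothetical stand-ins for 20 Newsgroups (which `sklearn.datasets.fetch_20newsgroups` would download), stemming with NLTK is left out to keep the example dependency-free, and the vectorizer settings are illustrative choices.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus standing in for 20 Newsgroups.
# With network access, fetch_20newsgroups(subset="train") and
# fetch_20newsgroups(subset="test") provide the real documents and labels.
train_texts = [
    "the spacecraft reached orbit around the moon",
    "nasa launched a new rocket into orbit",
    "the shuttle crew repaired the orbiting telescope",
    "the goalie stopped every shot in the hockey game",
    "the team won the hockey playoffs last night",
    "a late goal decided the hockey match",
]
train_labels = ["space", "space", "space", "hockey", "hockey", "hockey"]
test_texts = [
    "a rocket carried the telescope into orbit",
    "the hockey team scored a goal",
]
test_labels = ["space", "hockey"]

# Preprocessing and representation: English stop-word removal,
# unigrams plus bigrams, tf-idf weighting. Learner: a linear SVM.
classifier = make_pipeline(
    TfidfVectorizer(stop_words="english", ngram_range=(1, 2)),
    LinearSVC(),
)
classifier.fit(train_texts, train_labels)

# Test and evaluation: predict held-out documents and score them.
predictions = classifier.predict(test_texts)
macro_f1 = f1_score(test_labels, predictions, average="macro")
print(list(predictions), macro_f1)
```

Wrapping the vectorizer and the classifier in one pipeline means the vocabulary and idf statistics are learned from the training documents only, so the test documents do not leak into the representation.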

SLIDE 6

Exercises

The participants will create and optimize their own sentiment classifier. Concretely, we will explore:

1. Feature Selection: χ²-based filtering
2. Weighting Functions: binary, tf, tf-idf, ...
3. Parameter Optimization: get the most out of the classifier
4. Comparing Learners: Logistic regression, k-NN, Naive Bayes, ...
5. Competition!
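One way to try the first four items together is a grid search over a pipeline, sketched below under stated assumptions. The eight review snippets and their labels are hypothetical toy data, the grid values (`k`, `n_neighbors`, `cv=2`) are small only so the example runs on that data, and mapping "binary" and "tf" to `CountVectorizer` variants is one possible reading of the weighting functions rather than the setup used in the session.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy sentiment data: 1 = positive, 0 = negative.
texts = [
    "a wonderful film with a great cast and a moving story",
    "great acting and a wonderful soundtrack, I loved it",
    "I loved the story, a truly great movie",
    "wonderful direction, great pacing, loved every minute",
    "a terrible film with a boring plot and awful acting",
    "boring story and terrible dialogue, I hated it",
    "I hated the plot, a truly awful movie",
    "awful direction, boring pacing, hated every minute",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# The initial steps are placeholders; the grid below swaps them out.
pipeline = Pipeline([
    ("vectorizer", TfidfVectorizer()),
    ("selector", SelectKBest(chi2)),
    ("learner", LogisticRegression()),
])

param_grid = {
    # Weighting functions: binary, tf (raw counts), tf-idf.
    "vectorizer": [CountVectorizer(binary=True), CountVectorizer(), TfidfVectorizer()],
    # Chi-square filtering: keep the 5 best-scoring terms, or keep all.
    "selector__k": [5, "all"],
    # Learners to compare.
    "learner": [
        LogisticRegression(max_iter=1000),
        KNeighborsClassifier(n_neighbors=3),
        MultinomialNB(),
    ],
}

# Parameter optimization: cross-validated search over every combination.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(texts, labels)
print(search.best_params_, search.best_score_)
```

On a real dataset the same grid would typically use more folds, larger values of `k`, and learner-specific hyperparameters such as the regularization strength `C`.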

... let’s get started!
