SEMANTIC CLUSTERING OF QUESTIONS: RESEARCH REPORT, 2ND SEMESTER

SLIDE 1

SEMANTIC CLUSTERING OF QUESTIONS

RESEARCH REPORT, 2ND SEMESTER

Cristina Groapă

SLIDE 2

Problem statement

- Part of the Smart Presentation project
- Efficient management of audience feedback
- Question clustering:
  - Suggest similar, previously asked questions
  - Group all questions according to topic
- Important: real-time process
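
The two clustering goals above (suggesting similar questions and grouping by topic as questions arrive) can be illustrated with a minimal incremental sketch. This is not the project's implementation: the class `OnlineClusterer`, the threshold value, and the token-overlap (Jaccard) similarity are all assumptions chosen for illustration, with Jaccard standing in for a real semantic similarity measure.

```python
def tokens(question):
    """Lowercase word set; a crude stand-in for real NLP preprocessing."""
    return {w.strip("?.,!:;").lower() for w in question.split()} - {""}


def jaccard(a, b):
    """Placeholder similarity: token overlap, NOT a semantic measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class OnlineClusterer:
    """Hypothetical incremental clusterer: one pass per incoming question."""

    def __init__(self, threshold=0.4):
        self.threshold = threshold  # assumed cut-off, would need tuning
        self.clusters = []          # each cluster is a list of questions

    def add(self, question):
        """Return (cluster index, similar questions asked earlier)."""
        q = tokens(question)
        best_idx, best_sim = None, 0.0
        for idx, members in enumerate(self.clusters):
            # Single-link: compare against the closest member of the cluster.
            sim = max(jaccard(q, tokens(m)) for m in members)
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx is not None and best_sim >= self.threshold:
            similar = list(self.clusters[best_idx])
            self.clusters[best_idx].append(question)
            return best_idx, similar
        # No cluster is close enough: start a new topic.
        self.clusters.append([question])
        return len(self.clusters) - 1, []
```

Each call to `add` handles one incoming question, so the cost per question grows with the number of questions already stored; that matters for the real-time requirement.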

SLIDE 3

Specificity

- Specificity = Information Content
- E.g. {collie, sheepdog} vs. {go, be}
- Evaluation:
  - Taxonomy depth
  - Corpus-based
- Combine with measures of semantic similarity for better results
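
As a rough illustration of the two evaluation options above, the sketch below computes taxonomy depth and a corpus-based information content, IC(c) = -log p(c), where p(c) counts a concept together with all of its descendants. The taxonomy and the counts are made up for the example; they do not come from WordNet or from any real corpus.

```python
import math

# Hypothetical toy taxonomy (child -> parent) and made-up corpus counts.
PARENT = {
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "collie": "dog",
    "sheepdog": "dog",
}
COUNTS = {"entity": 50, "animal": 20, "dog": 20, "cat": 5, "collie": 2, "sheepdog": 3}


def depth(concept):
    """Number of edges from the concept up to the taxonomy root."""
    d = 0
    while concept in PARENT:
        concept = PARENT[concept]
        d += 1
    return d


def cumulative_counts():
    """Add each concept's count to itself and to every ancestor."""
    totals = dict.fromkeys(COUNTS, 0)
    for concept, n in COUNTS.items():
        node = concept
        while node is not None:
            totals[node] += n
            node = PARENT.get(node)
    return totals


def information_content(concept):
    """IC(c) = -log p(c), with p(c) estimated from cumulative counts."""
    totals = cumulative_counts()
    return -math.log(totals[concept] / totals["entity"])
```

With these toy numbers, "collie" is both deeper in the taxonomy and higher in information content than "dog" or "animal", which matches the intuition that specific terms carry more information than general ones.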

SLIDE 4

Semantic Similarity Measures

- Path-based
  - Leacock-Chodorow:
- IC-based
  - Resnik:
- Semantic relatedness
  - Hirst and St-Onge:
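
The formulas for these measures did not survive in this transcript. For reference, the commonly cited forms of the three measures are given below; that the slide used this exact notation is an assumption.

```latex
\begin{align*}
\mathrm{sim}_{LCh}(c_1, c_2) &= -\log \frac{\mathrm{len}(c_1, c_2)}{2D} \\
\mathrm{sim}_{Res}(c_1, c_2) &= \mathrm{IC}\bigl(\mathrm{lcs}(c_1, c_2)\bigr)
                              = -\log p\bigl(\mathrm{lcs}(c_1, c_2)\bigr) \\
\mathrm{rel}_{HS}(c_1, c_2)  &= C - \mathrm{len}(c_1, c_2) - k \cdot d
\end{align*}
```

Here len(c1, c2) is the length of the shortest path between the two concepts, D is the maximum depth of the taxonomy, lcs(c1, c2) is their least common subsumer, p(c) is the probability of encountering an instance of concept c in a corpus, d is the number of changes of direction along the path, and C and k are constants.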

SLIDE 5

NLP Tools

- Stanford CoreNLP
- LingPipe
- Java WordNet::Similarity

SLIDE 6

Implementation

SLIDE 7

Results

- 143 questions: ~8 min (dual-core 2 GHz processor, 3 GB RAM)
- Good:
- Bad:

SLIDE 8

Results (2)

- Good and bad:

SLIDE 9

Future work

- Test on real data
- Give named entities (NERs) more weight than common nouns
- Introduce specificity
- Word Sense Disambiguation

SLIDE 10

Thank you

- Questions?