
CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews


  1. CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews • Hu Xu, University of Illinois at Chicago • Sihong Xie, Lehigh University • Lei Shu, University of Illinois at Chicago • Philip S. Yu, University of Illinois at Chicago, Tsinghua University • BigData ’16

  2. My Black Friday Experience

  3. I have a case and want to add a new GPU.

  4. This is what a compatible GPU should look like.

  5. I found a 1070 GPU with a good price on Newegg.

  6. Unfortunately, the GPU is too long to fit in… and non-refundable.

  7. In the end, I damaged my case a little…

  8. What’s a better way to avoid this?

  9. So we need to identify the fact that the GPU does not like some cases…

  10. Roadmaps • Preliminary: Complementary Entity Recognition • Method Overview • Basic Recognition via Dependency Paths • Knowledge Expansion on a Large Amount of Reviews • Experiments • Conclusions

  11. Sentiment Analysis on Reviews (Liu, 2012) • Product reviews contain a huge amount of information about first-hand user experiences in a sadly unstructured text format. • Aspect-level sentiment analysis on product reviews is a key task to understand customers’ opinions on opinion targets: products and aspects (features) of products. • We focus on complementary entities in reviews.

  12. What’s an Entity? • Something that has separate and distinct existence and objective or conceptual reality. (Merriam-Webster) • We are interested in entities related to products. • Named Entity • e.g., Samsung Galaxy S6, Microsoft Surface • General Entity • e.g., tablet, cellphone, computer, etc.

  13. What’s a Complementary Entity? • Customers also express their opinions on a relation between a reviewed product and another product. • One relation type is the complementary relation: two products (entities) should work together. • Definition: • target entity: the reviewed product; • complementary entity: the related product in a complementary relation. • Example: • This card works with my phone.

  14. A Few Examples

  15. A Few Examples with Opinions

  16. Complementary Entity Recognition (CER) • Extract complementary entities from sentences of reviews. (The target entities can be obtained from the product titles of the reviewed products.) • e.g., extract phone from “It works with my phone.” • Differences from Named Entity Recognition (NER): • Including general entities: e.g., case, phone. • Context dependent • e.g., “It works with my iPhone 7.” vs “I like my iPhone 7.” • We only focus on CER in this paper since sentiment classification is an independent task and requires different techniques.

  17. Roadmaps • Preliminary: Complementary Entity Recognition • Method Overview • Basic CER via Dependency Paths • Knowledge Expansion on a Large Amount of Reviews • Experiments • Conclusions

  18. Method Overview • We propose an unsupervised method with two components: • Basic CER via Dependency Paths • extract complementary entities from review sentences. • Knowledge Expansion on a Large Amount of Reviews • improve the precision by using the knowledge expanded on a large amount of reviews • the contexts of complementary entities can be noisy • high quality knowledge can reduce such noise via high precision paths.

  19. Method Overview (diagram) • Labels: Domain Reviews, Knowledge Expansion, Dependency Paths for CER, Test Review, Complementary Entity

  20. Noisy Context of Complementary Entity • Taking a micro SD card as the target entity for example: • Similar context can be used for other purposes: • It works in my phone. • It works in practice. • It works in airplane mode. • The verbs used in the context of complementary entities are unlimited and domain related: • It works with my phone. • I use it with my phone. • I insert this card into my phone. • This card like my phone.

  21. Knowledge Expansion • We introduce two kinds of knowledge to help filter out noise: • Candidate Complementary Entities: • e.g., Samsung Galaxy S6, MS Surface, phone, tablet, etc. for micro SD card • Domain-Specific Verbs: • e.g., use, work, fit, insert for micro SD card • We observe that products under the same category share similar context knowledge. • We group reviews under the same category together in case the number of reviews for a specific product is limited.
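
The filtering idea on slides 20 and 21 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes candidates arrive as (verb lemma, entity) pairs from some upstream extractor, the function name `filter_candidates` is hypothetical, and the knowledge sets simply reuse the words listed on the slide. The slides do not say how the two kinds of knowledge are combined, so requiring both to match is an assumption of this sketch.

```python
# Hypothetical knowledge for the "micro SD card" category, using words from the slides.
CANDIDATE_ENTITIES = {"samsung galaxy s6", "ms surface", "phone", "tablet"}
DOMAIN_VERBS = {"use", "work", "fit", "insert"}


def filter_candidates(candidates, entities=CANDIDATE_ENTITIES, verbs=DOMAIN_VERBS):
    """Keep (verb_lemma, entity) pairs supported by both kinds of knowledge.

    Requiring both the verb and the entity to be known is an assumption of
    this sketch; the slides do not specify the combination rule.
    """
    kept = []
    for verb, entity in candidates:
        if verb.lower() in verbs and entity.lower() in entities:
            kept.append((verb, entity))
    return kept


raw = [
    ("work", "phone"),          # It works in my phone.
    ("work", "practice"),       # It works in practice.
    ("work", "airplane mode"),  # It works in airplane mode.
    ("insert", "phone"),        # I insert this card into my phone.
]
print(filter_candidates(raw))
```

With this toy input, the two "phone" pairs survive while "practice" and "airplane mode" are dropped, which is the kind of noise slide 20 describes.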

  22. Roadmaps • Preliminary: Complementary Entity Recognition • Method Overview • Basic CER via Dependency Paths • Knowledge Expansion on a Large Amount of Reviews • Experiments • Conclusions

  23. Dependency Parsing (De Marneffe and Manning, 2008) • A sentence can be parsed into a tree structure with words as nodes and typed grammar relations as edges. • e.g., It works with my phone.
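
To make the tree structure concrete, here is a small sketch that stores a parse as typed edges. The parse of "It works with my phone." is written by hand in the style of collapsed Stanford typed dependencies; it is an assumption for illustration, and a real parser may produce different labels. The names `Edge`, `dependents`, and `find_root` are hypothetical.

```python
from collections import namedtuple

# One typed grammar relation: head word --relation--> dependent word.
Edge = namedtuple("Edge", ["head", "relation", "dependent"])

# Hand-written parse of "It works with my phone." (assumed, not parser output).
PARSE = [
    Edge("works", "nsubj", "It"),
    Edge("works", "prep_with", "phone"),
    Edge("phone", "poss", "my"),
]


def dependents(edges, head):
    """Return (relation, dependent) pairs for all edges leaving `head`."""
    return [(e.relation, e.dependent) for e in edges if e.head == head]


def find_root(edges):
    """In a well-formed tree, the root is the one head that is never a dependent."""
    heads = {e.head for e in edges}
    deps = {e.dependent for e in edges}
    (root,) = heads - deps
    return root


root = find_root(PARSE)
print(root, dependents(PARSE, root))
```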

  24. Dependency Path • Paths passing through the nodes that are complementary entities can be used to extract complementary entities.
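
A path-based extractor can be sketched like this. The actual paths used in the talk are shown only in a figure that did not survive in this transcript, so the single path below (an allowed verb linked to a word by `prep_with`) is a hypothetical stand-in, and the parses are hand-written assumptions rather than parser output.

```python
# Hand-written parses as (head_lemma, relation, dependent) triples (assumed).
PARSES = {
    "It works with my phone.": [
        ("work", "nsubj", "it"),
        ("work", "prep_with", "phone"),
        ("phone", "poss", "my"),
    ],
    "I like my phone.": [
        ("like", "nsubj", "i"),
        ("like", "dobj", "phone"),
        ("phone", "poss", "my"),
    ],
}

# Hypothetical path: a verb from `verbs` linked to a word by `relation`.
PATH = {"verbs": {"work", "fit"}, "relation": "prep_with"}


def extract_by_path(edges, path):
    """Return dependents reached from an allowed verb via the path's relation."""
    return [
        dep
        for head, rel, dep in edges
        if head in path["verbs"] and rel == path["relation"]
    ]


for sentence, edges in PARSES.items():
    print(sentence, "->", extract_by_path(edges, PATH))
```

The same word, phone, is extracted from the first sentence but not from the second, which mirrors the context dependence that separates CER from NER on slide 16.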

  25. Dependency Paths • These paths (e.g., Path 6) may have low precision due to noisy contexts; domain knowledge can help reduce such noise.

  26. Roadmaps • Preliminary: Complementary Entity Recognition • Method Overview • Basic CER via Dependency Paths • Knowledge Expansion on a Large Amount of Reviews • Experiments • Conclusions

  27. Knowledge Expansion • To improve CER’s precision, we use domain knowledge to filter out noise. • Similarly, we use dependency paths to extract high-quality knowledge from a large amount of reviews. • The dependency paths must be of high precision to ensure the quality of the knowledge. • We use only two general seed verbs: work and fit.

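  28. Knowledge Expansion (diagram) • Seed verbs: work, fit

  29. Knowledge Expansion (diagram) • Seed verbs: work, fit • Entities: tablet, phone, Samsung Galaxy S6, Microsoft Surface Pro 4

  30. Knowledge Expansion (diagram) • Seed verbs: work, fit • Entities: tablet, phone, Samsung Galaxy S6, Microsoft Surface Pro 4 • Verbs: work, use, insert, fit, like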

  31. Dependency Paths for Knowledge Expansion
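
The expansion shown on slides 27 to 30 can be sketched as a two-step bootstrap: seed verbs lead to candidate entities, and those entities lead to further verbs. This is a simplified sketch under stated assumptions, not the authors' procedure: the input is taken to be (verb lemma, entity) pairs already extracted from domain reviews by high-precision paths, the function name `expand_knowledge` is hypothetical, and any frequency thresholds or ranking the paper may apply are omitted.

```python
def expand_knowledge(pairs, seed_verbs=("work", "fit")):
    """Two-step expansion: seed verbs -> candidate entities -> domain verbs."""
    seeds = set(seed_verbs)
    # Step 1: entities seen with a seed verb become candidate complementary entities.
    entities = {entity for verb, entity in pairs if verb in seeds}
    # Step 2: verbs seen with those entities become domain-specific verbs.
    verbs = {verb for verb, entity in pairs if entity in entities}
    return entities, verbs | seeds


# Toy pairs built from the words on the slides (illustrative, not real data).
pairs = [
    ("work", "phone"),
    ("fit", "tablet"),
    ("work", "Samsung Galaxy S6"),
    ("fit", "Microsoft Surface Pro 4"),
    ("use", "phone"),
    ("insert", "tablet"),
    ("like", "phone"),
    ("return", "store"),  # never seen with a seed verb, so not expanded
]
entities, verbs = expand_knowledge(pairs)
print(sorted(entities))
print(sorted(verbs))
```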

  32. Roadmaps • Preliminary: Complementary Entity Recognition • Method Overview • Basic CER via Dependency Paths • Knowledge Expansion on a Large Amount of Reviews • Experiments • Conclusions

  33. Experiment Setting • Dataset: • We annotated 7 products for testing. • We collected 6000 reviews for each product category, used for knowledge expansion. • We compare 10 methods: • 2 Noun Phrase Chunkers, NER, CRF, Sceptre, “My” Entity Path, CER, CER1K+, CER3K+, CER6K+.

  34. Experiment Results • CER6K+ performs best: • F1-score is more than 70%. • CER3K+ (with 3000 domain reviews) is already good enough. • Other baselines are not designed for this task.
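
For readers unfamiliar with the metric, the F1-score combines precision and recall of the extracted entities. The sketch below computes it over sets of entity strings; the set-level matching and the toy data are assumptions for illustration and are not the paper's evaluation protocol or results.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall, and F1 for predicted vs. gold entity sets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (not from the paper): two of three predictions are correct.
gold = {"phone", "tablet", "camera"}
predicted = {"phone", "tablet", "practice"}
p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```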

  35. Experiment Results

  36. Domain Knowledge

  37. Roadmaps • Preliminary: Complementary Entity Recognition (CER) • Method Overview • Basic Recognition via Dependency Paths • Knowledge Expansion on a Large Amount of Reviews • Experiments • Conclusions

  38. Conclusions • We introduce a novel task, Complementary Entity Recognition (CER), and an unsupervised method for it. • We utilize big data to expand domain knowledge and use the domain knowledge to improve the performance of recognition. • Future work includes: • sentiment classification for complementary entities; • automatic knowledge accumulation from data.

  39. Q&A • The annotated dataset can be found at: • https://www.cs.uic.edu/~hxu/CER_dataset.html • For details, please see the original paper: • Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu, CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews, IEEE International Conference on Big Data 2016, Washington D.C., Dec 5-8, 2016.
