  1. Understanding Short Texts ACL 2016 Tutorial Zhongyuan Wang (Microsoft Research) Haixun Wang (Facebook Inc.) Tutorial Website: http://www.wangzhongyuan.com/tutorial/ACL2016/Understanding-Short-Texts/

  2. Outline • Part 1: Challenges • Part 2: Explicit representation • Part 3: Implicit representation • Part 4: Conclusion

  3. Short Text • Search Query • Document Title • Ad keyword • Caption • Anchor text • Question • Image Tag • Tweet/Weibo

  4. Challenges • First, short texts contain limited context. [Figure: distribution of query lengths, (a) by traffic and (b) by number of distinct queries; the bulk of queries contain only a few words. Based on Bing query log between 06/01/2016 and 06/30/2016.]

  5. Challenges • Second, short texts are “telegraphic”: no word order, no function words, no capitalization, … The query “Distance between Sun and Earth” can also be expressed as:
  • "how far" earth sun • "how far" sun • "how far" sun earth
  • average distance earth sun • average distance from earth to sun • average distance from the earth to the sun
  • distance between earth & sun • distance between earth and sun • distance between earth and the sun • distance between earth sun • distance between sun and earth
  • distance from earth to the sun • distance from sun to earth • distance from sun to the earth • distance from the earth to the sun • distance from the sun to earth • distance from the sun to the earth • distance of earth from sun
  • how far away is the sun • how far earth from sun • how far earth is from the sun • how far from earth is the sun • how far from earth to sun • how far from the earth to the sun
  • sun from earth
  [Hang Li, “Learning to Match for Natural Language Processing and Information Retrieval”]
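Many of these variants collapse once word order and function words are ignored, which is what the telegraphic style implies. A minimal Python sketch (the stopword list is ad hoc, chosen only for this example):

```python
# Crude canonical form for telegraphic queries: lowercase, drop function
# words, ignore word order. The stopword list is ad hoc for this example.
STOPWORDS = {"the", "to", "from", "of", "is", "a", "an", "and", "&", "between"}

def normalize(query: str) -> frozenset:
    return frozenset(w for w in query.lower().split() if w not in STOPWORDS)

variants = [
    "distance between sun and earth",
    "distance from the earth to the sun",
    "distance of earth from sun",
]
print({normalize(q) for q in variants})
# -> a single canonical form: frozenset({'distance', 'earth', 'sun'})
# Variants like "how far away is the sun" still normalize differently,
# which is why term-level tricks alone are not enough (see the next slide).
```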

  6. Challenges • Second, “telegraphic”: no word order, no function words, no capitalization, …
  Short Text 1                  | Short Text 2               | Term Match | Semantic Match
  china kong (actor)            | china hong kong            | partial    | no
  hot dog                       | dog hot                    | yes        | no
  the big apple tour            | new york tour              | almost no  | yes
  Berlin                        | Germany capital            | no         | yes
  DNN tool                      | deep neural network tool   | almost no  | yes
  wedding band                  | band for wedding           | partial    | no
  why are windows so expensive  | why are macs so expensive  | partial    | no
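A toy term-overlap scorer makes the table's point concrete: bag-of-words overlap (here the Jaccard coefficient) is perfect for "hot dog" vs. "dog hot" even though the pair means different things, and low for "the big apple tour" vs. "new york tour" even though the pair means the same thing. A minimal sketch using pairs from the table above:

```python
# Bag-of-words term match vs. (human-judged) semantic match.
def jaccard(a: str, b: str) -> float:
    """Term overlap between two short texts, ignoring word order."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

pairs = [
    ("hot dog", "dog hot"),                  # term overlap 1.00, semantic match: no
    ("the big apple tour", "new york tour"), # term overlap 0.17, semantic match: yes
    ("berlin", "germany capital"),           # term overlap 0.00, semantic match: yes
]

for a, b in pairs:
    print(f"{a!r} vs {b!r}: term overlap = {jaccard(a, b):.2f}")
```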

  7. Challenges • Third, short texts are sparse, noisy, and ambiguous. Example: the query “watch for kids” is ambiguous (a watch for children vs. watching over children). [Figure: three example results i)-iii); caption: “It’s not a fair trade!”]

  8. Short Text Understanding • Many applications • Search engines • Automatic question answering • Online advertising • Recommendation systems • Conversational bots • … • Traditional NLP approaches are not sufficient

  9. The big question • Humans are far more powerful than machines at understanding short texts. • Our minds build rich models of the world and make strong generalizations from input data that is sparse, noisy, and ambiguous, in many ways far too limited to support the inferences we make. • How do we do it?

  10. “If the mind goes beyond the data given, another source of information must make up the difference.” [Tenenbaum et al., Science 331, 1279 (2011)]

  11. Two kinds of representations: • Explicit (Logic) Representation: symbolic knowledge • Implicit (Embedding) Representation: distributional semantics

  12. Explicit Knowledge Representation • First, understand superlatives (“tallest,” “largest,” etc.) and ordered items. So you can ask: “Who are the tallest Mavericks players?” “What are the largest cities in Texas?” “What are the largest cities in Iowa by area?” • Second, have you ever wondered about a particular point in time? Google now does a much better job of understanding questions with dates in them. So you can ask: “What was the population of Singapore in 1965?” “What songs did Taylor Swift record in 2014?” “What was the Royals roster in 2013?” • Finally, Google starts to understand some complex combinations. So Google can now respond to questions like: “What are some of Seth Gabel's father-in-law's movies?” “What was the U.S. population when Bernie Sanders was born?” “Who was the U.S. President when the Angels won the World Series?” http://insidesearch.blogspot.com/2015/11/the-google-app-now-understands-you.html

  13. Explicit Knowledge Representation
  • Vector Representation
    • ESA: mapping text to Wikipedia article titles
    • Conceptualization: mapping text (search queries, anchor text, tweets, ads keywords, …) to a concept space of millions of concepts used in day-to-day communication; a probabilistic model that estimates P(concept | short text)
  • Logic Representation (first-order logic)
    • Freebase, Google Knowledge Graph, …
    • Statements over a domain evaluate to True or False
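As a rough sketch of the computation behind P(concept | short text): combine per-term concept distributions over the whole input. The three-entry concept table below is hypothetical (a real inventory such as Probase covers millions of concepts), and the sketch simply averages per-term distributions; full conceptualization systems additionally disambiguate using concept co-occurrence.

```python
from collections import defaultdict

# Hypothetical per-term concept distributions; these entries are made up
# for illustration, real inventories contain millions of concepts.
P_CONCEPT_GIVEN_TERM = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "pie":   {"food": 0.9, "chart": 0.1},
    "ipad":  {"device": 0.9, "accessory": 0.1},
}

def conceptualize(short_text):
    """Average per-term concept distributions to score P(concept | short text)."""
    scores = defaultdict(float)
    terms = [t for t in short_text.lower().split() if t in P_CONCEPT_GIVEN_TERM]
    for t in terms:
        for concept, p in P_CONCEPT_GIVEN_TERM[t].items():
            scores[concept] += p / len(terms)
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))

print(conceptualize("apple pie"))   # 'food' and 'fruit' senses rank highest
print(conceptualize("apple ipad"))  # 'device' sense ranks highest
```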

  14. Explicit Knowledge Representation (vector and logic representations as above)
  Pros:
  • The results are easy for human beings to understand
  • Easy to tune and customize
  Cons:
  • Coverage/sparseness: the model can't handle unseen terms/entities/relations
  • Model size: usually very large

  15. Implicit Knowledge Representation: Embedding
  • word2vec (https://code.google.com/p/word2vec/) - input units: words; training size: > 100B tokens; vocabulary: > 2M; predict model. [Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. Proceedings of Workshop at ICLR, 2013.]
  • GloVe - input units: words; training size: > 42B tokens; vocabulary: > 400K; count + predict model. [J. Pennington, R. Socher, C. D. Manning. GloVe: Global Vectors for Word Representation. EMNLP 2014.]
  • DSSM (Deep Structured Semantic Model) - input units: tri-letters; training size: ~20B clicks (Bing + IE log); vocabulary: 30K; parameters: ~10M. [Huang, Po-Sen, et al. Learning deep structured semantic models for web search using clickthrough data. CIKM, ACM, 2013.]
  • CW08 (SENNA) - input units: words; vocabulary: 130K. [Collobert, Ronan, et al. Natural language processing (almost) from scratch. Journal of Machine Learning Research 12 (2011): 2493-2537.]
  • KNET - input units: word sequences (Freebase).
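DSSM's "tri-letter" input units refer to letter-trigram word hashing: each word is wrapped in boundary markers and decomposed into overlapping three-letter units, which keeps the vocabulary small (~30K trigrams) and gives even unseen words a non-trivial representation. A minimal sketch:

```python
def letter_trigrams(word: str) -> list:
    """Letter-trigram word hashing as in DSSM: wrap the word in boundary
    markers ('#') and emit all overlapping 3-character windows."""
    w = f"#{word.lower()}#"
    return [w[i:i + 3] for i in range(len(w) - 2)]

print(letter_trigrams("good"))      # ['#go', 'goo', 'ood', 'od#']
print(letter_trigrams("goodness"))  # shares trigrams with 'good', so an
                                    # unseen word still gets a related encoding
```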

  16. Implicit Knowledge Representation: Embedding (models as above)
  Pros:
  • Dense semantic encoding
  • A good representation framework
  • Facilitates computation (similarity measures)
  Cons:
  • Performs poorly for rare words and new words
  • Missing relations (e.g., isA, isPropertyOf)
  • Hard to tune, since the representation is not natural for human beings
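The "facilitates computation" pro and the "rare/new words" con both show up in the most common use of embeddings for short text: average the word vectors and compare by cosine similarity. A minimal numpy sketch with toy, hand-set 3-d vectors (real embeddings are trained and typically 100-1000 dimensional):

```python
import numpy as np

# Toy, hand-set 3-d vectors; real embeddings are trained, not hand-set.
EMB = {
    "the":   np.array([0.3, 0.3, 0.3]),
    "big":   np.array([0.9, 0.1, 0.0]),
    "apple": np.array([0.1, 0.8, 0.3]),
    "new":   np.array([0.8, 0.2, 0.1]),
    "york":  np.array([0.2, 0.7, 0.4]),
    "tour":  np.array([0.0, 0.3, 0.9]),
}

def embed(text):
    """Average the vectors of in-vocabulary words. OOV words are silently
    dropped, which is exactly the rare/new-word weakness noted above."""
    vecs = [EMB[w] for w in text.lower().split() if w in EMB]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

# Pro: similarity is one dot product away, and word order no longer matters.
print(cosine(embed("the big apple tour"), embed("new york tour")))  # high
# Con: "newest" and "tours" are OOV here, so only "york" contributes.
print(cosine(embed("newest york tours"), embed("new york tour")))
```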

  17. Implicit Knowledge Representation: DNN • Stanford Deep Autoencoder for Paraphrase Detection [Socher et al. 2011] • Stanford MV-RNN for Sentiment Analysis [Socher et al. 2012] • Facebook DeepText classifier [Zhang et al. 2015]
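None of these models fits in a few lines, but the flavor of character-level DNN text classifiers in the spirit of Zhang et al. 2015 can be sketched. The PyTorch toy below is drastically simplified (one convolution plus global max pooling instead of the paper's six convolutional layers over 1014-character inputs); the alphabet and all sizes are illustrative only.

```python
import torch
import torch.nn as nn

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "
MAX_LEN = 64

def one_hot(text: str) -> torch.Tensor:
    """Encode text as a (len(ALPHABET), MAX_LEN) one-hot matrix."""
    x = torch.zeros(len(ALPHABET), MAX_LEN)
    for i, ch in enumerate(text.lower()[:MAX_LEN]):
        j = ALPHABET.find(ch)
        if j >= 0:
            x[j, i] = 1.0
    return x

class TinyCharCNN(nn.Module):
    """One conv layer + global max pooling + linear classifier."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.conv = nn.Conv1d(len(ALPHABET), 32, kernel_size=5)
        self.pool = nn.AdaptiveMaxPool1d(1)   # global max over positions
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):                     # x: (batch, alphabet, MAX_LEN)
        h = torch.relu(self.conv(x))
        return self.fc(self.pool(h).squeeze(-1))

model = TinyCharCNN()
batch = torch.stack([one_hot("watch for kids"), one_hot("cheap flights to berlin")])
print(model(batch).shape)  # torch.Size([2, 2]) -- untrained logits
```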

  18. New Trend: Fusion of Explicit and Implicit Knowledge
  • Teach: the explicit (logic) representation, built on symbolic knowledge, passes relationships and rules of inference to the implicit (embedding) representation, built on distributional semantics
  • Learn: the embedding side learns more similar rules, enriching the logic representation
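One way to picture the "teach" direction is a toy sketch (not any specific published method): known symbolic rules, e.g. isA(apple, fruit), add a penalty to the embedding objective, pulling each entity vector toward its concept vector. A real system would optimize this jointly with the usual distributional loss; here only the rule penalty is minimized so the effect is visible.

```python
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=8) for w in ["apple", "banana", "fruit"]}

# Symbolic rules the embedding should respect (the toy "teach" signal).
ISA_RULES = [("apple", "fruit"), ("banana", "fruit")]

def rule_loss():
    """Squared distance between each entity and its concept."""
    return sum(np.sum((emb[e] - emb[c]) ** 2) for e, c in ISA_RULES)

# Plain gradient descent on the rule penalty alone.
lr = 0.05
for _ in range(100):
    for e, c in ISA_RULES:
        g = 2 * (emb[e] - emb[c])  # gradient of ||e - c||^2 w.r.t. e
        emb[e] -= lr * g
        emb[c] += lr * g           # concept gradient has the opposite sign

print(rule_loss())  # near 0: entities have been pulled toward their concept
```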
