
CQARank: Jointly Model Topics and Expertise in Community Question Answering
Liu Yang, Minghui Qiu, Swapna Gottipati, Feida Zhu, Jing Jiang, Huiping Sun, Zhong Chen
Peking University, Singapore Management University

Community Question Answering
• Open platforms for sharing expertise
• Large repositories of valuable knowledge
CIKM2013

Existing CQA Mechanism Challenges
• Poor expertise matching
• Low-quality answers
• Under-utilized archived questions
• Fundamental question: how to model topics and expertise in CQA sites

Motivation
• A case study of Stack Overflow
[Figure labels: Question, Tag, Vote, User, Answer]

Motivation
• Propose a principled approach to jointly model topics and expertise in CQA
  – No one is an expert in all topical interests
  – Each new question should be routed to answerers interested in related topics with the right level of expertise
• Achieve a better understanding of both user topical interest and expertise by leveraging tagging and voting information
  – Tags are important user-generated category information of Q&A posts
  – Votes indicate a CQA community's long-term review result for a given user's expertise under a specific topic

Roadmap
• Motivation
• Related Work
• Our Method
  – Method Overview
  – Topic Expertise Model
  – CQARank
• Experiments
• Summary

Related Work
• Link Analysis
  – HITS (Jurczyk and Agichtein, CIKM07)
  – Expertise Rank and Z-score (Zhang et al., WWW07)
  – Finds global experts without modeling user interests
• Latent Topical Analysis
  – UQA Model (Guo et al., CIKM08)
  – Fails to capture to what extent users' expertise matches questions with similar topical interest
• Topic Sensitive PageRank
  – TwitterRank (Weng et al., WSDM10)
  – Topic-sensitive probabilistic model for expert finding (Zhou et al., CIKM12)

Method Overview
• Concepts
  – Topical Interest
  – Topical Expertise
  – Q&A Graph
• Our Approach
  – Topic Expertise Model (TEM)
  – CQARank, which combines the learning results from TEM with link analysis of the Q&A graph

Method Overview
• CQARank Recommendation Framework
[Figure: CQARank recommendation framework]

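Topic Expertise Model
[Figure: plate diagram of the Topic Expertise Model (TEM)]
• $\theta_u$: user-specific topic distribution (prior $\alpha$; $U$ in total)
• $\phi_{k,u}$: user topical expertise distribution (prior $\beta$; $K \times U$ in total)
• $\varphi_k$, $\psi_k$: topic-specific word and tag distributions ($K$ each; hyperparameters $\gamma$, $\eta$)
• $\mu_e$, $\Sigma_e$: expertise-specific vote distribution ($E$ in total; hyperparameters $\mu_0$, $k_0$, $\alpha_0$, $\beta_0$)
• $U$: # of users
• $N_u$: # of posts
• $L_{u,n}$: # of words
• $P_{u,n}$: # of tags
• $z$: topic label
• $e$: expertise label
• $v$: a vote
• $w$: a word
• $t$: a tag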

CQARank
• CQARank combines the textual content learning result of TEM with link analysis to enforce user topical expertise learning
• Construct the Q&A graph $G = (V, E)$
  – $V$ is a set of nodes representing users
  – $E$ is a set of directed edges from the asker to the answerer
    • $e = (u_i, u_j)$, $u_i \in V$, $u_j \in V$
    • Weight $W_{ij}$ is the number of all answers given by $u_j$ to questions of $u_i$
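The graph construction above can be sketched in a few lines. This is only an illustration, assuming the answers are available as (asker, answerer) pairs; the function name `build_qa_graph` and the toy user names are hypothetical and do not come from the slides.

```python
from collections import defaultdict

def build_qa_graph(answers):
    """Build the weighted Q&A graph from (asker, answerer) pairs.

    Each pair stands for one answer: the edge runs from the asker to the
    answerer, and the weight counts how many such answers exist.
    """
    weights = defaultdict(lambda: defaultdict(int))
    for asker, answerer in answers:
        weights[asker][answerer] += 1
    # Convert to plain dicts so that lookups do not create missing keys.
    return {asker: dict(targets) for asker, targets in weights.items()}

# Hypothetical toy data: one (asker, answerer) pair per answer.
toy_answers = [("alice", "bob"), ("alice", "bob"), ("alice", "carol"), ("bob", "carol")]
W = build_qa_graph(toy_answers)
print(W)
```

A nested dictionary keeps the graph sparse, which matters because most pairs of users never interact.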

CQARank
• For each topic $z$, the transition probability from asker $u_i$ to answerer $u_j$ is defined as
  – $P_z(i \to j) = \dfrac{W_{ij} \cdot sim_z(i \to j)}{\sum_{k=1}^{|V|} W_{ik} \cdot sim_z(i \to k)}$ if $\sum_m W_{im} \neq 0$
  – $P_z(i \to j) = 0$ otherwise
• $sim_z(i \to j)$ is the similarity between $u_i$ and $u_j$ under topic $z$, which is defined as
  – $sim_z(i \to j) = 1 - |\theta'_{iz} - \theta'_{jz}|$
• The row-normalized transition matrix $\mathbf{M}$ is defined as
  – $\mathbf{M}_{ij} = P_z(i \to j)$
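A minimal sketch of the transition matrix for a single topic follows. It assumes users are indexed 0..n-1, that `W` is a dense list of lists, and that `theta_z[i]` holds the value $\theta'_{iz}$ in [0, 1]. The function name and the extra guard against an all-zero similarity row are assumptions, not part of the slides.

```python
def topic_transition_matrix(W, theta_z):
    """Row-normalized transition matrix M for one topic z.

    W[i][j]    : number of answers user j gave to questions of user i.
    theta_z[i] : the value theta'_{iz} for user i (assumed to lie in [0, 1]).
    """
    n = len(W)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Unnormalized scores W_ik * sim_z(i -> k) for every candidate k.
        scores = [W[i][k] * (1.0 - abs(theta_z[i] - theta_z[k])) for k in range(n)]
        total = sum(scores)
        # The slide's condition is sum_m W_im != 0; total > 0 is an added guard.
        if sum(W[i]) != 0 and total > 0:
            M[i] = [s / total for s in scores]
        # Otherwise the row stays all zeros (the "otherwise" case).
    return M

# Hypothetical toy graph with three users.
W = [[0, 2, 1], [0, 0, 1], [0, 0, 0]]
theta_z = [0.2, 0.6, 0.3]
M = topic_transition_matrix(W, theta_z)
```

Users who never asked a question keep an all-zero row, matching the second case of the definition.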

CQARank
• Given topic $z$, the CQARank saliency score of $u_i$ is computed based on the following formula:
  – $\mathbf{R}_z(u_i) = \lambda \sum_{j: u_j \to u_i} \mathbf{R}_z(u_j) \cdot \mathbf{M}_{ij} + (1 - \lambda) \cdot \theta_{u_i z} \cdot \mathbf{E}(z, u_i)$
  – $\mathbf{E}(z, u_i)$ is the estimated expertise score of $u_i$ under topic $z$, which is defined as the expectation of the user topical expertise distribution learnt by TEM: $\mathbf{E}(z, u_i) = \sum_e \phi_{z,u_i,e} \cdot \mu_e$
  – $\lambda \in [0, 1]$ is a parameter to control the probability of the teleportation operation
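The update can be iterated to a fixed point, as in PageRank. The sketch below makes several assumptions: score flows along edges from asker $u_j$ to answerer $u_i$, so it reads the transition probability as `M[j][i]`; it starts from the teleportation vector; and it runs a fixed number of iterations. The function names and toy values are hypothetical.

```python
def expected_expertise(phi_zu, mu):
    """E(z, u) = sum_e phi_{z,u,e} * mu_e for one topic z and one user u."""
    return sum(p * m for p, m in zip(phi_zu, mu))

def cqarank(M, theta_z, expertise_z, lam=0.2, iterations=100):
    """Iterate the CQARank update for one topic z.

    M[j][i]        : transition probability P_z(j -> i) from asker j to answerer i.
    theta_z[i]     : topical interest of user i in topic z.
    expertise_z[i] : E(z, u_i), the expected expertise of user i under topic z.
    lam            : the lambda of the formula (0.2 in the reported experiments).
    """
    n = len(M)
    teleport = [theta_z[i] * expertise_z[i] for i in range(n)]
    scores = teleport[:]  # assumed starting point; the slides do not give one
    for _ in range(iterations):
        scores = [
            lam * sum(scores[j] * M[j][i] for j in range(n)) + (1.0 - lam) * teleport[i]
            for i in range(n)
        ]
    return scores

# Hypothetical toy input: three users, one topic.
M = [[0.0, 0.5, 0.5], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
R = cqarank(M, theta_z=[0.2, 0.6, 0.3], expertise_z=[1.0, 2.0, 3.0])
```

Because the teleportation term is weighted by both interest and expertise, a user with no incoming answers edges still receives a topic-specific score.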

Experiments
• Stack Overflow Data Set
  – All Q&A posts in three months (May 1st to August 1st, 2009)
  – Training data: 8,904 questions and 96,629 answers posted by 663 users (10,689 unique tags and 135 unique votes)
  – Testing data: 1,173 questions and 9,883 answers
• Data Preprocessing
  – Tokenize text and discard all code snippets
  – Remove stop words and HTML tags in text
• Parameter Settings
  – $K = 15$, $E = 10$, $\alpha = 50/K$, $\beta = 0.01$, $\gamma = 0.01$, $\eta = 0.001$, $\lambda = 0.2$
  – Normal-Gamma parameters
  – 500 iterations of Gibbs sampling
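The preprocessing steps could look roughly like the sketch below. It assumes post bodies are HTML and that code snippets are wrapped in `<pre>` or `<code>` tags. The stop-word list is a tiny illustrative one, and the tokenizer pattern and the function name `preprocess` are assumptions, since the slides do not specify any of them.

```python
import html
import re

# A tiny illustrative stop-word list; the slides do not say which list was used.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "in", "and", "how", "do", "i"}

def preprocess(post_html):
    """Turn the HTML body of a post into a list of content tokens."""
    # Discard code snippets, assumed here to be wrapped in <pre> or <code> tags.
    text = re.sub(r"<(pre|code)\b[^>]*>.*?</\1>", " ", post_html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Remove the remaining HTML tags and decode entities such as &amp;.
    text = html.unescape(re.sub(r"<[^>]+>", " ", text))
    # Tokenize into lowercase word-like tokens.
    tokens = re.findall(r"[a-z][a-z0-9+#]*", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

tokens = preprocess("<p>How do I sort a list in Python?</p><pre><code>xs.sort()</code></pre>")
print(tokens)
```

Code is removed before tag stripping so that identifiers inside snippets never reach the tokenizer.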

TEM Results
• Topic Analysis - topic tags
  – Top tags provide phrase-level features to distill richer topic information

TEM Results
• Topic Analysis - topic words
  – Top words have a strong correlation with top tags under the same topic

TEM Results
• Expertise Analysis
  – TEM learns different user expertise levels by clustering votes using a GMM component
  – 10 Gaussian distributions with various means for the generation of votes in the data
  – The higher the mean is, the lower the precision is

Recommend Expert Users
• Task
  – Given a new question $q$ and a set of users $\mathbf{U}$, rank users by their interests and expertise to answer question $q$
  – Recommendation score function: $S(u, q) = Sim(u, q) \cdot Expert(u, q) = (1 - JS(\theta_u, \theta_q)) \cdot \sum_z \theta_{q,z} \cdot Expert(u, z)$
  – $\theta_{q,z}$ is the estimated posterior topic distribution of question $q$: $\theta_{q,z} \propto p(z \mid \mathbf{w}_q, \mathbf{t}_q, u) \propto p(z \mid u)\, p(\mathbf{w}_q \mid z)\, p(\mathbf{t}_q \mid z) = \theta_{u,z} \prod_{w: \mathbf{w}_q} \varphi(z, w) \prod_{t: \mathbf{t}_q} \psi(z, t)$
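The score function can be sketched as below. The sketch assumes all probabilities are smoothed (strictly positive) and that JS divergence uses base-2 logarithms so it stays in [0, 1]. It takes the user in the posterior to be the one the formula conditions on, which the slide leaves open. All function names and toy values are hypothetical.

```python
import math

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def question_topics(theta_u, phi, psi, words, tags):
    """Posterior topic distribution theta_q of a question.

    theta_u[z] : topic distribution of the user the posterior conditions on.
    phi[z][w]  : probability of word w under topic z (assumed > 0).
    psi[z][t]  : probability of tag t under topic z (assumed > 0).
    """
    log_post = []
    for z in range(len(theta_u)):
        lp = math.log(theta_u[z])
        lp += sum(math.log(phi[z][w]) for w in words)
        lp += sum(math.log(psi[z][t]) for t in tags)
        log_post.append(lp)
    top = max(log_post)  # subtract the max before exponentiating for stability
    return normalize([math.exp(lp - top) for lp in log_post])

def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def recommendation_score(theta_u, theta_q, expert_u):
    """S(u, q) = (1 - JS(theta_u, theta_q)) * sum_z theta_q[z] * Expert(u, z)."""
    expertise = sum(tq * ex for tq, ex in zip(theta_q, expert_u))
    return (1.0 - js_divergence(theta_u, theta_q)) * expertise

# Hypothetical two-topic toy model.
phi = [{"sort": 0.8, "css": 0.2}, {"sort": 0.2, "css": 0.8}]
psi = [{"python": 0.9, "html": 0.1}, {"python": 0.1, "html": 0.9}]
theta_q = question_topics([0.5, 0.5], phi, psi, words=["sort"], tags=["python"])
score = recommendation_score([0.5, 0.5], [0.5, 0.5], expert_u=[2.0, 4.0])
```

Working in log space avoids underflow when a question contains many words.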

Recommend Expert Users
• Our method
  – CQARank
• Baselines
  Link analysis methods
  – In Degree (ID)
  – PageRank (PR)
  Probabilistic generative models
  – TEM (part of our method)
  – UQA (Guo et al., CIKM08)
  Combined link analysis and topic model
  – Topic Sensitive PageRank (TSPR) (Zhou et al., CIKM12)

Recommend Expert Users
• Evaluation Criteria
  – Ground truth: user rank list by average votes for answering $q$
  – Metrics: $nDCG$, Pearson/Kendall correlation coefficients
• Results
[Figure: expert recommendation results]
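Two of the listed metrics can be computed as sketched below. The slides do not say which nDCG variant or which Kendall coefficient was used; this sketch assumes linear gains with a log2 rank discount and the tie-uncorrected tau-a. Function names and the toy votes are hypothetical.

```python
import math

def dcg(gains):
    """Discounted cumulative gain of a list of gains in ranked order."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

def ndcg(ranked_users, true_gain):
    """nDCG of a predicted ranking against ground-truth gains (e.g. average votes)."""
    gains = [true_gain[u] for u in ranked_users]
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

def kendall_tau(xs, ys):
    """Kendall's tau-a between two score lists (needs at least two items)."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical ground truth: average votes per user for one question.
votes = {"a": 3.0, "b": 2.0, "c": 1.0}
perfect = ndcg(["a", "b", "c"], votes)
reversed_order = ndcg(["c", "b", "a"], votes)
```

nDCG rewards placing high-vote users near the top, while the correlation coefficients compare the whole predicted ordering with the ground truth.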

Recommend Answers
• Task
  – Given a new question $q$ and a set of answers $\mathbf{A}$, rank all answers in $\mathbf{A}$
  – Recommendation score function: $S(a, q) = Sim(a, q) \cdot Expert(u, q) = (1 - JS(\theta_a, \theta_q)) \cdot \sum_z \theta_{q,z} \cdot Expert(u, z)$
• Baselines and evaluation criteria are the same as in the expert recommendation task
• We use each answer's vote to generate the ground truth rank list

Recommend Answers
• Results
[Figure: answer recommendation results]

Recommend Similar Questions
• When a user asks a new question (referred to as the query question), the user will often get replies with links to other similar questions
• Crawl 1,000 questions whose similar questions exist in the training data set as the query question set
• For each query question with $n$ similar questions, we randomly select another $m$ ($m = 1000$) questions from the training data set to form the candidate similar questions

Recommend Similar Questions
• All compared methods rank these $m + n$ candidate similar questions according to their similarity with the query question
• The higher the similar questions are ranked, the better the performance of the method
• The recommendation score is computed based on the JS-divergence between the topic distributions of the query question and the candidate similar questions

Recommend Similar Questions
• Baselines
  – TSPR (LDA), UQA, SimTag
• Evaluation Criteria
  – Precision@K, average rank of similar questions, mean reciprocal rank (MRR), cumulative distribution of ranks (CDR)
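The first three criteria are simple to compute from a ranked candidate list. The sketch below assumes each query comes with its ranked candidates and the set of truly similar questions; the function names and question IDs are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are truly similar questions."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is ranked."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def average_rank(ranked, relevant):
    """Mean 1-based rank of the relevant items that appear in the ranking."""
    ranks = [rank for rank, item in enumerate(ranked, start=1) if item in relevant]
    return sum(ranks) / len(ranks)

def mean_reciprocal_rank(queries):
    """MRR over (ranked, relevant) pairs, one pair per query question."""
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)

# Hypothetical ranking for one query question with two similar questions.
ranked = ["q7", "q2", "q9", "q4"]
relevant = {"q2", "q4"}
mrr = mean_reciprocal_rank([(ranked, relevant), (["q1"], {"q1"})])
```

Lower average rank and higher Precision@K and MRR all indicate that similar questions are placed nearer the top of the candidate list.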

Parameter Sensitivity Analysis
• Performance of CQARank in expert user recommendation when varying the number of expertise levels ($E$) and topics ($K$)
[Figure: parameter sensitivity results]
