

  1. Improving Entity Recommendation with Search Log and Multi-Task Learning Jizhou Huang, Wei Zhang, Yaming Sun, Haifeng Wang, Ting Liu

  2. Outline • Motivation • Approach • Experiment

  3. Problem • Context-insensitive recommendations: * ⇒ ... ⇒ Chicago • Context-aware recommendations: Dreamgirls ⇒ Chicago • Context-aware entity recommendations are more relevant to a user's information need

  4. Task • Context-aware entity recommendation – Given a query q_t, its context C_t = {q_1, q_2, ..., q_{t-1}}, and a set of related entities E_t = {e_1, e_2, ..., e_n}, our task is to rank the entities in E_t based on the signals derived from both q_t and C_t • Examples
  C_t ⇒ q_t | Entity Recommendations
  Los Angeles travel guide ⇒ Chicago | New York City, California, San Francisco, Illinois
  American rock band ⇒ Chicago | The Doobie Brothers, The Beach Boys, Eagles, Cheap Trick
  Dreamgirls movie trailers ⇒ Chicago | Moulin Rouge, Cabaret, The Jazz Singer, Roxie Hart

  5. Challenges • Imbalanced entity click logs for ambiguous queries – Recommendations built from such logs cannot cover all of a query's possible intents – Sufficient for the frequently asked meanings of such queries – Insufficient for the rarely asked meanings of such queries • There may be irrelevant in-session preceding queries – Not every preceding query addresses the same information need as the current query

  6. Outline • Motivation • Approach • Experiment

  7. Improved with Search Log and Multi-Task Learning • Method – We propose a multi-task DNN model that combines two tasks: entity recommendation (main task) and context-aware ranking (auxiliary task) • Key intuitions – The two tasks are closely related in Web search, and the representations of input queries and contexts can be naturally shared across them – We can take advantage of the large amounts of search logs in a multi-task learning framework to improve entity recommendation – The clicked documents are helpful in understanding the search intents behind a query under different contexts, which can benefit entity recommendation in a multi-task learning framework

  8. Multi-Task DNN Model
  [Architecture diagram] The preceding queries q_1, ..., q_{t-1} (the context c) and the current query q_t are each encoded with a BiLSTM; the context representations are combined by an attention-based weighted average into v_c, which is concatenated with the query representation v_q and passed through an FC layer to form a shared query-and-context representation. Candidate documents d_i, d_j and candidate entities e_k, e_l are likewise encoded with BiLSTMs and FC layers. Cosine similarities between the shared representation and the document vectors yield P(d_i | c, q_t) for context-aware ranking in Web search (auxiliary task), and cosine similarities with the entity vectors yield P(e_k | c, q_t) for context-aware entity recommendation (main task). The query and context representations are shared across the two tasks.
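The context-aggregation step above can be sketched in NumPy as follows. This is a minimal sketch, not the authors' implementation: the dot-product attention scoring and the plain concatenation are simplifying assumptions, since the slide does not specify the attention function or the FC layers.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of attention scores.
    e = np.exp(x - x.max())
    return e / e.sum()

def merge_query_and_context(v_q, context_vecs):
    """Attention-weighted average of preceding-query vectors, concatenated
    with the current-query vector (assumed dot-product attention; the FC
    layers of the real model are omitted)."""
    scores = np.array([v.dot(v_q) for v in context_vecs])
    weights = softmax(scores)
    v_c = (weights[:, None] * np.stack(context_vecs)).sum(axis=0)
    return np.concatenate([v_c, v_q])

def cosine(a, b):
    # Cosine similarity used to compare the merged representation
    # with document/entity vectors.
    return a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

In this sketch, preceding queries that are more similar to the current query receive higher attention weight, which matches the intuition that irrelevant in-session queries should contribute less.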

  9. Training • Objective – Minimize the negative log likelihood of the clicked results: L = − Σ_{(c, q, r) ∈ D} log p(r | c, q) • Training algorithm
  1: Initialize model Θ randomly
  2: for iteration in 1 ··· N do
  3:   Randomly select a task T (context-aware ranking or entity recommendation)
  4:   Select a random training example for task T
  5:   Compute loss for task T
  6:   Compute gradient ∇(Θ)
  7:   Update Θ by taking a gradient step with ∇(Θ)
  8: end for
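The alternating training loop can be sketched as follows; `model_update` is a hypothetical placeholder for the per-task loss, gradient, and parameter update (steps 5–7), which the slide leaves to the underlying DNN.

```python
import random

def train_multitask(model_update, tasks, num_iterations, seed=0):
    """Alternating multi-task training: each iteration picks one task at
    random, draws one of its training examples, and takes a gradient step
    on the shared parameters.

    tasks        -- maps a task name to its list of training examples
    model_update -- placeholder callable: computes loss/gradient for the
                    chosen (task, example) and updates the shared model
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(num_iterations):
        task = rng.choice(list(tasks))       # step 3: pick a task T
        example = rng.choice(tasks[task])    # step 4: pick an example
        model_update(task, example)          # steps 5-7: loss, grad, update
        schedule.append(task)
    return schedule
```

Because both tasks update the same shared query/context encoder, each step on the auxiliary ranking task also refines the representations used by the main entity-recommendation task.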

  10. Outline • Motivation • Approach • Experiment

  11. Entity Ranking • Ranking – Use the trained model to compute a score between (c, q) and each e ∈ E: P(e | c, q) = cos(v_m, v_e) = (v_m · v_e) / (∥v_m∥ ∥v_e∥), where v_m is the merged query-and-context representation and v_e is the entity representation • Two ways of using the score to rank entities – As an individual ranking model – As a feature in a baseline learning-to-rank framework
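A minimal sketch of this scoring-and-ranking step; the vectors here are hypothetical stand-ins for the representations the trained model would produce.

```python
import math

def cosine(a, b):
    # cos(v_m, v_e) = (v_m . v_e) / (||v_m|| ||v_e||)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank_entities(v_m, entity_vecs):
    """Rank candidate entities by cosine similarity to the merged
    query-and-context representation v_m."""
    scored = [(name, cosine(v_m, v)) for name, v in entity_vecs.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)
```

The same score can either order the candidates directly (individual ranking model) or be appended to the feature vector of a learning-to-rank baseline, as the slide describes.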

  12. As an Individual Ranking Model
  MBR – A nearest-neighbors collaborative filtering method proposed by [Fernández-Tobías and Blanco, 2016]
  ER – A context-insensitive DNN model that only considers the current query in generating entity recommendations
  ER-C – A single-task DNN model that only uses the entity click logs to train an entity recommendation model
  ER-C-MT – The proposed multi-task DNN model
  [Fernández-Tobías and Blanco, 2016] Memory-based recommendations of entities for Web search users. In CIKM 2016.

  13. As a Feature in a Learning to Rank Framework
  LTR – A context-insensitive baseline comprising a set of non-contextual features for entity recommendation
  LTR-ER – Trained with all LTR features plus the similarity feature computed by ER
  LTR-ER-C – Trained with all LTR features plus the similarity feature computed by ER-C
  LTR-ER-C-MT – Trained with all LTR features plus the similarity feature computed by ER-C-MT

  14. Data &amp; Evaluation Metric • Training – Context-aware ranking: 26,426,495 examples of (C_i, q_i, D_i), where D_i consists of one clicked document d⁺ and L randomly-sampled non-clicked documents d⁻_{j=1,...,L} – Entity recommendation: 8,821,550 examples of (C_i, q_i, E_i), where E_i consists of one clicked entity e⁺ and M randomly-sampled non-clicked entities e⁻_{j=1,...,M} • Test – 8,402,881 examples of (C_t, q_t, E_t) • Evaluation metric – NDCG
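The construction of one training example, a clicked item plus randomly sampled non-clicked items, might be sketched as follows; the field names are illustrative, not from the paper.

```python
import random

def build_example(context, query, clicked, candidate_pool, num_negatives, rng):
    """Pair the clicked item with num_negatives randomly sampled
    non-clicked items (L for documents, M for entities).
    The clicked item is placed first, so its index is the label."""
    negatives = rng.sample([c for c in candidate_pool if c != clicked],
                           num_negatives)
    return {"context": context,
            "query": query,
            "candidates": [clicked] + negatives,
            "label": 0}  # index of the clicked item among the candidates
```

In practice the candidates would typically be shuffled before training; they are kept in place here so the label is easy to inspect.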

  15. As an Individual Ranking Model
  [Chart: NDCG@1, NDCG@5, and NDCG@10 for MBR, ER, ER-C, and ER-C-MT; ER-C-MT scores highest at every cutoff (0.0216, 0.0504, and 0.0710, respectively)]
  • ER-C-MT vs. baselines – Highest performance – The most effective in ranking entities for this task
  • ER-C-MT and ER-C vs. ER – Both outperform ER – Preceding queries are useful for improving entity recommendation
  • ER-C-MT vs. ER-C – ER-C-MT outperforms ER-C – Learning the model in a multi-task learning setting can bring further improvements
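The NDCG@k values reported on this and the next slide can be computed with a standard formulation; the deck does not state which DCG variant was used, so this sketch assumes the common logarithmic-discount form.

```python
import math

def ndcg_at_k(relevances, k):
    """NDCG@k for one ranked list of graded relevances
    (position 0 is the top of the ranking)."""
    def dcg(rels):
        # DCG with logarithmic position discount: sum r_i / log2(i + 2)
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0
```

A perfect ranking scores 1.0; placing the only relevant item lower in the list discounts the score logarithmically with its position.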

  16. As a Feature in a Learning to Rank Framework
  [Chart: NDCG@1, NDCG@5, and NDCG@10 for LTR, LTR-ER, LTR-ER-C, and LTR-ER-C-MT; LTR-ER-C-MT scores highest at every cutoff (0.1461, 0.2438, and 0.2834, respectively)]
  • LTR and LTR-ER vs. LTR-ER-C and LTR-ER-C-MT – The latter two outperform the former two – Context information can significantly help to improve the performance of entity recommendation
  • LTR-ER-C-MT vs. others – Highest performance – The performance of entity recommendation can be significantly improved through search logs and multi-task learning

  17. Examples
  Query: A Song of Ice and Fire | Context: Maisie Williams ⇒ Rose Leslie
  LTR: Westworld, Game of Thrones, House of Cards, Nip/Tuck, Frozen
  LTR-ER-C-MT: Isaac Hempstead-Wright, Carice van Houten, Iwan Rheon, Liam Cunningham, Peter Dinklage
  Query: Florence | Context: Soccer Players ⇒ Roberto Baggio
  LTR: Vatican City, Pompeii, Rome, Metropolitan City of Florence, San Gimignano
  LTR-ER-C-MT: A.C. Milan, A.S. Roma, Inter Milan, A.C. ChievoVerona, Real Madrid C.F.

  18. Conclusions • We study the problem of context-aware entity recommendation – We propose a multi-task DNN model by leveraging Web search logs to improve entity recommendation – We evaluate our approach using large-scale, real-world search logs of a widely used commercial Web search engine • Our proposed method is effective for this task – The experiments show that our method significantly outperforms several strong baselines – The experiments also demonstrate that context information can significantly improve the performance of entity recommendation

  19. Thank you! huangjizhou@baidu.com

  20. Baidu Maps AI Algorithm Team – Intern Recruitment Position: algorithm intern Areas: data mining, natural language processing, knowledge graphs Through this internship, you can: 1) conduct in-depth research on AI algorithms using real-world spatio-temporal big data and publish papers at top conferences; 2) learn and practice how to use Baidu Maps' spatio-temporal big data and cutting-edge Baidu AIG technologies to build innovative AI-driven products that transform the user experience; 3) receive multi-on-one mentoring from Baidu technical experts with rich academic and engineering experience (an academic mentor plus an engineering-practice mentor) Location: Beijing Duration: at least 6 months Basic requirements: research experience in any of data mining, natural language processing, or knowledge graphs Interested students, please send your resume and application email to: huangjizhou@baidu.com
