SLIDE 1

Topic Modeling using Topics from Many Domains, Lifelong Learning and Big Data

Zhiyuan Chen and Bing Liu
University of Illinois at Chicago
liub@cs.uic.edu

SLIDE 2

Introduction

• Topic models, such as LDA (Blei et al., 2003), pLSA (Hofmann, 1999), and their variants, are widely used to discover topics in documents.
• Fully unsupervised models are often insufficient, because their objective functions may not correlate well with human judgments (Chang et al., 2009).
• Knowledge-based topic models (KBTM) do better (Andrzejewski et al., 2009; Mukherjee & Liu, 2012; etc.).
• But they are not automatic: they need user-given prior knowledge for each domain.

SLIDE 3

How to Improve Further?

• We can invent better topic models. But how about: learn like humans?
• What we learn in the past helps future learning.
• Whenever we see a new situation, we almost know it already; few aspects are really new. It shares a lot with what we have seen in the past.
• (a systems approach)

SLIDE 4

Take a Major Step Forward

• Knowledge-based modeling is still traditional:
  • the knowledge is provided by the user and assumed correct;
  • it is not automatic (each domain needs new knowledge from the user).
• Question: can we mine prior knowledge systematically and automatically?
• Answer: yes, with big data (many domains).
• Implication: learn forever; past learning results help future learning → lifelong learning.
SLIDE 5

Why? (An Example from Opinion Mining)

• Topic overlap across domains: although every domain is different, there is a fair amount of topic overlap across domains, e.g.,
  • every product review domain has the topic price,
  • most electronic products share the topic battery,
  • some products share the topic screen.
• If we have good topics from a large number of past domain collections (big data), then for a new collection we can use the existing topics to generate high-quality prior knowledge automatically.

SLIDE 6

An Example

• We have reviews from 3 domains, and each domain gives a topic about price:
  • Domain 1: {price, color, cost, life}
  • Domain 2: {cost, picture, price, expensive}
  • Domain 3: {price, money, customer, expensive}
• Mining quality knowledge: require words to appear in at least two domains. We get {price, cost} and {price, expensive}; each set is likely to belong to the same topic (see the sketch below).
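A minimal sketch of this mining step in Python (the names are illustrative; the paper mines frequent itemsets in general, while this sketch checks word pairs only):

```python
from itertools import combinations

# The three price topics from the slide (top words only).
domain_topics = [
    {"price", "color", "cost", "life"},           # Domain 1
    {"cost", "picture", "price", "expensive"},    # Domain 2
    {"price", "money", "customer", "expensive"},  # Domain 3
]

def mine_pk_sets(topics, min_support=2):
    """Return word pairs that appear together in >= min_support topics."""
    vocab = sorted(set().union(*topics))
    return [set(pair)
            for pair in combinations(vocab, 2)
            if sum(pair[0] in t and pair[1] in t for t in topics) >= min_support]

print(mine_pk_sets(domain_topics))
# -> [{'cost', 'price'}, {'expensive', 'price'}]
```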

SLIDE 7

Run a KBTM: an Example (cont.)

• If we run a KBTM on the reviews of Domain 1, we may find a new topic about price: {price, cost, expensive, color}.
• We now get 3 coherent words in the top 4 positions, rather than only 2 as in the old topic: {price, color, cost, life}.
• A good topic improvement.

SLIDE 8

Problem Statement

• Given a large set of document collections D = {D1, …, Dn}, learn from D to produce results S.
• Goal: given a new test collection Dt, learn from Dt with the help of S (and possibly D); Dt may or may not be in D (Dt ∈ D or Dt ∉ D).
• The results learned this way should be better than those learned without the guidance of S (and D).

SLIDE 9

LTM – Lifelong Topic Model

• Cold start (initialization):
  • Run LDA on each Di ∈ D => topics Si
  • S = ∪i Si
• Given a new domain collection Dt:
  • Run LDA on Dt => topics At
  • Find matching topics Mj from S for each topic aj ∈ At
  • Mine knowledge kj from each Mj; Kt = ∪j kj
  • Run a KBTM on Dt with the help of Kt => new At (the KBTM uses Kt and also deals with wrong knowledge in Kt)
  • Update S with At (the whole loop is sketched below)
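The loop above as a sketch; the helper names are placeholders for the components on the surrounding slides (`mine_pk_sets` is sketched under Slide 6, `match_topics` under Slide 12), not an API from the paper:

```python
def ltm_task(D_t, S):
    """One LTM modeling task on a new domain collection D_t.

    S is the accumulated topic base (p-topics from past domains).
    """
    A_t = run_lda(D_t)                    # initial topics for D_t
    K_t = []                              # prior knowledge for this task
    for a_j in A_t:
        M_j = match_topics(a_j, S)        # matching p-topics from S
        K_t.extend(mine_pk_sets(M_j))     # pk-sets mined from M_j
    A_t = run_kbtm(D_t, K_t)              # KBTM guided by K_t; must also
                                          # handle wrong knowledge in K_t
    S.extend(A_t)                         # update the topic base
    return A_t
```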

SLIDE 10

Prior Topic Generation (cold start)

• Run LDA on each Di ∈ D to produce a set of topics Si, called prior topics (or p-topics).
SLIDE 11

LTM Topic Model

• (1) Mine prior knowledge (pk-sets); (2) use the prior knowledge to guide modeling.

SLIDE 12

Knowledge Mining Function

• Topic match: for each current topic aj ∈ At, find the set Mj of similar topics among the p-topics in S.
• Pattern mining: mine frequent itemsets from each Mj; the frequent itemsets are the prior-knowledge sets (pk-sets).
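A sketch of the matching step; Jaccard overlap of top-word sets and the 0.2 threshold are illustrative assumptions, not the paper's actual similarity measure:

```python
def match_topics(a_j, S, threshold=0.2):
    """Return the p-topics in S similar to the current topic a_j.

    Topics are top-word sets; similarity is Jaccard overlap
    (an illustrative choice with an arbitrary threshold).
    """
    return [p for p in S if len(a_j & p) / len(a_j | p) >= threshold]

# pk-sets for topic a_j, reusing the pair miner from Slide 6:
# k_j = mine_pk_sets(match_topics(a_j, S))
```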

SLIDE 13

Model Inference: Gibbs Sampling

• How to use the prior knowledge (pk-sets)? e.g., {price, cost} & {price, expensive}
• How to tackle wrong knowledge?
• Graphical model: same as LDA, but the model inference is different.
• Generalized Pólya Urn model (GPU) (Mimno et al., 2011)
• Idea: when assigning a topic t to a word w, also assign a fraction of t to the words that share a pk-set with w.

SLIDE 14

Dealing with Wrong Knowledge

• Some pieces of the automatically generated knowledge (pk-sets) may be wrong.
• Deal with them in sampling (deciding the promotion fraction), to ensure that the words in a pk-set {w, w'} are truly associated:

  𝔸_{w',w} = 1               if w = w'
  𝔸_{w',w} = μ · PMI(w, w')  if {w, w'} is a pk-set
  𝔸_{w',w} = 0               otherwise

  where PMI(w, w') = log [ P(w, w') / (P(w) P(w')) ].
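A sketch of building 𝔸. Here PMI is estimated from document co-occurrence in the current domain, negative PMI is clamped to zero so an unsupported pk-set gets no promotion, and μ is a tunable scale; these specifics are assumptions of the sketch:

```python
import math
from collections import Counter
from itertools import combinations

def build_A(vocab, pk_sets, docs, mu=0.3):
    """GPU promotion matrix A[w1][w2], kept sparse as a dict of dicts."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))        # doc frequency of w
    co = Counter(fs for d in docs
                 for fs in map(frozenset, combinations(sorted(set(d)), 2)))
    A = {w: {w: 1.0} for w in vocab}                     # A[w][w] = 1
    for s in pk_sets:
        w1, w2 = tuple(s)
        p12 = co[frozenset((w1, w2))] / n                # P(w1, w2)
        if p12 == 0:
            continue                                     # never co-occur here
        pmi = math.log(p12 / ((df[w1] / n) * (df[w2] / n)))
        A[w1][w2] = A[w2][w1] = mu * max(pmi, 0.0)       # clamp negative PMI
    return A
```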

SLIDE 15

Gibbs Sampler

P(z_i = t | z^{-i}, w, α, β) ∝
  (n^{-i}_{m,t} + α) / Σ_{t'=1}^{T} (n^{-i}_{m,t'} + α)
  × (Σ_{w'=1}^{V} 𝔸_{w',w_i} · n^{-i}_{t,w'} + β) / Σ_{v=1}^{V} (Σ_{w'=1}^{V} 𝔸_{w',v} · n^{-i}_{t,w'} + β)

where m is the document containing position i, n_{m,t} is the number of words in document m assigned to topic t, n_{t,w'} is the number of times word w' is assigned to topic t, and the superscript -i excludes the current position i.
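A direct numpy sketch of this conditional. The counts are assumed to already exclude position i, 𝔸 is a dense V×V array here, and recomputing the promoted counts at every step is deliberately naive (a real sampler would maintain them incrementally):

```python
import numpy as np

def sample_topic(w_i, n_mt, n_tw, A, alpha, beta, rng):
    """Sample z_i for one word position (all counts exclude position i).

    n_mt: (T,)  topic counts in the word's document m
    n_tw: (T,V) raw topic-word counts
    A:    (V,V) promotion matrix, A[w_prime, v]
    """
    T, V = n_tw.shape
    promoted = n_tw @ A        # promoted[t, v] = sum_w' A[w', v] * n_tw[t, w']
    doc_part = (n_mt + alpha) / (n_mt.sum() + T * alpha)
    word_part = (promoted[:, w_i] + beta) / (promoted.sum(axis=1) + V * beta)
    p = doc_part * word_part
    return rng.choice(T, p=p / p.sum())   # rng = np.random.default_rng()
```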

SLIDE 16

Evaluation

• We used review collections D from 50 domains; each domain has 1,000 reviews.
• Four domains with 10,000 reviews each were used for the large-data test.
• Two test settings were used to evaluate LTM, representing its two possible uses:
  • the test domain was seen before, i.e., Dt ∈ D;
  • the test domain was not seen before, i.e., Dt ∉ D.

SLIDE 17

Topic Coherence (Mimno et al., EMNLP-2011)
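For reference, the coherence of a topic's top words v1, …, vM (ranked by probability) in Mimno et al. (EMNLP-2011) is C = Σ_{m=2}^{M} Σ_{l=1}^{m-1} log [(D(vm, vl) + 1) / D(vl)], where D(·) is document (co-)frequency; a sketch:

```python
import math

def coherence(top_words, docs):
    """Mimno et al. (2011) topic coherence; higher (less negative) is better.

    top_words: the topic's ranked top words, assumed to occur in docs.
    docs: iterable of token lists.
    """
    doc_sets = [set(d) for d in docs]
    def D(*ws):  # number of documents containing all the given words
        return sum(all(w in s for w in ws) for s in doc_sets)
    return sum(math.log((D(v_m, v_l) + 1) / D(v_l))
               for m, v_m in enumerate(top_words)
               for v_l in top_words[:m])
```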

SLIDE 18

Topic Coherence on 4 Large Datasets

• Can LTM improve with larger data?

SLIDE 19

Split a Large Dataset into 10 Smaller Ones

• Here we use the data of only one domain.
• Better topic coherence and better efficiency (30%).

SLIDE 20

Summary

• Proposed a lifelong learning topic model, LTM.
• It keeps a large topic base S.
• For each new topic modeling task:
  • run LDA to generate a set of initial topics,
  • find matching old topics from S,
  • mine quality knowledge from the old topics,
  • use the knowledge to help generate better topics.
• With big data (from diverse domains), we can do what we could not do or have not done before.
