Never-Ending Language Learning Tom Mitchell, William Cohen, and - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Never-Ending Language Learning

Tom Mitchell, William Cohen, and Many Collaborators Carnegie Mellon University

slide-2
SLIDE 2
slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5

hard (underconstrained) semi-supervised learning problem

Key Idea 1: Coupled semi-supervised training of many functions

much easier (more constrained) semi-supervised learning problem

person

noun phrase

Dinesh R

slide-6
SLIDE 6

NP:

person

Type 1 Coupling: Co-Training, Multi-View Learning

Supervised training of 1 function: Minimize:

slide-7
SLIDE 7

NP:

person

Type 1 Coupling: Co-Training, Multi-View Learning

Coupled training of 2 functions: Minimize:

Anshul

slide-8
SLIDE 8

NP:

person

Type 1 Coupling: Co-Training, Multi-View Learning

[Blum & Mitchell, 98] [Dasgupta et al., 01] [Ganchev et al., 08] [Sridharan & Kakade, 08] [Wang & Zhou, ICML10]

slide-9
SLIDE 9

NP:

person

Type 1 Coupling: Co-Training, Multi-View Learning

[Blum & Mitchell, 98] [Dasgupta et al., 01] [Ganchev et al., 08] [Sridharan & Kakade, 08] [Wang & Zhou, ICML10]
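The "Minimize:" objectives on slides 6 and 7 are not reproduced in this transcript. As a minimal sketch of one common way to couple two functions (an assumption, not necessarily the formulation on the slides), the loss below adds a disagreement penalty on unlabeled examples to the supervised loss of each view. The names `log_loss`, `coupled_loss`, and `weight` are hypothetical.

```python
import math

def log_loss(p, y):
    """Negative log-likelihood of label y (0 or 1) under predicted probability p."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def coupled_loss(f1, f2, labeled, unlabeled, weight=1.0):
    """Supervised loss of both functions plus a disagreement penalty.

    labeled:   list of ((view1, view2), y) pairs
    unlabeled: list of (view1, view2) pairs
    f1, f2:    functions mapping one view of an NP to a probability of "person"
    """
    supervised = sum(log_loss(f1(v1), y) + log_loss(f2(v2), y)
                     for (v1, v2), y in labeled)
    # Assumed coupling term: squared disagreement between the two views.
    disagreement = sum((f1(v1) - f2(v2)) ** 2 for v1, v2 in unlabeled)
    return supervised + weight * disagreement
```

With `weight=0` this reduces to training the two functions separately; a positive weight lets unlabeled noun phrases constrain both functions at once.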

slide-10
SLIDE 10

team person

NP:

athlete coach sport

NP text context distribution NP morphology NP HTML contexts

Multi-view, Multi-Task Coupling

[Blum & Mitchell, 98] [Dasgupta et al., 01] [Ganchev et al., 08] [Sridharan & Kakade, 08] [Wang & Zhou, ICML10] [Taskar et al., 2009] [Carlson et al., 2009]

athlete(NP) → person(NP)
athlete(NP) → NOT sport(NP)
NOT athlete(NP) ← sport(NP)

Rishab
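The subset and mutual-exclusion constraints between categories can be illustrated with a small checker. This is a sketch under assumed data structures, not NELL's implementation; `SUBSET_OF`, `MUTUALLY_EXCLUSIVE`, and `violations` are hypothetical names, and only the two constraints shown on the slide are encoded.

```python
# Hypothetical encoding of the constraints on the slide.
SUBSET_OF = [("athlete", "person")]          # athlete(NP) -> person(NP)
MUTUALLY_EXCLUSIVE = [("athlete", "sport")]  # athlete(NP) -> NOT sport(NP)

def violations(labels):
    """Return the constraints violated by the categories assigned to one NP."""
    found = []
    for sub, sup in SUBSET_OF:
        if sub in labels and sup not in labels:
            found.append(f"{sub} -> {sup}")
    for a, b in MUTUALLY_EXCLUSIVE:
        if a in labels and b in labels:
            found.append(f"{a} -> NOT {b}")
    return found

print(violations({"athlete", "person"}))  # no constraint is violated
print(violations({"athlete", "sport"}))   # violates both constraints
```

A joint labeling that violates a constraint is evidence that at least one of the coupled classifiers is wrong, which is what makes the coupled problem more constrained than learning each category alone.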

slide-11
SLIDE 11

Relations between NP1 and NP2: coachesTeam(c,t), playsForTeam(a,t), teamPlaysSport(t,s), playsSport(a,s)

Type 3 Coupling: Relation Argument Types

slide-12
SLIDE 12

Relations between NP1 and NP2: coachesTeam(c,t), playsForTeam(a,t), teamPlaysSport(t,s), playsSport(a,s)

Categories for each of NP1 and NP2: team, person, athlete, coach, sport

playsSport(NP1,NP2) → athlete(NP1), sport(NP2)

Type 3 Coupling: Relation Argument Types
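Argument-type coupling can be sketched as follows: believing a relation instance implies a category for each argument. The table `ARG_TYPES` and the function `propagate_types` are hypothetical names used only for illustration, and the example noun phrases are made up.

```python
# Hypothetical argument-type signature, taken from the rule on the slide.
ARG_TYPES = {"playsSport": ("athlete", "sport")}

def propagate_types(relation, np1, np2, categories):
    """Add the category memberships implied by believing relation(np1, np2)."""
    type1, type2 = ARG_TYPES[relation]
    categories.setdefault(np1, set()).add(type1)
    categories.setdefault(np2, set()).add(type2)
    return categories

cats = propagate_types("playsSport", "some athlete", "some sport", {})
print(cats)
```

The same table can be read in the other direction: a candidate playsSport(NP1, NP2) whose first argument is already believed not to be an athlete becomes less plausible.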

Over 2500 coupled functions in NELL

Happy Dinesh K.

slide-13
SLIDE 13

If coupled learning is the key, how can we get new coupling constraints?

slide-14
SLIDE 14

Key Idea 2: Discover New Coupling Constraints

  • learn Horn clause rules/constraints:
    – learned by data mining the knowledge base
    – connect previously uncoupled relation predicates
    – infer new unread beliefs
    – modified version of FOIL [Quinlan]

0.93 athletePlaysSport(?x,?y) ← athletePlaysForTeam(?x,?z), teamPlaysSport(?z,?y)

Barun
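To illustrate how a learned Horn clause yields new unread beliefs, the sketch below applies the example rule to a toy knowledge base by joining on the shared variable ?z. It is a simplification: `apply_rule` is a hypothetical helper, and attaching the rule's 0.93 score directly to each inferred belief is an assumption, since the slides do not say how rule confidence is combined with the confidence of the body facts.

```python
def apply_rule(plays_for_team, team_plays_sport, confidence=0.93):
    """Infer athletePlaysSport(x, y) from
    athletePlaysForTeam(x, z) and teamPlaysSport(z, y)."""
    inferred = {}
    for athlete, team in plays_for_team:
        for other_team, sport in team_plays_sport:
            if team == other_team:  # join on the shared variable ?z
                inferred[(athlete, sport)] = confidence
    return inferred

# Toy facts with placeholder names.
plays_for_team = [("athlete_a", "team_1"), ("athlete_b", "team_2")]
team_plays_sport = [("team_1", "sport_x")]
print(apply_rule(plays_for_team, team_plays_sport))
```

Only athlete_a receives an inferred sport here, because no teamPlaysSport fact is known for team_2.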

slide-15
SLIDE 15

Key Idea 3: Automatically extend ontology

slide-16
SLIDE 16

Dinesh R, Surag, Ankit, Barun, Dhruvin

Dinesh R.: only 62 new ontologies added
slide-17
SLIDE 17

Example Discovered Relations

Category Pair: MusicInstrument, Musician
  Frequent Instance Pairs: (sitar, George Harrison); (tenor sax, Stan Getz); (trombone, Tommy Dorsey); (vibes, Lionel Hampton)
  Text Contexts: "ARG1 master ARG2"; "ARG1 virtuoso ARG2"; "ARG1 legend ARG2"; "ARG2 plays ARG1"
  Suggested Name: Master

Category Pair: Disease, Disease
  Frequent Instance Pairs: (pinched nerve, herniated disk); (tennis elbow, tendonitis); (blepharospasm, dystonia)
  Text Contexts: "ARG1 is due to ARG2"; "ARG1 is caused by ARG2"
  Suggested Name: IsDueTo

Category Pair: CellType, Chemical
  Frequent Instance Pairs: (epithelial cells, surfactant); (neurons, serotonin); (mast cells, histomine)
  Text Contexts: "ARG1 that release ARG2"; "ARG2 releasing ARG1"
  Suggested Name: ThatRelease

Category Pair: Mammals, Plant
  Frequent Instance Pairs: (koala bears, eucalyptus); (sheep, grasses); (goats, saplings)
  Text Contexts: "ARG1 eat ARG2"; "ARG2 eating ARG1"
  Suggested Name: Eat

Category Pair: River, City
  Frequent Instance Pairs: (Seine, Paris); (Nile, Cairo); (Tiber river, Rome)
  Text Contexts: "ARG1 in heart of ARG2"; "ARG1 which flows through ARG2"
  Suggested Name: InHeartOf

[Mohamed et al. EMNLP 2011]

slide-18
SLIDE 18
slide-19
SLIDE 19

NELL Architecture

Knowledge Base (latent variables): Beliefs, Candidate Beliefs

Evidence Integrator

Components:

  • Text context patterns (CPL)
  • Orthographic classifier (CML)
  • URL specific HTML patterns (SEAL)
  • Actively search for web text (OpenEval)
  • Infer new beliefs from old (PRA)
  • Image classifier (NEIL)
  • Ontology extender (OntExt)
  • Human advice

slide-20
SLIDE 20

Haroun

slide-21
SLIDE 21

Evaluation

slide-22
SLIDE 22

NELL Is Improving Over Time (Jan 2010 to Nov 2014)

  • number of NELL beliefs vs. time: all beliefs (10’s of millions), high conf. beliefs (millions)
  • reading accuracy vs. time (average over 31 predicates): precision@10, mean avg. precision top 1000
  • human feedback vs. time (average 2.4 feedbacks per predicate per month)

slide-23
SLIDE 23

Limitations

  • Self reflection and an explicit agenda of learning subgoals: NELL suffers from the fact that it has a very weak ability to monitor its own performance and progress
  • Pervasive plasticity: NELL’s method for detecting noun phrases in text is a fixed procedure not open to learning and hence it runs the risk of reaching a performance plateau
  • Representation and reasoning: NELL lacks methods for representing and reasoning about time and space
  • Heavy reliance on the redundancy across the web: NELL’s redundancy-based reading methods tend to extract the most frequently-mentioned beliefs earlier.

slide-24
SLIDE 24

Other Limitations/Possible Improvements

  • No framework for forgetting previously learnt wrong relations [Anshul, Swarandeep]
  • Extension beyond simple Horn clauses [Anshul, Ankit]
  • Evaluation on the tail of the distribution [Happy]
  • Categorizing phrases/sentences into sarcastic/rhetorical questions [Happy]
  • Can NELL learn more and more new complex algorithms from simple algorithms? [Ankit]

slide-25
SLIDE 25

Other Limitations/Possible Improvements

  • Incorporating degrees of truth, variation of truth with time, and fuzzy categories [Anshul]
  • Word sense disambiguation module [Surag]
  • Reading over evolving domains such as Twitter [Dinesh K]
slide-26
SLIDE 26

Consistency and correctness

slide-27
SLIDE 27

Problem setting:

  • have N different estimates of target function
  • agreement between fi, fj: $a_{ij} = \Pr[f_i(x) = f_j(x)]$

Key insight: errors and agreement rates are related

[Platanios, Blum, Mitchell, UAI 2014]

$a_{ij} = \Pr[\text{neither makes error}] + \Pr[\text{both make error}] = 1 - e_i - e_j + 2 e_{ij}$

  • $a_{ij}$: prob. fi and fj agree
  • $e_i$: prob. fi error
  • $e_j$: prob. fj error
  • $e_{ij}$: prob. fi and fj both make error
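Agreement rates need no labels, only the predictions of each function on the same unlabeled examples. A minimal sketch, assuming predictions are stored as equal-length lists (the function name `agreement_rates` and the 0-based indexing are choices made here for illustration):

```python
from itertools import combinations

def agreement_rates(predictions):
    """predictions: list of equal-length label lists, one list per function.

    Returns {(i, j): fraction of examples on which functions i and j agree},
    with functions indexed from 0.
    """
    n = len(predictions[0])
    rates = {}
    for i, j in combinations(range(len(predictions)), 2):
        agree = sum(1 for a, b in zip(predictions[i], predictions[j]) if a == b)
        rates[(i, j)] = agree / n
    return rates

# Three functions labeling the same four unlabeled examples.
preds = [[1, 1, 0, 0],
         [1, 0, 0, 0],
         [1, 1, 0, 1]]
print(agreement_rates(preds))  # {(0, 1): 0.75, (0, 2): 0.75, (1, 2): 0.5}
```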

slide-28
SLIDE 28

Estimating Error from Unlabeled Data

  1. IF f1, f2, f3 make indep. errors, and accuracies > 0.5

THEN $a_{ij} = 1 - e_i - e_j + 2 e_i e_j$

Measure errors from unlabeled data:

  • use unlabeled data to estimate a12, a13, a23
  • solve three equations for three unknowns e1, e2, e3
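Under the independence assumption, the three equations have a closed-form solution. Writing c_i = 1 - 2 e_i, the relation a_ij = 1 - e_i - e_j + 2 e_i e_j becomes 2 a_ij - 1 = c_i c_j, so c_1 = sqrt((2 a12 - 1)(2 a13 - 1) / (2 a23 - 1)), and similarly for c_2 and c_3. The positive root is the right one because accuracies > 0.5 means every c_i > 0. The sketch below implements this algebra; `errors_from_agreements` is a hypothetical name, and it is not claimed to be the solver used in the cited paper.

```python
import math

def errors_from_agreements(a12, a13, a23):
    """Solve a_ij = 1 - e_i - e_j + 2*e_i*e_j for (e1, e2, e3).

    Assumes independent errors and every error rate below 0.5.
    """
    b12, b13, b23 = 2 * a12 - 1, 2 * a13 - 1, 2 * a23 - 1
    if min(b12, b13, b23) <= 0:
        raise ValueError("agreement rates must exceed 0.5 under these assumptions")
    c1 = math.sqrt(b12 * b13 / b23)  # c_i = 1 - 2*e_i
    c2 = math.sqrt(b12 * b23 / b13)
    c3 = math.sqrt(b13 * b23 / b12)
    return tuple((1 - c) / 2 for c in (c1, c2, c3))

# Agreement rates generated from true error rates (0.1, 0.2, 0.3).
e1, e2, e3 = errors_from_agreements(0.74, 0.66, 0.62)
print(round(e1, 6), round(e2, 6), round(e3, 6))  # 0.1 0.2 0.3
```

With agreement rates estimated from a finite sample, the recovered values are only estimates and can fall slightly outside [0, 0.5].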
slide-29
SLIDE 29

Estimating Error from Unlabeled Data

  1. IF f1, f2, f3 make indep. errors, accuracies > 0.5

THEN $a_{ij} = 1 - e_i - e_j + 2 e_i e_j$

  2. but if errors not independent