Never-Ending Language Learning Tom Mitchell, William Cohen, and Many - PowerPoint PPT Presentation

Never-Ending Language Learning Tom Mitchell, William Cohen, and Many Collaborators Carnegie Mellon University

We will never really understand learning until we build machines that
• learn many different things,
• from years of diverse experience,
• in a staged, curricular fashion,
• and become better learners over time.

Tenet 2: Natural language understanding requires a belief system
A natural language understanding system should react to text by saying either:
• I understand, and already knew that
• I understand, and didn’t know, but accept it
• I understand, and disagree because …

NELL: Never-Ending Language Learner
Inputs:
• initial ontology (categories and relations)
• a dozen examples of each ontology predicate
• the web
• occasional interaction with human trainers
The task:
• run 24x7, forever
• each day:
  1. extract more facts from the web to populate the ontology
  2. learn to read (perform #1) better than yesterday

NELL today
Running 24x7 since January 12, 2010
Result:
• knowledge base with 90 million candidate beliefs
• learning to read
• learning to reason
• extending ontology

NELL knowledge fragment (* including only correct beliefs)
[Figure: graph of NELL beliefs linking entities such as Toronto, Canada, Maple Leafs, Stanley Cup, Red Wings, Detroit, GM, Toyota, Prius, Air Canada, Globe and Mail, Sundin, Toskala, Skydome and NHL through relations such as uses equipment, hometown, competes with, acquired, plays in, won, member, created, economic sector and writer.]

NELL Is Improving Over Time (Jan 2010 to Nov 2014)
[Figure: three plots.
• number of NELL beliefs vs. time (all beliefs: 10’s of millions; high conf. beliefs: millions)
• reading accuracy vs. time, average over 31 predicates (precision@10; mean avg. precision top 1000)
• human feedback vs. time (average 2.4 feedbacks per predicate per month)]

NELL Today
• e.g. “diabetes”, “Avandia”, “tea”, “IBM”, “love”, “baseball”, “San Juan”, “BacteriaCausesCondition”, “kitchenItem”, “ClothingGoesWithClothing” …

[Estevam Hruschka, 2014] Portuguese NELL

How does NELL work?

Semi-Supervised Bootstrap Learning
Learn which noun phrases are cities:
• noun phrases: San Francisco, Paris, Berlin, Pittsburgh, London, Seattle, Montpelier, anxiety, selfishness, denial
• context patterns: mayor of arg1, live in arg1, arg1 is home of, traits such as arg1
It’s underconstrained!!
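The drift on this slide can be reproduced with a minimal sketch. It assumes a toy corpus of (noun phrase, context pattern) pairs; the pairings in `CORPUS` and the function name `bootstrap` are hypothetical, chosen only to show how an uncoupled bootstrap learner wanders from cities to traits.

```python
# Toy co-occurrence data: which noun phrase was seen in which context pattern.
# These pairings are invented for illustration, not taken from NELL.
CORPUS = [
    ("Paris", "mayor of arg1"),
    ("Paris", "live in arg1"),
    ("Pittsburgh", "mayor of arg1"),
    ("Seattle", "live in arg1"),
    ("Seattle", "arg1 is home of"),
    ("denial", "live in arg1"),
    ("denial", "traits such as arg1"),
    ("anxiety", "traits such as arg1"),
    ("selfishness", "traits such as arg1"),
]


def bootstrap(seeds, corpus, rounds):
    """Alternate between learning patterns from instances and instances from patterns."""
    instances = set(seeds)
    patterns = set()
    for _ in range(rounds):
        # Any pattern seen with a believed city becomes a "city pattern".
        patterns |= {pattern for phrase, pattern in corpus if phrase in instances}
        # Any noun phrase seen with a "city pattern" becomes a believed city.
        instances |= {phrase for phrase, pattern in corpus if pattern in patterns}
    return instances, patterns


cities_round1, _ = bootstrap({"Paris", "Pittsburgh"}, CORPUS, rounds=1)
cities_round2, _ = bootstrap({"Paris", "Pittsburgh"}, CORPUS, rounds=2)
```

With nothing constraining the single "city" function, one ambiguous phrase ("live in denial") is enough to pull in an unrelated pattern and, one round later, the traits that match it.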

Key Idea 1: Coupled semi-supervised training of many functions
• learning one function (person from noun phrase): hard (underconstrained) semi-supervised learning problem
• learning many coupled functions: much easier (more constrained) semi-supervised learning problem

Type 1 Coupling: Co-Training, Multi-View Learning Supervised training of 1 function : Minimize: person NP :

Type 1 Coupling: Co-Training, Multi-View Learning Coupled training of 2 functions : Minimize: person NP :

Type 1 Coupling: Co-Training, Multi-View Learning [Blum & Mitchell; 98] [Dasgupta et al; 01 ] [Ganchev et al., 08] [Sridharan & Kakade, 08] person [Wang & Zhou, ICML10] NP :

NELL: Learned reading strategies
Mountain: "volcanic crater of _" "volcanic eruptions like _" "volcanic peak of _" "volcanic region of _" "volcano , called _" "volcano called _" "volcano is called _" "volcano known as _" "volcano Mt _" "volcano named _" "volcanoes , including _" "volcanoes , like _" "volcanoes , such as _" "volcanoes include _" "volcanoes including _" "volcanoes such as _" "We 've climbed _" "weather atop _" "weather station atop _" "week hiking in _" "weekend trip through _" "West face of _" "West ridge of _" "west to beyond _" "white ledge in _" "white summit of _" "whole earth , is _" "wilderness area surrounding _" "wilderness areas around _" "wind rent _" "winter ascent of _" "winter ascents in _" "winter ascents of _" "winter expedition to _" "wooded foothills of _" "world famous view of _" "world famous views of _" "you 're popping by _" "you 've just climbed _" "you just climbed _" "you’ve climbed _" "_ ' crater" "_ ' eruption" "_ ' foothills" "_ ' glaciers" "_ ' new dome" "_ 's Base Camp" "_ 's drug guide" "_ 's east rift zone" "_ 's main summit" "_ 's North Face" "_ 's North Peak" "_ 's North Ridge" "_ 's northern slopes" "_ 's southeast ridge" "_ 's summit caldera" "_ 's West Face" "_ 's West Ridge" "_ 's west ridge" "_ (D,DDD ft" "_ climbing permits" "_ climbing safari" "_ consult el diablo" "_ cooking planks" "_ dominates the sky line" "_ dominates the western skyline" "_ dominating the scenery"

Multi-view, Multi-Task Coupling
[Blum & Mitchell; 98] [Dasgupta et al; 01] [Ganchev et al., 08] [Sridharan & Kakade, 08] [Wang & Zhou, ICML10] [Taskar et al., 2009] [Carlson et al., 2009]
• categories: person, sport, athlete, coach, team
• views of each NP: NP text context distribution, NP morphology, NP HTML contexts
• constraints:
  athlete(NP) → person(NP)
  athlete(NP) → NOT sport(NP)
  NOT athlete(NP) ← sport(NP)

Type 3 Coupling: Relation Argument Types
• relations over (NP1, NP2): playsSport(a,s), coachesTeam(c,t), playsForTeam(a,t), teamPlaysSport(t,s)

Type 3 Coupling: Relation Argument Types
playsSport(NP1,NP2) → athlete(NP1), sport(NP2)
• relations over (NP1, NP2): playsSport(a,s), coachesTeam(c,t), playsForTeam(a,t), teamPlaysSport(t,s)
• categories: person, sport, athlete, team, coach
over 2500 coupled functions in NELL
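The three kinds of coupling named on these slides (subset, mutual exclusion, relation argument types) can be written down as simple consistency checks. The following is a minimal sketch, not NELL's implementation: the table names, the function `find_violations`, and the dictionary-based belief format are all assumptions made for illustration, and only constraints stated on the slides are encoded.

```python
# Constraint tables; only the entries named on the slides are included.
SUBSET_OF = {"athlete": "person"}                 # athlete(NP) -> person(NP)
MUTUALLY_EXCLUSIVE = [("athlete", "sport")]       # athlete(NP) -> NOT sport(NP)
ARG_TYPES = {"playsSport": ("athlete", "sport")}  # playsSport(NP1,NP2) -> athlete(NP1), sport(NP2)


def find_violations(categories, relations):
    """Return a list of human-readable constraint violations.

    categories: dict mapping noun phrase -> set of category labels
    relations:  list of (relation, arg1, arg2) tuples
    """
    problems = []
    for phrase, labels in categories.items():
        # Subset coupling: a subcategory label requires its supercategory.
        for sub, sup in SUBSET_OF.items():
            if sub in labels and sup not in labels:
                problems.append(f"{phrase}: {sub} without {sup}")
        # Mutual exclusion coupling: the two labels may not co-occur.
        for first, second in MUTUALLY_EXCLUSIVE:
            if first in labels and second in labels:
                problems.append(f"{phrase}: both {first} and {second}")
    for relation, arg1, arg2 in relations:
        if relation not in ARG_TYPES:
            continue
        # Argument type coupling: each argument must carry the expected category.
        type1, type2 = ARG_TYPES[relation]
        if type1 not in categories.get(arg1, set()):
            problems.append(f"{relation}: {arg1} is not {type1}")
        if type2 not in categories.get(arg2, set()):
            problems.append(f"{relation}: {arg2} is not {type2}")
    return problems
```

A candidate belief that produces any violation is one that the coupled learners disagree on, which is the signal an uncoupled learner never gets.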

Pure EM Approach to Coupled Training
E: estimate labels for each function of each unlabeled example
M: retrain all functions, using these probabilistic labels
Scaling problem:
• E step: 25M NPs, 10^14 NP pairs to label
• M step: 50M text contexts to consider for each function → 10^10 parameters to retrain
• even more URL-HTML contexts …

NELL’s Approximation to EM
E’ step:
• Re-estimate the knowledge base:
  – but consider only a growing subset of the latent variable assignments
  – category variables: up to 250 new NPs per category per iteration
  – relation variables: add only if confident and args of correct type
  – this set of explicit latent assignments *IS* the knowledge base
M’ step:
• Each view-based learner retrains itself from the updated KB
• “context” methods create growing subsets of contexts
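The E’ step for category variables can be sketched as a capped promotion of candidates into the knowledge base. This is a simplified illustration under stated assumptions: the function name `promote`, the confidence threshold `min_conf`, and the candidate format are hypothetical; only the cap of 250 new NPs per category per iteration comes from the slide.

```python
def promote(kb, candidates, max_new=250, min_conf=0.9):
    """Add at most max_new confident new noun phrases per category to the KB.

    kb:         dict mapping category -> set of believed noun phrases
    candidates: dict mapping category -> list of (noun_phrase, confidence)
    min_conf is an assumed threshold, not a value from the slides.
    """
    for category, scored in candidates.items():
        known = kb.setdefault(category, set())
        # Keep only unseen phrases that clear the confidence threshold.
        fresh = [(phrase, conf) for phrase, conf in scored
                 if phrase not in known and conf >= min_conf]
        # Most confident first, then cut off at the per-iteration cap.
        fresh.sort(key=lambda item: item[1], reverse=True)
        for phrase, _ in fresh[:max_new]:
            known.add(phrase)
    return kb


kb = {"city": {"Paris"}}
candidates = {"city": [("Berlin", 0.95), ("London", 0.99), ("denial", 0.4), ("Seattle", 0.91)]}
promote(kb, candidates, max_new=2)
```

Because the knowledge base only ever grows by a bounded, high-confidence slice, the learners in the M’ step retrain on a small explicit set of assignments instead of on every possible latent variable.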

Initial NELL Architecture
[Figure: the Knowledge Base (latent variables) holds Beliefs and Candidate Beliefs, connected by the Knowledge Integrator. Continually Learning Reading Components feed it: Text Context patterns (CPL), HTML-URL context patterns (SEAL), Morphology classifier (CML), and Human advice.]

If coupled learning is the key, how can we get new coupling constraints?

Key Idea 2: Discover New Coupling Constraints
• learn Horn clause rules/constraints:
  0.93 athletePlaysSport(?x,?y) ← athletePlaysForTeam(?x,?z), teamPlaysSport(?z,?y)
  – learned by data mining the knowledge base
  – connect previously uncoupled relation predicates
  – infer new unread beliefs
  – modified version of FOIL [Quinlan]

Learned Probabilistic Horn Clause Rules
0.93 playsSport(?x,?y) ← playsForTeam(?x,?z), teamPlaysSport(?z,?y)
• relations over (NP1, NP2): playsSport(a,s), coachesTeam(c,t), playsForTeam(a,t), teamPlaysSport(t,s)
• categories: person, sport, athlete, team, coach
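Applying the rule above amounts to a join on the shared variable ?z. The sketch below is illustrative only: the function name `infer_plays_sport` and the tuple-based fact format are assumptions, the example facts are toy data, and attaching the rule weight 0.93 directly to each inferred belief is a simplification.

```python
def infer_plays_sport(plays_for_team, team_plays_sport, rule_conf=0.93):
    """Apply playsSport(?x,?y) <- playsForTeam(?x,?z), teamPlaysSport(?z,?y).

    plays_for_team:   iterable of (athlete, team) facts
    team_plays_sport: iterable of (team, sport) facts
    Returns a dict mapping (athlete, sport) -> confidence of the inferred belief.
    """
    # Index teams by the sports they play so the join is a dictionary lookup.
    sports_by_team = {}
    for team, sport in team_plays_sport:
        sports_by_team.setdefault(team, set()).add(sport)

    inferred = {}
    for athlete, team in plays_for_team:
        for sport in sports_by_team.get(team, ()):
            inferred[(athlete, sport)] = rule_conf
    return inferred


# Toy facts for illustration.
beliefs = infer_plays_sport(
    plays_for_team=[("Sundin", "Maple Leafs"), ("Toskala", "Maple Leafs")],
    team_plays_sport=[("Maple Leafs", "hockey")],
)
```

The inferred pairs are beliefs the system never read directly, which is what the slide means by "infer new unread beliefs".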

Infer New Beliefs [Lao, Mitchell, Cohen, EMNLP 2011]
If: competes with (x1, x2), economic sector (x2, x3)
Then: economic sector (x1, x3)

Inference by Random Walks
PRA: [Lao, Mitchell, Cohen, EMNLP 2011]
If: competes with (x1, x2), economic sector (x2, x3)
Then: economic sector (x1, x3)
PRA:
1. restrict precondition to a chain.
2. inference by random walks

Inference by KB Random Walks [Lao, Mitchell, Cohen, EMNLP 2011]
KB: Random walk path type: x → (competes with) → ? → (economic sector) → y
Pr( R(x,y) ): logistic function for R(x,y), where the i-th feature = probability of arriving at node y starting at node x, and taking a random walk along a path of type i

CityLocatedInCountry(Pittsburgh) = ? [Lao, Mitchell, Cohen, EMNLP 2011]
Nodes shown: Pittsburgh
Table columns: Feature = Typed Path; Feature Value; Logistic Regression Weight
CityInState, CityInState^-1, CityLocatedInCountry: 0.32

CityLocatedInCountry(Pittsburgh) = ? [Lao, Mitchell, Cohen, EMNLP 2011]
Nodes shown: Pittsburgh, Pennsylvania
Table columns: Feature = Typed Path; Feature Value; Logistic Regression Weight
CityInState, CityInState^-1, CityLocatedInCountry: 0.32
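A typed-path feature of this kind can be computed by pushing probability mass along the path one relation at a time, then feeding the result to a logistic function. The sketch below is a minimal illustration, not the PRA implementation: the function names, the edge-dictionary format, and the nodes "Philadelphia" and "U.S." are hypothetical, and it assumes that 0.32 on the slide is the regression weight for this path.

```python
import math
from collections import defaultdict


def path_probability(edges, start, path, target):
    """Probability of reaching target from start on a random walk along a typed path.

    edges: dict mapping (node, relation) -> list of neighbor nodes
    path:  sequence of relation names to follow in order
    At each step the walker picks uniformly among the matching outgoing edges;
    mass at a node with no matching edge is dropped.
    """
    dist = {start: 1.0}
    for relation in path:
        next_dist = defaultdict(float)
        for node, prob in dist.items():
            neighbors = edges.get((node, relation), [])
            for neighbor in neighbors:
                next_dist[neighbor] += prob / len(neighbors)
        dist = dict(next_dist)
    return dist.get(target, 0.0)


def logistic_score(features, weights, bias=0.0):
    """Logistic function over weighted path features."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Toy KB: Pittsburgh's country is unknown; a second city in the same state has one.
EDGES = {
    ("Pittsburgh", "CityInState"): ["Pennsylvania"],
    ("Pennsylvania", "CityInState^-1"): ["Pittsburgh", "Philadelphia"],
    ("Philadelphia", "CityLocatedInCountry"): ["U.S."],
}
PATH = ["CityInState", "CityInState^-1", "CityLocatedInCountry"]

feature = path_probability(EDGES, "Pittsburgh", PATH, "U.S.")
score = logistic_score([feature], [0.32])
```

In this toy graph half of the walk's mass returns to Pittsburgh, which has no known country, so only the half that passes through the other city reaches the target.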
