


slide-1
SLIDE 1

“Housekeeping”

  • This presentation consists of slides and audio. If you are experiencing any problems/issues, please press the F5 key on your keyboard if you’re using Windows, or Command + R if you’re on a Mac, to refresh your console, or close and re-launch the presentation. You can also view the Webcast Help Guide by clicking on the “Help” widget in the bottom dock.
  • To control volume, adjust the master volume on your computer.
  • At the end of the presentation, you’ll see a survey URL on the final slide. Please take a minute to click on the link and fill it out to help us improve your next webinar experience.
  • You can download a PDF of these slides by clicking on the Resources widget in the bottom dock.
  • This presentation is being recorded and will be available for on-demand viewing in the next few days. You will receive an automatic e-mail notification when the recording is ready.
  • If you think of a question during the presentation, please type it into the Q&A box and click on the submit button. You do not need to wait until the end of the presentation to begin submitting questions. You may also use the Q&A box (and the survey at the end) to suggest topics for future webinars of interest to you.

slide-2
SLIDE 2
  • 1,350+ trusted technical books and videos by leading publishers including O’Reilly, Morgan Kaufmann, others
  • Online courses with assessments and certification-track mentoring, member discounts on tuition at partner institutions
  • Learning Webinars on big topics (Cloud Computing/Mobile Development, Cybersecurity, Big Data, Recommender Systems, SaaS, Agile, Natural Language Processing)
  • ACM Tech Packs on top current computing topics: Annotated Bibliographies compiled by subject experts
  • Popular video tutorials/keynotes from ACM Digital Library, A.M. Turing Centenary talks/panels
  • Podcasts with industry leaders/award winners

ACM Learning Center

http://learning.acm.org

slide-3
SLIDE 3

Talk Back

  • Use the Facebook widget in the bottom panel to share this presentation with friends and colleagues
  • Use the Twitter widget to Tweet your favorite quotes from today’s presentation with hashtag #ACMWebinarNELL
  • Submit questions and comments via Twitter to @acmeducation – we’re reading them!

slide-4
SLIDE 4

Never Ending Language Learning

Tom Mitchell

Machine Learning Department

Carnegie Mellon University

slide-5
SLIDE 5

Tenet 1: We’ll never really understand learning until we build machines that

  • learn many different things,
  • over years,
  • and become better learners over time.
slide-6
SLIDE 6

Tenet 2: Natural language understanding requires a belief system

A natural language understanding system should react to text by saying either:

  • I understand, and already knew that
  • I understand, and didn’t know, but accept it
  • I understand, and disagree because …
slide-7
SLIDE 7

NELL: Never-Ending Language Learner

Inputs:

  • initial ontology (categories and relations)
  • dozen examples of each ontology predicate
  • the web
  • occasional interaction with human trainers

The task:

  • run 24x7, forever
  • each day:

1. extract more facts from the web to populate the ontology
2. learn to read (perform #1) better than yesterday

slide-8
SLIDE 8

NELL today

Running 24x7, since January 12, 2010

Result:

  • KB with > 50 million candidate beliefs, growing daily
  • learning to read better each day
  • learning to reason, as well as read
  • automatically extending its ontology
slide-9
SLIDE 9

NELL knowledge fragment

[Figure: graph of entities (e.g., Maple Leafs, Toronto, Stanley Cup, NHL, Sundin, Air Canada Centre, Globe and Mail, Red Wings, Detroit, GM, Toyota, Prius, hockey, football, climbing, skates, helmet) linked by relations such as “play”, “won”, “hired”, “home town”, “team stadium”, “city stadium”, “city paper”, “league”, “uses equipment”, “competes with”, “plays in league”, “created”, “acquired”, “economic sector”]

slide-10
SLIDE 10

NELL Today

  • e.g., “diabetes”, “Avandia”, “tea”, “IBM”, “love”, “baseball”, “BacteriaCausesCondition”, “kitchenItem”, “ClothingGoesWithClothing” …

slide-11
SLIDE 11

How does NELL work?

slide-12
SLIDE 12

Semi-Supervised Bootstrap Learning

Learn which noun phrases are cities:

[Figure: bootstrapping cycle linking noun phrases (Paris, Pittsburgh, Seattle, Montpelier, San Francisco, Berlin, London, denial, anxiety, selfishness) and context patterns (“mayor of arg1”, “live in arg1”, “arg1 is home of”, “traits such as arg1”)]

it’s underconstrained!!
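
The bootstrapping problem on this slide can be illustrated with a minimal sketch. The corpus below is hypothetical toy data built from the noun phrases and patterns named on the slide, and the promotion rule (accept anything that co-occurs) is a deliberately naive assumption, not NELL's actual procedure. It shows how a loop with no other constraints drifts from cities to abstract nouns.

```python
# Toy corpus of (context pattern, noun phrase) co-occurrences.
# Hypothetical data, assembled only for illustration.
CORPUS = [
    ("mayor of arg1", "Paris"),
    ("mayor of arg1", "Pittsburgh"),
    ("mayor of arg1", "San Francisco"),
    ("live in arg1", "Seattle"),
    ("live in arg1", "Berlin"),
    ("live in arg1", "denial"),
    ("arg1 is home of", "London"),
    ("arg1 is home of", "San Francisco"),
    ("traits such as arg1", "denial"),
    ("traits such as arg1", "anxiety"),
    ("traits such as arg1", "selfishness"),
]


def bootstrap(seeds, corpus, rounds):
    """Alternate between promoting patterns and promoting noun phrases."""
    instances = set(seeds)
    patterns = set()
    for _ in range(rounds):
        # Promote every pattern that co-occurs with a known instance.
        patterns |= {p for p, np in corpus if np in instances}
        # Promote every noun phrase that fills a promoted pattern.
        instances |= {np for p, np in corpus if p in patterns}
    return instances, patterns


seeds = {"Paris", "Pittsburgh", "Seattle", "Montpelier"}
after_one, _ = bootstrap(seeds, CORPUS, rounds=1)
after_two, learned_patterns = bootstrap(seeds, CORPUS, rounds=2)
```

In this toy run, “denial” is accepted as a city after one round (via “live in arg1”), which then pulls in “traits such as arg1” and, with it, “anxiety” and “selfishness” in the second round.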

slide-13
SLIDE 13

Key Idea 1: Coupled semi-supervised training of many functions

[Figure: a single function from noun phrase to “person”, labeled “hard (underconstrained) semi-supervised learning problem”, contrasted with a “much easier (more constrained) semi-supervised learning problem”]

slide-14
SLIDE 14

Type 1 Coupling: Co-Training, Multi-View Learning

[Figure: function from noun phrase (NP) to “person”]

Supervised training of 1 function: Minimize:

slide-15
SLIDE 15

Type 1 Coupling: Co-Training, Multi-View Learning

[Figure: function from noun phrase (NP) to “person”]

Coupled training of 2 functions: Minimize:

slide-16
SLIDE 16

Type 1 Coupling: Co-Training, Multi-View Learning

[Figure: function from noun phrase (NP) to “person”]

[Blum & Mitchell, 98] [Dasgupta et al., 01] [Ganchev et al., 08] [Sridharan & Kakade, 08] [Wang & Zhou, ICML10]
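
As a rough illustration of what “coupled training of 2 functions” can mean, the sketch below scores two classifiers that each see a different view of the same noun phrase. The squared-error form and the equal weighting of the terms are assumptions made for this example, not the objective used in the talk; `coupled_loss` and the toy features are hypothetical.

```python
def coupled_loss(f1, f2, labeled, unlabeled):
    """Assumed co-training style objective.

    Each function should fit the labeled examples from its own view, and the
    two functions are penalized for disagreeing on unlabeled examples.
    labeled:   list of ((view1, view2), label)
    unlabeled: list of (view1, view2)
    """
    fit1 = sum((f1(x1) - y) ** 2 for (x1, _), y in labeled)
    fit2 = sum((f2(x2) - y) ** 2 for (_, x2), y in labeled)
    disagreement = sum((f1(x1) - f2(x2)) ** 2 for x1, x2 in unlabeled)
    return fit1 + fit2 + disagreement


# Toy views: each view is a single numeric feature of the noun phrase.
identity = lambda x: x
labeled = [((1.0, 1.0), 1.0), ((0.0, 0.0), 0.0)]
unlabeled = [(1.0, 1.0), (1.0, 0.0)]
loss = coupled_loss(identity, identity, labeled, unlabeled)
```

Here both functions fit the labeled data perfectly, so the whole loss comes from the one unlabeled noun phrase on which the two views disagree. That extra term is what lets unlabeled text constrain the learners.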

slide-17
SLIDE 17

Type 2 Coupling: Multi-task, Structured Outputs

[Figure: a noun phrase (NP) classified into categories team, person, athlete, coach, sport]

athlete(NP) → person(NP)
athlete(NP) → NOT sport(NP)
NOT athlete(NP) ← sport(NP)

[Daume, 2008] [Bakir et al., eds. 2007] [Roth et al., 2008] [Taskar et al., 2009] [Carlson et al., 2009]
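
The subset and mutual-exclusion constraints on this slide can be checked mechanically. The sketch below is a minimal illustration under assumed data structures: the `SUBSET_OF` and `MUTEX` tables and the `violations` helper are hypothetical names, and only the athlete/person/sport constraints come from the slide (the coach entry is an added assumption).

```python
# Hypothetical ontology fragment.
SUBSET_OF = {"athlete": "person", "coach": "person"}  # child category -> parent
MUTEX = [frozenset({"athlete", "sport"})]             # mutually exclusive pairs


def violations(labels):
    """Return the constraints broken by the category labels of one noun phrase."""
    broken = []
    for child, parent in SUBSET_OF.items():
        if child in labels and parent not in labels:
            broken.append(f"{child} implies {parent}")
    for pair in MUTEX:
        if pair <= labels:
            first, second = sorted(pair)
            broken.append(f"{first} excludes {second}")
    return broken


consistent = violations({"athlete", "person"})
inconsistent = violations({"athlete", "sport"})
```

A labeling that violates any constraint is evidence that at least one of the coupled classifiers is wrong, which is the signal coupled training exploits.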

slide-18
SLIDE 18

Multi-view, Multi-Task Coupling

[Figure: a noun phrase (NP) classified into categories (team, person, athlete, coach, sport) from three views: NP text context distribution, NP morphology, NP HTML contexts]

slide-19
SLIDE 19

Type 3 Coupling: Learning Relations

[Figure: a noun-phrase pair (NP1, NP2) classified by relation functions coachesTeam(c,t), playsForTeam(a,t), teamPlaysSport(t,s), playsSport(a,s)]

slide-20
SLIDE 20

Type 3 Coupling: Argument Types

[Figure: noun phrases NP1 and NP2, each with category functions (team, person, athlete, coach, sport), linked by relation functions coachesTeam(c,t), playsForTeam(a,t), teamPlaysSport(t,s), playsSport(a,s)]

playsSport(NP1,NP2) → athlete(NP1), sport(NP2)

over 2500 coupled functions in NELL
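
Argument-type coupling can be sketched as a consistency check between relation candidates and category beliefs. The `SIGNATURES` table, the `type_consistent` helper, and the toy category beliefs are assumptions for illustration; only the playsSport signature is stated on the slide, and the other two are guesses based on the relation names.

```python
# Hypothetical argument-type signatures: relation -> (type of arg1, type of arg2).
SIGNATURES = {
    "playsSport": ("athlete", "sport"),
    "coachesTeam": ("coach", "team"),
    "teamPlaysSport": ("team", "sport"),
}


def type_consistent(relation, arg1, arg2, categories):
    """Check a candidate relation instance against current category beliefs."""
    type1, type2 = SIGNATURES[relation]
    return (type1 in categories.get(arg1, set())
            and type2 in categories.get(arg2, set()))


# Toy category beliefs, keyed by noun phrase.
categories = {
    "Sundin": {"athlete", "person"},
    "hockey": {"sport"},
    "Maple Leafs": {"team"},
}

ok = type_consistent("playsSport", "Sundin", "hockey", categories)
swapped = type_consistent("playsSport", "hockey", "Sundin", categories)
```

A candidate such as playsSport(hockey, Sundin) fails the check, so the relation extractor and the category classifiers constrain each other.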

slide-21
SLIDE 21

NELL: Learned reading strategies

Plays_Sport(arg1,arg2):

  • arg1_was_playing_arg2
  • arg2_megastar_arg1
  • arg2_icons_arg1
  • arg2_player_named_arg1
  • arg2_prodigy_arg1
  • arg1_is_the_tiger_woods_of_arg2
  • arg2_career_of_arg1
  • arg2_greats_as_arg1
  • arg1_plays_arg2
  • arg2_player_is_arg1
  • arg2_legends_arg1
  • arg1_announced_his_retirement_from_arg2
  • arg2_operations_chief_arg1
  • arg2_player_like_arg1
  • arg2_and_golfing_personalities_including_arg1
  • arg2_players_like_arg1
  • arg2_greats_like_arg1
  • arg2_players_are_steffi_graf_and_arg1
  • arg2_great_arg1
  • arg2_champ_arg1
  • arg2_greats_such_as_arg1
  • arg2_professionals_such_as_arg1
  • arg2_hit_by_arg1
  • arg2_greats_arg1
  • arg2_icon_arg1
  • arg2_stars_like_arg1
  • arg2_pros_like_arg1
  • arg1_retires_from_arg2
  • arg2_phenom_arg1
  • arg2_lesson_from_arg1
  • arg2_architects_robert_trent_jones_and_arg1
  • arg2_sensation_arg1
  • arg2_pros_arg1
  • arg2_stars_venus_and_arg1
  • arg2_hall_of_famer_arg1
  • arg2_superstar_arg1
  • arg2_legend_arg1
  • arg2_legends_such_as_arg1
  • arg2_players_is_arg1
  • arg2_pro_arg1
  • arg2_player_was_arg1
  • arg2_god_arg1
  • arg2_idol_arg1
  • arg1_was_born_to_play_arg2
  • arg2_star_arg1
  • arg2_hero_arg1
  • arg2_players_are_arg1
  • arg1_retired_from_professional_arg2
  • arg2_legends_as_arg1
  • arg2_autographed_by_arg1
  • arg2_champion_arg1
  • …
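
A minimal sketch of how patterns like these might be applied: it assumes the text has already been chunked into a left noun phrase, the words between, and a right noun phrase, which is a simplification made for this example. `parse_pattern` and `extract` are hypothetical helpers; the three patterns are taken from the list above.

```python
PATTERNS = ["arg1_plays_arg2", "arg2_legend_arg1", "arg1_retires_from_arg2"]


def parse_pattern(pattern):
    """Split 'arg1_retires_from_arg2' into ('arg1', 'retires from', 'arg2')."""
    tokens = pattern.split("_")
    return tokens[0], " ".join(tokens[1:-1]), tokens[-1]


def extract(patterns, np_left, between, np_right):
    """Return (arg1, arg2) if the words between two noun phrases match a pattern."""
    for pattern in patterns:
        first, middle, last = parse_pattern(pattern)
        if middle == between:
            # The pattern says which side of the text is arg1 and which is arg2.
            slots = {first: np_left, last: np_right}
            return slots["arg1"], slots["arg2"]
    return None


forward = extract(PATTERNS, "Steffi Graf", "retires from", "tennis")
reversed_order = extract(PATTERNS, "golf", "legend", "Tiger Woods")
no_match = extract(PATTERNS, "Tiger Woods", "likes", "golf")
```

Both matching calls return the athlete as arg1 and the sport as arg2, even though the second pattern places the sport first in the text.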

slide-22
SLIDE 22

Initial NELL Architecture

[Figure: architecture diagram with the following components]

  • Knowledge Base (latent variables): Beliefs, Candidate Beliefs
  • Evidence Integrator
  • Continually Learning Extractors: Text Context patterns (CPL), HTML-URL context patterns (SEAL), Morphology classifier (CML)
  • Human advice
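
One plausible reading of the Evidence Integrator's role is sketched below: candidate beliefs proposed by the extractors are promoted to beliefs when a single component is very confident or when several components agree. The promotion rule, the threshold values, and the `integrate` function are assumptions for illustration and are not given on the slide.

```python
def integrate(candidates, single_source_threshold=0.9, min_sources=2):
    """Promote candidate beliefs to beliefs (assumed rule, for illustration).

    candidates maps a fact to {component name: confidence}.
    """
    beliefs = set()
    for fact, scores in candidates.items():
        confident = max(scores.values()) > single_source_threshold
        corroborated = len(scores) >= min_sources
        if confident or corroborated:
            beliefs.add(fact)
    return beliefs


# Toy candidate beliefs with made-up confidences.
candidates = {
    ("city", "Toronto"): {"CPL": 0.95},
    ("city", "denial"): {"CPL": 0.40},
    ("athlete", "Sundin"): {"CPL": 0.60, "SEAL": 0.70},
}
beliefs = integrate(candidates)
```

In this toy run the weakly supported candidate stays a candidate, while the other two are promoted and could then serve as training examples for the next round of extractor learning.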

slide-23
SLIDE 23

If coupled learning is the key, how can we get new coupling constraints?

slide-24
SLIDE 24

Key Idea 2: Discover New Coupling Constraints

  • first order, probabilistic Horn clause constraints:

– learned by data mining the knowledge base
– connect previously uncoupled relation predicates
– infer new unread beliefs
– modified version of FOIL [Quinlan]

0.93 athletePlaysSport(?x,?y) ← athletePlaysForTeam(?x,?z), teamPlaysSport(?z,?y)
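
To make “infer new unread beliefs” concrete, the sketch below applies the rule on this slide to a toy knowledge base by joining the two body relations on the shared team variable. The dictionary-of-sets KB layout and the `infer_plays_sport` helper are assumptions for illustration, and attaching the rule's score directly to each inferred belief is a simplification.

```python
def infer_plays_sport(kb, confidence=0.93):
    """Apply athletePlaysSport(x,y) <- athletePlaysForTeam(x,z), teamPlaysSport(z,y)."""
    inferred = {}
    for athlete, team in kb["athletePlaysForTeam"]:
        for other_team, sport in kb["teamPlaysSport"]:
            # Join on the shared variable ?z and skip facts already believed.
            if team == other_team and (athlete, sport) not in kb["athletePlaysSport"]:
                inferred[(athlete, sport)] = confidence
    return inferred


# Toy knowledge base: relation name -> set of argument pairs.
kb = {
    "athletePlaysForTeam": {("Sundin", "Maple Leafs")},
    "teamPlaysSport": {("Maple Leafs", "hockey")},
    "athletePlaysSport": set(),
}
new_beliefs = infer_plays_sport(kb)
```

The inferred pair never had to appear in any text that was read; it follows from two facts that were.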

slide-25
SLIDE 25

Example Learned Horn Clauses

0.95 athletePlaysSport(?x,basketball) ← athleteInLeague(?x,NBA)
0.93 athletePlaysSport(?x,?y) ← athletePlaysForTeam(?x,?z), teamPlaysSport(?z,?y)
0.91 teamPlaysInLeague(?x,NHL) ← teamWonTrophy(?x,Stanley_Cup)
0.90 athleteInLeague(?x,?y) ← athletePlaysForTeam(?x,?z), teamPlaysInLeague(?z,?y)
0.88 cityInState(?x,?y) ← cityCapitalOfState(?x,?y), cityInCountry(?y,USA)
0.62* newspaperInCity(?x,New_York) ← companyEconomicSector(?x,media), generalizations(?x,blog)

slide-26
SLIDE 26

Learned Probabilistic Horn Clause Rules

[Figure: noun phrases NP1 and NP2, each with category functions (team, person, athlete, coach, sport), linked by relation functions coachesTeam(c,t), playsForTeam(a,t), teamPlaysSport(t,s), playsSport(a,s)]

0.93 playsSport(?x,?y) ← playsForTeam(?x,?z), teamPlaysSport(?z,?y)

slide-27
SLIDE 27

Key Idea 3: Automatically extend ontology

slide-28
SLIDE 28

Ontology Extension (1)

Goal:

  • Add new relations to ontology

Approach:

  • For each pair of categories C1, C2, cluster pairs of known instances, in terms of text contexts that connect them

[Mohamed et al., EMNLP 2011]
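
The approach above can be approximated with a very small sketch. It groups instance pairs by the text context that connects them and keeps contexts supported by several pairs as candidate relations. This simple grouping is a stand-in for the clustering method of the cited work, not a reproduction of it; the observations, the `min_pairs` threshold, and the “ARG1 near ARG2” context are hypothetical.

```python
from collections import defaultdict

# Hypothetical observations for the category pair (River, City):
# (River instance, City instance, connecting text context).
OBSERVATIONS = [
    ("Seine", "Paris", "ARG1 which flows through ARG2"),
    ("Nile", "Cairo", "ARG1 which flows through ARG2"),
    ("Tiber river", "Rome", "ARG1 in heart of ARG2"),
    ("Seine", "Paris", "ARG1 in heart of ARG2"),
    ("Nile", "Cairo", "ARG1 near ARG2"),
]


def candidate_relations(observations, min_pairs=2):
    """Group instance pairs by context; keep contexts with enough distinct pairs."""
    pairs_by_context = defaultdict(set)
    for left, right, context in observations:
        pairs_by_context[context].add((left, right))
    return {
        context: pairs
        for context, pairs in pairs_by_context.items()
        if len(pairs) >= min_pairs
    }


relations = candidate_relations(OBSERVATIONS)
```

Contexts that survive the threshold suggest both a new relation between the two categories and a set of seed instances for learning to read it.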

slide-29
SLIDE 29

Example Discovered Relations

Category Pair: MusicInstrument, Musician
  • Frequent Instance Pairs: (sitar, George Harrison); (tenor sax, Stan Getz); (trombone, Tommy Dorsey); (vibes, Lionel Hampton)
  • Text Contexts: ARG1 master ARG2; ARG1 virtuoso ARG2; ARG1 legend ARG2; ARG2 plays ARG1
  • Suggested Name: Master

Category Pair: Disease, Disease
  • Frequent Instance Pairs: (pinched nerve, herniated disk); (tennis elbow, tendonitis); (blepharospasm, dystonia)
  • Text Contexts: ARG1 is due to ARG2; ARG1 is caused by ARG2
  • Suggested Name: IsDueTo

Category Pair: CellType, Chemical
  • Frequent Instance Pairs: (epithelial cells, surfactant); (neurons, serotonin); (mast cells, histamine)
  • Text Contexts: ARG1 that release ARG2; ARG2 releasing ARG1
  • Suggested Name: ThatRelease

Category Pair: Mammals, Plant
  • Frequent Instance Pairs: (koala bears, eucalyptus); (sheep, grasses); (goats, saplings)
  • Text Contexts: ARG1 eat ARG2; ARG2 eating ARG1
  • Suggested Name: Eat

Category Pair: River, City
  • Frequent Instance Pairs: (Seine, Paris); (Nile, Cairo); (Tiber river, Rome)
  • Text Contexts: ARG1 in heart of ARG2; ARG1 which flows through ARG2
  • Suggested Name: InHeartOf

[Mohamed et al., EMNLP 2011]

slide-30
SLIDE 30

NELL: sample of self-added relations

  • athleteWonAward
  • animalEatsFood
  • languageTaughtInCity
  • clothingMadeFromPlant
  • beverageServedWithFood
  • fishServedWithFood
  • athleteBeatAthlete
  • athleteInjuredBodyPart
  • arthropodFeedsOnInsect
  • animalEatsVegetable
  • plantRepresentsEmotion
  • foodDecreasesRiskOfDisease
  • clothingGoesWithClothing
  • bacteriaCausesPhysCondition
  • buildingMadeOfMaterial
  • emotionAssociatedWithDisease
  • foodCanCauseDisease
  • agriculturalProductAttractsInsect
  • arteryArisesFromArtery
  • countryHasSportsFans
  • bakedGoodServedWithBeverage
  • beverageContainsProtein
  • animalCanDevelopDisease
  • beverageMadeFromBeverage
slide-31
SLIDE 31

Ontology Extension (2)

Goal:

  • Add new subcategories

Approach:

  • For each category C, train NELL to read the relation SubsetOfC: C → C

[Burr Settles]

*no new software here, just add this relation to ontology
slide-32
SLIDE 32

NELL: example self-discovered subcategories

Animal:

  • Pets

– Hamsters, Ferrets, Birds, Dog, Cats, Rabbits, Snakes, Parrots, Kittens, …

  • Predators

– Bears, Foxes, Wolves, Coyotes, Snakes, Racoons, Eagles, Lions, Leopards, Hawks, Humans, …

Learned reading patterns for AnimalSubset(arg1,arg2):

"arg1 and other medium sized arg2"
"arg1 and other jungle arg2"
"arg1 and other magnificent arg2"
"arg1 and other pesky arg2"
"arg1 and other mammals and arg2"
"arg1 and other Ice Age arg2"
"arg1 or other biting arg2"
"arg1 and other marsh arg2"
"arg1 and other migrant arg2"
"arg1 and other monogastric arg2"
"arg1 and other mythical

slide-33
SLIDE 33

NELL: example self-discovered subcategories

Animal:

  • Pets

– Hamsters, Ferrets, Birds, Dog, Cats, Rabbits, Snakes, Parrots, Kittens, …

  • Predators

– Bears, Foxes, Wolves, Coyotes, Snakes, Racoons, Eagles, Lions, Leopards, Hawks, Humans, …

Chemical:

  • Fossil fuels

– Carbon, Natural gas, Coal, Diesel, Monoxide, Gases, …

  • Gases

– Helium, Carbon dioxide, Methane, Oxygen, Propane, Ozone, Radon…

Learned reading patterns:

"arg1 and other medium sized arg2"
"arg1 and other jungle arg2"
"arg1 and other magnificent arg2"
"arg1 and other pesky arg2"
"arg1 and other mammals and arg2"
"arg1 and other Ice Age arg2"
"arg1 or other biting arg2"
"arg1 and other marsh arg2"
"arg1 and other migrant arg2"
"arg1 and other monogastric arg2"
"arg1 and other mythical

Learned reading patterns:

"arg1 and other hydrocarbon arg2"
"arg1 and other aqueous arg2"
"arg1 and other hazardous air arg2"
"arg1 and oxygen are arg2"
"arg1 and such synthetic arg2"
"arg1 as a lifting arg2"
"arg1 as a tracer arg2"
"arg1 as the carrier arg2"
"arg1 as the inert arg2"
"arg1 as the primary cleaning arg2"
"arg1 and other noxious arg2"
"arg1 and other trace arg2"
"arg1 as the reagent

slide-34
SLIDE 34

Combine reading and clustering

Clustering NELL’s animals:

  • 1: rays; dolphin; roller; jacks; squid; snapper; big fish; game fish; whitehead; stripers; chinook; rudd; wahoo; flounder; redfish; sea bass; ...
  • 2: lobster; oysters; shellfish; mussels; clams; scallops; oyster; clam; abalone; crabmeat; mussel; ...
  • 3: snakes; turtles; turtle; crocodiles; alligators; endangered species; tortoises; gecko; iguanas; rattlesnake; vipers; rattlesnakes; anaconda; ...
  • 4: dog; dogs; horses; cats; costa rica; sheep; livestock; puppy; chickens; rabbit; duck; pests; goats; poultry; geek; bulldogs; farm animals; hen; bunnies; pit bulls; fowl; felines; poodle; pit bull; alpaca; ...
  • 5: geese; swans; flamingos; loons; canada geese; teal; mallards; egrets; sandhill cranes; mallard; snow geese; ...
  • 6: flock; crows; seagulls; kitties; peacocks; seagull; rook; magpies; blackbirds; warbler; kittens; ...
  • …

[C. Liu, 2013]

slide-35
SLIDE 35

NELL Summary

  • Learning

– Coupled multi-task, multi-view semi-supervised training

  • Inference

– Data mine the KB to learn inference rules
– Scalable any-time inference via random walks

  • Representation

– Ontology extension

  • Cluster pairs of noun phrases, based on corpus statistics
  • Directly read that X is a subcategory of Y

– Infer millions of latent concepts from observable text

  • Curriculum

– learn easiest things first, build on those to “learn to learn”

slide-36
SLIDE 36

Key Idea 4: Cumulative, Staged Learning

1. Classify noun phrases (NP’s) by category
2. Classify NP pairs by relation
3. Discover rules to predict new relation instances
4. Learn which NP’s (co)refer to which latent concepts
5. Discover new relations to extend ontology
6. Learn to infer relation instances via targeted random walks
7. Learn to assign temporal scope to beliefs
8. Learn to microread single sentences
9. Vision: co-train text and visual object recognition
10. Goal-driven reading: predict, then read to corroborate/correct
11. Make NELL a conversational agent on Twitter

Learning X improves ability to learn Y

slide-37
SLIDE 37

thank you

and thanks to: DARPA, Google, NSF, Intel, Yahoo!, Microsoft, Fulbright

follow NELL on Twitter: @CMUNELL

browse/download NELL’s KB at http://rtw.ml.cmu.edu

slide-38
SLIDE 38

What next for NELL?

  • micro-reading [Krishnamurthy, Betteridge]
  • map each sentence to belief system

– agree/disagree/accept

  • beyond English [Hruschka]
  • add computer vision [Gupta, Chen]
  • scalable inference over KB [Cohen, Gardner, Talukdar]
  • merge with Freebase, Yago [Wijaya, Talukdar]
  • goal-driven, targeted reader [Samadi, Kisiel]
  • conversational agent on Twitter [Hruschka, Ritter]
  • map to brain image data on sentence reading [Rafidi]
slide-39
SLIDE 39
  • Questions about this webcast? learning@acm.org
  • ACM Learning Webinars (including archives): http://learning.acm.org/webinar

  • ACM Learning Center: http://learning.acm.org
  • ACM SIGART: http://www.sigart.org

ACM: The Learning Continues…