TAC 2015 Cold Start Knowledge Base Population James Mayfield - - PowerPoint PPT Presentation



slide-1
SLIDE 1

TAC 2015 Cold Start

Knowledge Base Population James Mayfield

Johns Hopkins University

slide-2
SLIDE 2

What’s New 2015

  • Slot Filling Variant - Supersedes Slot Filling track
  • New Variant - Entity Discovery. Produce skeleton KB with only entities and types
  • Not these slides
slide-3
SLIDE 3

The Task

Knowledge Base Variant

Schema

per:children per:other_family per:parents per:siblings per:spouse per:employee_of per:member_of per:schools_attend

When Lisa's mother Marge Simpson went to a weekend getaway at Rancho Relaxo, the movie The Happy Little Elves Meet Fuzzy Snuggleduck was one of the R-rated european adult movies available on their cable channels.

You are given:

slide-4
SLIDE 4

Homer Simpson Bart Simpson Lisa Simpson Marge Simpson Springfield Elementary Springfield

Bottomless Pete, Nature’s Cruelest Mistake

per:children per:children per:alternate_names per:cities_of_residence per:spouse per:schools_attended

Schema

per:children per:other_family per:parents per:siblings per:spouse per:employee_of per:member_of per:schools_attend

When Lisa's mother Marge Simpson went to a weekend getaway at Rancho Relaxo, the movie The Happy Little Elves Meet Fuzzy Snuggleduck was one of the R-rated european adult movies available on their cable channels.

You Must Produce:

slide-5
SLIDE 5

Homer Simpson Bart Simpson Lisa Simpson Marge Simpson Springfield Elementary Springfield

Bottomless Pete, Nature’s Cruelest Mistake

per:children per:children per:alternate_names per:cities_of_residence per:spouse per:schools_attended

How do you know that your KB is any good?

slide-6
SLIDE 6

Homer Simpson Bart Simpson Lisa Simpson Marge Simpson Springfield Elementary Springfield

Bottomless Pete, Nature’s Cruelest Mistake

per:children per:children per:alternate_names per:cities_of_residence per:spouse per:schools_attended

Where did the children of Marge Simpson go to school? per:children per:schools_attended
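
A two-hop question like this one can be answered by following per:children and then per:schools_attended through the KB. The following is a minimal sketch, assuming the KB is held as (subject, relation, object) triples. The triples and the helper names (build_index, follow, two_hop) are illustrative assumptions, not the official Cold Start KB format, and the exact edges in the slide's graph are guessed from the sample text.

```python
from collections import defaultdict

# Illustrative triples; the exact edges of the slide's graph are an assumption.
TRIPLES = [
    ("Marge Simpson", "per:children", "Bart Simpson"),
    ("Marge Simpson", "per:children", "Lisa Simpson"),
    ("Marge Simpson", "per:spouse", "Homer Simpson"),
    ("Lisa Simpson", "per:schools_attended", "Springfield Elementary"),
]


def build_index(triples):
    """Map (subject, relation) to the set of objects."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        index[(subj, rel)].add(obj)
    return index


def follow(index, entities, relation):
    """Return every object reachable from any of the entities via relation."""
    result = set()
    for entity in entities:
        result |= index.get((entity, relation), set())
    return result


def two_hop(index, entity, slot0, slot1):
    """Apply slot0 to the entity (hop 0), then slot1 to each hop-0 fill (hop 1)."""
    hop0 = follow(index, {entity}, slot0)
    hop1 = follow(index, hop0, slot1)
    return hop0, hop1


index = build_index(TRIPLES)
hop0, hop1 = two_hop(index, "Marge Simpson", "per:children", "per:schools_attended")
print(sorted(hop0))
print(sorted(hop1))
```

In this toy KB only Lisa has a per:schools_attended edge, so Bart contributes nothing at hop 1.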

slide-7
SLIDE 7

Homer Simpson Bart Simpson Lisa Simpson Marge Simpson Springfield Elementary Springfield

Bottomless Pete, Nature’s Cruelest Mistake

per:children per:children per:alternate_names per:cities_of_residence per:spouse per:schools_attended

When Lisa's mother Marge Simpson went to a weekend getaway at Rancho Relaxo, the movie The Happy Little Elves Meet Fuzzy Snuggleduck was one of the R-rated european adult movies available on their cable channels.

After two years in the academic quagmire of Springfield Elementary, Lisa finally has a teacher that she connects with. But she soon learns that the problem with being middle-class is that

slide-8
SLIDE 8

Entity-Valued Relations

Relation                                Inverse(s)
per:children                            per:parents
per:other_family                        per:other_family
per:parents                             per:children
per:siblings                            per:siblings
per:spouse                              per:spouse
per:employee_or_member_of               {org,gpe}:employees_or_members*
per:schools_attended                    org:students*
per:city_of_birth                       gpe:births_in_city*
per:stateorprovince_of_birth            gpe:births_in_stateorprovince*
per:country_of_birth                    gpe:births_in_country*
per:cities_of_residence                 gpe:residents_of_city*
per:statesorprovinces_of_residence      gpe:residents_of_stateorprovince*
per:countries_of_residence              gpe:residents_of_country*
per:city_of_death                       gpe:deaths_in_city*
per:stateorprovince_of_death            gpe:deaths_in_stateorprovince*
per:country_of_death                    gpe:deaths_in_country*
org:shareholders                        {per,org,gpe}:holds_shares_in*
org:founded_by                          {per,org,gpe}:organizations_founded*
org:top_members_employees               per:top_member_employee_of*
{org,gpe*}:member_of                    org:members
org:members                             {org,gpe*}:member_of
org:parents                             {org,gpe*}:subsidiaries
org:subsidiaries                        org:parents
org:city_of_headquarters                gpe:headquarters_in_city*
org:stateorprovince_of_headquarters     gpe:headquarters_in_stateorprovince*
org:country_of_headquarters             gpe:headquarters_in_country*
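
Because every entity-valued relation on this slide has an inverse, a KB can be closed under inverses so that a query may start from either end of an edge. The following is a minimal sketch, assuming triples of the form (subject, relation, object). Only a handful of pairs from the slide are included, and pairs whose inverse prefix depends on the entity type (for example {org,gpe}:employees_or_members) are left out because they would need type information. The function name add_inverses is hypothetical.

```python
# A few relation/inverse pairs taken from the slide (not the full schema).
# Pairs whose inverse depends on the object's entity type are omitted.
INVERSES = {
    "per:children": "per:parents",
    "per:parents": "per:children",
    "per:siblings": "per:siblings",
    "per:spouse": "per:spouse",
    "per:schools_attended": "org:students",
    "per:cities_of_residence": "gpe:residents_of_city",
    "org:city_of_headquarters": "gpe:headquarters_in_city",
}


def add_inverses(triples):
    """Return the triples plus the inverse of every triple with a known inverse."""
    closed = set(triples)
    for subj, rel, obj in triples:
        inverse = INVERSES.get(rel)
        if inverse is not None:
            closed.add((obj, inverse, subj))
    return closed


kb = add_inverses([
    ("Marge Simpson", "per:children", "Lisa Simpson"),
    ("Lisa Simpson", "per:schools_attended", "Springfield Elementary"),
])
for triple in sorted(kb):
    print(triple)
```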

slide-9
SLIDE 9

String-Filled Relations

per:alternate_names      org:alternate_names
per:date_of_birth        org:political_religious_affiliation
per:age                  org:number_of_employees_members
per:origin               org:date_founded
per:date_of_death        org:date_dissolved
per:cause_of_death       org:website
per:title
per:religion
per:charges

slide-10
SLIDE 10

Sample Evaluation Query

<query id="CS15_ENG_0954">
  <name>McDonald's</name>
  <docid>5637c4092dffaf2b94448f365f2f6fb3</docid>
  <beg>403</beg>
  <end>412</end>
  <name>Mickey D's</name>
  <docid>542f252fac77c3b1015ed1646614478f</docid>
  <beg>439</beg>
  <end>448</end>
  <name>McD’s</name>
  <docid>NYT_ENG_20130625.0068</docid>
  <beg>4290</beg>
  <end>4294</end>
  <name>McDonald's Corp.</name>
  <docid>69735cd30f0fead03ae6b6f031588271</docid>
  <beg>649</beg>
  <end>664</end>
  <name>MCD</name>
  <docid>da5778f5d9c2b0939ea206a85518ba87</docid>
  <beg>495</beg>
  <end>497</end>
  <enttype>org</enttype>
  <slot0>org:employees_or_members</slot0>
  <slot1>per:date_of_birth</slot1>
</query>

  • Multiple Entrypoints
  • Up to Two Slots (called hop 0 and hop 1)
  • Location of Mention in Document

slide-11
SLIDE 11

Query Decomposition

<query id="CS15_ENG_0954">
  <name>McDonald's</name>
  <docid>5637c4092dffaf2b94448f365f2f6fb3</docid>
  <beg>403</beg>
  <end>412</end>
  <name>Mickey D's</name>
  <docid>542f252fac77c3b1015ed1646614478f</docid>
  <beg>439</beg>
  <end>448</end>
  <name>McD’s</name>
  <docid>NYT_ENG_20130625.0068</docid>
  <beg>4290</beg>
  <end>4294</end>
  <name>McDonald's Corp.</name>
  <docid>69735cd30f0fead03ae6b6f031588271</docid>
  <beg>649</beg>
  <end>664</end>
  <name>MCD</name>
  <docid>da5778f5d9c2b0939ea206a85518ba87</docid>
  <beg>495</beg>
  <end>497</end>
  <enttype>org</enttype>
  <slot0>org:employees_or_members</slot0>
  <slot1>per:date_of_birth</slot1>
</query>

<query id="CSSF15_ENG_1781079fa5">
  <name>McD’s</name>
  <docid>NYT_ENG_20130625.0068</docid>
  <beg>4290</beg>
  <end>4294</end>
  <enttype>org</enttype>
  <slot0>org:employees_or_members</slot0>
  <slot1>per:date_of_birth</slot1>
</query>

<query id="CSSF15_ENG_178f4789a5">
  <name>McDonald's</name>
  <docid>5637c4092dffaf2b94448f365f2f6fb3</docid>
  <beg>403</beg>
  <end>412</end>
  <enttype>org</enttype>
  <slot0>org:employees_or_members</slot0>
  <slot1>per:date_of_birth</slot1>
</query>

<query id="CSSF15_ENG_23d20fa825">
  <name>McDonald's Corp.</name>
  <docid>69735cd30f0fead03ae6b6f031588271</docid>
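
The decomposition shown on this slide turns one multi-entrypoint query into one single-entrypoint query per mention, each keeping the same entity type and slots. The following is a rough sketch using Python's standard xml.etree.ElementTree, assuming the flat name/docid/beg/end layout printed on the slide. The function name decompose and the index-based ID suffix are hypothetical; the slide's decomposed IDs (such as CSSF15_ENG_1781079fa5) use a suffix whose derivation is not given here, so it is not reproduced.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's query (two of the five entrypoints).
QUERY_XML = """
<query id="CS15_ENG_0954">
  <name>McDonald's</name>
  <docid>5637c4092dffaf2b94448f365f2f6fb3</docid>
  <beg>403</beg>
  <end>412</end>
  <name>MCD</name>
  <docid>da5778f5d9c2b0939ea206a85518ba87</docid>
  <beg>495</beg>
  <end>497</end>
  <enttype>org</enttype>
  <slot0>org:employees_or_members</slot0>
  <slot1>per:date_of_birth</slot1>
</query>
"""


def decompose(query_xml):
    """Split a multi-entrypoint query into one dict per entrypoint."""
    root = ET.fromstring(query_xml)
    # Entrypoint fields repeat in parallel, so the i-th of each belong together.
    names = [e.text for e in root.findall("name")]
    docids = [e.text for e in root.findall("docid")]
    begs = [int(e.text) for e in root.findall("beg")]
    ends = [int(e.text) for e in root.findall("end")]
    shared = {tag: root.findtext(tag) for tag in ("enttype", "slot0", "slot1")}

    queries = []
    for i, (name, docid, beg, end) in enumerate(zip(names, docids, begs, ends)):
        # Hypothetical ID scheme: parent ID plus the entrypoint index.
        query = {"id": f"{root.get('id')}_{i}", "name": name,
                 "docid": docid, "beg": beg, "end": end}
        query.update(shared)
        queries.append(query)
    return queries


single_queries = decompose(QUERY_XML)
for q in single_queries:
    print(q["id"], q["name"], q["docid"], q["beg"], q["end"], q["slot0"], q["slot1"])
```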

slide-12
SLIDE 12

More Sample 2015 Evaluation Queries

Query Entity    First Relation               Second Relation
Sun Bo          per:date_of_birth
Sun Bo          per:employee_or_member_of    org:students
Sun Bo          per:employee_or_member_of    org:country_of_headquarters
Chattanooga     gpe:residents_of_city        per:city_of_birth
Facebook        org:subsidiaries             org:alternate_names
TECO Energy     org:top_members_employees    per:religion
Sally Ride      per:spouse                   per:title

slide-13
SLIDE 13

Slot Counts in CS Queries

[Bar chart: number of CS queries per slot (axis 0 to 140), one entry for each gpe:, org:, and per: slot, with separate bars for Hop 0 and Hop 1.]

slide-14
SLIDE 14
slide-15
SLIDE 15

Slot Filling Variant

slide-16
SLIDE 16

Slot Filling Variant

  • Participants receive evaluation queries
  • Run slot filling system to find fills for first query slot
  • Run slot filling system again, using each first round slot fill as starting point for second query slot
  • Result is set of slot fills identical to applying evaluation queries to KB
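
The two-round procedure above can be sketched as follows. This is an illustration only: fill_slot stands in for a participant's slot filling system and is a hypothetical callable, TOY_FILLS is made-up lookup data modelled on the Simpsons example, and the output tuples are not the official submission format.

```python
def run_two_rounds(query, fill_slot):
    """Run a slot filler on slot0, then on slot1 for each first-round fill.

    fill_slot(name, slot) -> list of fill strings (hypothetical interface).
    Returns (query_id, hop, slot, fill) tuples.
    """
    responses = []
    hop0_fills = fill_slot(query["name"], query["slot0"])
    for fill in hop0_fills:
        responses.append((query["id"], 0, query["slot0"], fill))

    slot1 = query.get("slot1")
    if slot1:
        # Each first-round fill becomes the starting point of a second-round query.
        for fill in hop0_fills:
            for second_fill in fill_slot(fill, slot1):
                responses.append((query["id"], 1, slot1, second_fill))
    return responses


# Toy slot filler backed by a lookup table (made-up data for illustration).
TOY_FILLS = {
    ("Marge Simpson", "per:children"): ["Bart Simpson", "Lisa"],
    ("Lisa", "per:schools_attended"): ["Springfield Elementary"],
}


def toy_fill_slot(name, slot):
    return TOY_FILLS.get((name, slot), [])


query = {"id": "Simpsons_001", "name": "Marge Simpson",
         "slot0": "per:children", "slot1": "per:schools_attended"}
responses = run_two_rounds(query, toy_fill_slot)
for response in responses:
    print(response)
```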

slide-17
SLIDE 17

When Lisa's mother Marge Simpson went to a weekend getaway at Rancho Relaxo, the movie The Happy Little Elves Meet Fuzzy Snuggleduck was one of the R-rated european adult movies available on their cable channels.

The waiting is over! The BBC today revealed that the companion for the next series of Doctor Who will be Marge Simpson. 34 year old Marge will star alongside new Doctor Matt Smith when the TARDIS returns to BBC One in spring 2010. Marge's son Bart Simpson has said "Doctor Who is cool,

<query id="Simpsons_001">
  <name>Marge Simpson</name>
  <slot0>per:children</slot0>
  <slot1>per:schools_attended</slot1>
</query>

Simpsons_001 per:children Bart Simpson
Simpsons_001 per:children Lisa

slide-18
SLIDE 18

<query id="Simpsons_001_abc">
  <name>Bart Simpson</name>
  <slot0>per:schools_attended</slot0>
</query>

<query id="Simpsons_001_def">
  <name>Lisa</name>
  <slot0>per:schools_attended</slot0>
</query>

Simpsons_001 per:children Bart Simpson
Simpsons_001 per:children Lisa

slide-19
SLIDE 19

<query id="Simpsons_001_abc">
  <name>Bart Simpson</name>
  <slot0>per:schools_attended</slot0>
</query>

<query id="Simpsons_001_def">
  <name>Lisa</name>
  <slot0>per:schools_attended</slot0>
</query>

After two years in the academic quagmire of Springfield Elementary, Lisa finally has a teacher that she connects with. But she soon learns that the problem with being middle-class is that

Simpsons_001_def per:schools_attended Springfield Elementary

slide-20
SLIDE 20

How Slot Filling Variant is Different

From Previous Years’ Slot Filling Tasks:

  • Single information need
  • Inverse slots

From Cold Start Knowledge Base variant:

  • Perfect (first round) query name mentions

  • Cross-document coreference resolution
slide-21
SLIDE 21

Entity Discovery Variant

  • Build a knowledge base with no relations
  • Entities
  • Entity types
  • Entity mentions
  • Direct mapping onto TEDL submission format

  • Scored by TEDL scorer
slide-22
SLIDE 22

2015 Cold Start Document Collection

<doc id="05cb395b273ec302895b7fd6982eddae">
<headline>
</headline>
<post author="tom" datetime="2006-06-16T17:24:00" id="p1">
well soon i will be goignto sandeigo lookin gforwar do going to see friends again on vaction yea
</post>
<post author="Orchids" datetime="2006-06-18T22:37:00" id="p2">
Terrific! I love San Diego!
<img src="http://www.christianforums.com/images/smilies/clap.gif"/>
<img src="http://www1.christianforums.com/attachment.php?attachmentid=64057"/>
</post>
<post author="VioletAngel" datetime="2006-06-20T23:31:00" id="p3">
San Diego is a great place!!
<img src="http://www.christianforums.com/images/smilies/smile.gif"/>
</post>
<post author="JPPT1974" datetime="2006-07-05T20:21:00" id="p4">
I heard that it is good as my parents loved it!
</post>
<post author="bgoddenia" datetime="2006-07-19T01:35:00" id="p5">
cool
</post>
<post author="JPPT1974" datetime="2006-08-07T20:50:00" id="p6">
Good for you my friend!
</post>
</doc>

#Docs     Type
8,938     Newswire
40,186    Discussion Forum
49,124    Total

slide-23
SLIDE 23

Assessment

  • All CSKB and CSSF responses are pooled and anonymized
  • Responses are grouped by original (multi-entrypoint) query
  • LDC assesses correctness of each response
  • LDC also groups responses into equivalence classes (representing the set of different response strings for a given entity)
  • All hop 0 responses to a given query are assessed together. All hop 1 responses stemming from a single equivalence class that was judged CORRECT are assessed together

slide-24
SLIDE 24

Scoring: Metrics

  • Correct = total number of system output responses judged correct
  • System = total number of system output responses
  • Reference = number of single-valued slots with a correct response + number of equivalence classes for all list-valued slots
  • Recall = Correct / Reference
  • Precision = Correct / System
  • F1 = 2 * Precision * Recall / (Precision + Recall)
  • Average over all queries
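
The three formulas above translate directly into code. The following is a small sketch; the function name, the example counts, and the convention of returning 0.0 when a denominator is zero are assumptions for illustration, not taken from the official scorer.

```python
def precision_recall_f1(correct, system, reference):
    """Compute Precision, Recall, and F1 from the three counts on the slide.

    Returning 0.0 for an empty denominator is an assumed convention.
    """
    precision = correct / system if system else 0.0
    recall = correct / reference if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up counts: 3 correct responses out of 4 returned, 6 in the reference.
precision, recall, f1 = precision_recall_f1(correct=3, system=4, reference=6)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```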
slide-25
SLIDE 25

Scoring: Averaging

<query id="CS15_ENG_0954">
  <name>McDonald's</name>
  <docid>5637c4092dffaf2b94448f365f2f6fb3</docid>
  <beg>403</beg>
  <end>412</end>
  <name>Mickey D's</name>
  <docid>542f252fac77c3b1015ed1646614478f</docid>
  <beg>439</beg>
  <end>448</end>
  <name>McD’s</name>
  <docid>NYT_ENG_20130625.0068</docid>
  <beg>4290</beg>
  <end>4294</end>
  <name>McDonald's Corp.</name>
  <docid>69735cd30f0fead03ae6b6f031588271</docid>
  <beg>649</beg>
  <end>664</end>
  <name>MCD</name>
  <docid>da5778f5d9c2b0939ea206a85518ba87</docid>
  <beg>495</beg>
  <end>497</end>
  <enttype>org</enttype>
  <slot0>org:employees_or_members</slot0>
  <slot1>per:date_of_birth</slot1>
</query>

Multiple Entrypoints

  • average scores
  • use highest score

Multiple Hops

  • micro-average scores
  • macro-average scores
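
One possible reading of these options, sketched under stated assumptions: across hops, a micro-average pools the Correct/System/Reference counts before computing F1, while a macro-average computes F1 per hop and then averages; across entrypoints, the per-entrypoint scores are either averaged or the highest is kept. This is an interpretation of the slide, not the official scorer. The function names and all counts below are invented for illustration.

```python
from statistics import mean


def f1(correct, system, reference):
    """F1 from counts; 0.0 on an empty denominator (assumed convention)."""
    p = correct / system if system else 0.0
    r = correct / reference if reference else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0


def micro_f1(hops):
    """Pool (correct, system, reference) counts over hops, then compute F1."""
    correct = sum(c for c, _, _ in hops)
    system = sum(s for _, s, _ in hops)
    reference = sum(r for _, _, r in hops)
    return f1(correct, system, reference)


def macro_f1(hops):
    """Compute F1 per hop, then take the unweighted mean."""
    return mean(f1(*hop) for hop in hops)


def combine_entrypoints(scores, mode="max"):
    """Combine per-entrypoint scores by highest score or by mean."""
    return max(scores) if mode == "max" else mean(scores)


# Invented counts: per entrypoint, [(hop 0 counts), (hop 1 counts)].
entrypoints = {
    "McDonald's": [(2, 4, 4), (1, 2, 5)],
    "MCD": [(1, 4, 4), (0, 1, 5)],
}

max_micro = combine_entrypoints([micro_f1(h) for h in entrypoints.values()], "max")
mean_macro = combine_entrypoints([macro_f1(h) for h in entrypoints.values()], "mean")
print(f"max-micro F1 = {max_micro:.4f}")
print(f"mean-macro F1 = {mean_macro:.4f}")
```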

slide-26
SLIDE 26

Max-Micro, LDC Response, Hop 0

[Bar chart, MAX-MICRO LDC-RESPONSE - Hop 0 (axis 0.1 to 1): Precision, Recall, and F1 per run. Runs in the order shown: LDC, KB_BBN1, SF_Stanford3, KB_hltcoe5, KB_Stanford2, SF_UGENT_IBCN2, SF_CIS3, KB_NYU1, SF_UMass_IESL1, SF_UTAustin1, SF_BUPT_PRIS1, KB_UMass_IESL1, SF_UWashington1, SF_MSIIPL_THU1, KB_SAFT_ISI1, SF_ZJU_DCD_SF1, SF_IITD_20151, SF_StaRAI20151, KB_ICTCAS_OKN3, SF_CMUML1.]

slide-27
SLIDE 27

Max-Micro, LDC Response, Hop 1

[Bar chart, MAX-MICRO LDC-RESPONSE - Hop 1 (axis 0.1 to 1): Precision, Recall, and F1 per run. Runs in the order shown: LDC, SF_Stanford3, KB_hltcoe3, KB_BBN2, KB_Stanford2, SF_UWashington1, KB_NYU2, SF_CIS4, SF_UGENT_IBCN4, SF_BUPT_PRIS1, KB_UMass_IESL4, SF_UMass_IESL1, KB_SAFT_ISI1, SF_IITD_20151, SF_MSIIPL_THU2, SF_ZJU_DCD_SF1, KB_ICTCAS_OKN3, SF_CMUML3, SF_StaRAI20152, SF_UTAustin1.]

slide-28
SLIDE 28

Max-Micro, LDC Response

[Bar chart, MAX-MICRO LDC-RESPONSE - Hop ALL (axis 0.1 to 1): Precision, Recall, and F1 per run. Runs in the order shown: LDC, SF_Stanford3, KB_BBN2, KB_hltcoe3, SF_UGENT_IBCN1, SF_CIS3, KB_Stanford2, KB_NYU2, SF_UWashington1, SF_BUPT_PRIS1, SF_UMass_IESL1, SF_UTAustin1, KB_UMass_IESL4, KB_SAFT_ISI1, SF_MSIIPL_THU1, SF_ZJU_DCD_SF1, SF_IITD_20151, SF_StaRAI20151, KB_ICTCAS_OKN3, SF_CMUML2.]

slide-29
SLIDE 29

Max-Micro, SunBo, Hop 0

[Bar chart, MAX-MICRO SUNBO - Hop 0 (axis 0.1 to 1): Precision, Recall, and F1 per run. Runs in the order shown: SF_Stanford2, KB_SAFT_ISI1, KB_Stanford1, KB_hltcoe3, KB_NYU1, KB_BBN2, SF_CIS3, SF_UWashington1, SF_BUPT_PRIS1, SF_UGENT_IBCN3, LDC, SF_UTAustin1, KB_UMass_IESL2, SF_ZJU_DCD_SF2, SF_UMass_IESL1, KB_ICTCAS_OKN3, SF_CMUML3, SF_IITD_20151, SF_MSIIPL_THU2, SF_StaRAI20155.]

slide-30
SLIDE 30

Mean-Macro, LDC Response

[Bar chart, MEAN-MACRO LDC-RESPONSE - Hop ALL (axis 0.1 to 1): F1 per run. Runs in the order shown: LDC, SF_Stanford3, KB_BBN4, KB_Stanford2, SF_UGENT_IBCN1, SF_CIS3, KB_hltcoe4, SF_UMass_IESL2, SF_BUPT_PRIS1, KB_UMass_IESL2, KB_NYU4, SF_UWashington1, SF_UTAustin1, SF_MSIIPL_THU1, SF_ZJU_DCD_SF2, SF_IITD_20151, KB_SAFT_ISI1, SF_StaRAI20151, KB_ICTCAS_OKN3, SF_CMUML2.]

slide-31
SLIDE 31

NIL Detection

[Bar chart, NIL-DETECTION - Hop ALL (axis 0.1 to 1): Precision, Recall, and F1 per run. Runs in the order shown: LDC, KB_BBN4, SF_Stanford1, KB_hltcoe5, SF_CIS2, SF_UGENT_IBCN4, SF_UMass_IESL3, KB_Stanford2, SF_BUPT_PRIS3, KB_UMass_IESL1, KB_NYU5, SF_UWashington1, SF_UTAustin1, KB_SAFT_ISI1, SF_IITD_20151, SF_MSIIPL_THU4, SF_StaRAI20152, KB_ICTCAS_OKN3, SF_CMUML3, SF_ZJU_DCD_SF1.]

slide-32
SLIDE 32

Relationship between Entry Points, Fills, and F1

[Chart: Entry points, Fills, and Mean Macro F1 per KB run (axes 0.0000 to 0.3000 and 0 to 18000). Runs in the order shown: BBN4, BBN1, BBN2, BBN3, Stanford2, BBN5, hltcoe4, hltcoe5, hltcoe2, hltcoe3, hltcoe1, Stanford1, UMass_IESL2, NYU4, UMass_IESL1, NYU1, UMass_IESL4, NYU5, NYU2, UMass_IESL3, NYU3, SAFT_ISI1, ICTCAS_OKN3, ICTCAS_OKN1, ICTCAS_OKN4, ICTCAS_OKN2.]

slide-33
SLIDE 33

Relationship between Entry Points, Fills and Recall

[Chart: Entry points, Fills, and CSSF-Micro-R per KB run (axes 0.0000 to 0.3500 and 0 to 18000). Runs in the order shown: BBN4, BBN3, BBN1, Stanford2, BBN5, BBN2, hltcoe4, hltcoe5, hltcoe2, hltcoe3, hltcoe1, UMass_IESL2, UMass_IESL1, NYU4, NYU1, NYU2, NYU5, NYU3, Stanford1, UMass_IESL4, UMass_IESL3, SAFT_ISI1, ICTCAS_OKN3, ICTCAS_OKN1, ICTCAS_OKN4, ICTCAS_OKN2.]

slide-34
SLIDE 34

Entity Discovery Results: NER + Entity Type

[Bar chart (axis 0.1 to 1): Precision, Recall, and F1 per run. Runs in the order shown: KB_BBN5, EDL_lodie1, KB_hltcoe1, KB_NYU4, EDL_ZJU_DCD_EDL1, KB_UMass_IESL1, KB_Stanford2, EDL_Stanford1, KB_ICTCAS_OKN2, KB_SAFT_ISI1.]

slide-35
SLIDE 35

Entity Discovery Results: NER + Entity Type + Clustering (CEAF)

[Bar chart (axis 0.1 to 1): Precision, Recall, and F1 per run. Runs in the order shown: KB_BBN1, KB_hltcoe4, KB_NYU4, KB_UMass_IESL1, EDL_lodie1, KB_Stanford2, EDL_Stanford1, EDL_ZJU_DCD_EDL1, KB_ICTCAS_OKN3, KB_SAFT_ISI1.]

slide-36
SLIDE 36

2015 Cold Start Approaches and Themes

  • Meaningful confidence values, derived from
      • statistical systems
      • confidence on patterns
      • weighting number of rules to find relation
      • regression models
      • cosine similarity
  • Heavy use of outside tools for some phases
      • CoreNLP, NLTK, Factorie, ReVerb
  • More reliance on in-house tools for cross-document coreference resolution, relation extraction

slide-37
SLIDE 37

2015 Cold Start Approaches and Themes, cont.

  • Wide range of (mostly machine learning) approaches to extraction

  • Patterns
  • Universal schema patterns
  • Distant supervision
  • Logistic classification
  • Bootstrapping
  • Gradient descent
  • Maxent
  • SVMs
  • CRFs
  • Bagging
  • Open IE
  • RNNs, CNNs (nowhere is safe)
slide-38
SLIDE 38

2015 Cold Start Approaches and Themes, cont.

  • Commonly used resources
  • Freebase
  • Wikipedia
  • GeoNames
  • WordNet
  • Post-Processing
  • Location normalization
  • Location inference
  • Date normalization
  • Sanity checks
  • Confidence values
  • Normative values
  • Knowledge base
slide-39
SLIDE 39

Next: