University of Florida DSR Lab System for KBP Slot Filler Validation

University of Florida DSR Lab System for KBP Slot Filler Validation 2015
Miguel Rodriguez, Sean Goldberg, Daisy Wang

Slot Filler Validation
[Figure: example for the slot gpe:schools_atended. Items shown: Bristol Central High School, New England Patriots, University of Florida, University of Connecticut, ABC News, Tim Tebow]

Slot Filler Validation
[Figure: the same gpe:schools_atended example (Tim Tebow) with a Truth column]
Bristol Central High School: T
New England Patriots: F
University of Florida: T
University of Connecticut: F
ABC News: F

Slot Filler Validation
[Figure: example for the slot org:subsidiaries with a Truth column]
Survey Research Center: T
Florida Museum of Natural History: T
Smithsonian Tropical Research Institute: F

Slot Filler Validation - Classification
● Slot Filler Validation is a binary classification task
  ○ Given a set of queries consisting of tuples of the form <entity, slot>
  ○ And a set of Slot Fillers for each query
  ○ Determine if a slot filler is True or False
● A CSSF output is the output of such a classifier
  ○ Ideal for ensemble classification
  ○ Aggregate the output of multiple classifiers
  ○ Outperform the original ones
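The task definition above can be pictured as a small data structure. The following is a minimal sketch, not the system's actual representation: the class names, the `validate` helper, the confidence values, and the second filler ("senator") are all hypothetical, and the threshold rule merely stands in for a real classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    entity: str
    slot: str


@dataclass(frozen=True)
class SlotFiller:
    query: Query
    filler: str
    confidence: float  # hypothetical confidence reported by the submitting system


def validate(candidate: SlotFiller, threshold: float = 0.5) -> bool:
    """Toy stand-in for a validator: accept a filler whose confidence reaches the threshold."""
    return candidate.confidence >= threshold


query = Query(entity="Marion Hammer", slot="per:title")
candidates = [
    SlotFiller(query, "president", 0.9),  # illustrative confidence
    SlotFiller(query, "senator", 0.2),    # hypothetical wrong filler
]

# One True/False decision per slot filler: the binary classification output.
labels = {c.filler: validate(c) for c in candidates}
```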

Ensemble Classification
● Ensemble methods have two main parts
  ○ Inducer: selects the training data for each individual classifier
  ○ Combiner: takes the output of each classifier and combines them to formulate a final prediction

Stacked Ensemble
Meta-level classifier that takes the output of other models as input and estimates their weights.
Vidhoon Viswanathan, Nazneen Fatema Rajani, Yinon Bentor, and Raymond J. Mooney. 2015. Stacked ensembles of information extractors for knowledge-base population. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics.
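To make the idea of a meta-level classifier concrete, here is a minimal sketch of stacking, assuming each slot filler is described only by the confidences that three hypothetical base systems assigned to it (0.0 when a system did not return it). The logistic-regression learner, the training data, and all names are illustrative assumptions; the features used by the actual system are listed later in the deck.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_stacker(rows, labels, epochs=1000, lr=0.5):
    """Fit logistic-regression weights over base-system confidences with plain SGD."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the score
            bias -= lr * err
            weights = [w - lr * err * v for w, v in zip(weights, x)]
    return weights, bias


def predict(weights, bias, x):
    """Return True when the stacked model accepts the slot filler."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x))) >= 0.5


# Hypothetical training data: one row per slot filler, one column per base system.
train_rows = [
    [0.9, 0.8, 0.1],
    [0.8, 0.0, 0.9],
    [0.1, 0.2, 0.9],
    [0.0, 0.1, 0.8],
    [0.7, 0.9, 0.0],
    [0.2, 0.0, 0.7],
]
train_labels = [1, 1, 0, 0, 1, 0]  # assessed as True (1) or False (0)

weights, bias = train_stacker(train_rows, train_labels)
predictions = [predict(weights, bias, x) for x in train_rows]
```

The learned weights play the role of per-system trust: a system whose confidences track the assessments receives a large positive weight, which is why labeled data from previous years is needed.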

Stacked Ensemble
● Requires labeled data
  ○ Available from 2013 and 2014 SF and SFV
● Training Strategy
  ○ Learn from previous year performance
  ○ 2013-2014: 7 teams
  ○ 2014: 12 teams
● All runs that cannot be fit into the classifier are discarded!
  ○ Leave out extra evidence …
  ○ From potentially well ranked systems

Stacked Ensemble - not enough!

Rank  TEAM ID          0-HOP F1  1-HOP F1  ALL F1
9     SFV2015_SF_03_1  0.3457    0.1154    0.2718
14    SFV2015_KB_16_2  0.2633    0.1655    0.2247
16    SFV2015_SF_18_1  0.292     0.0972    0.2245
24    SFV2015_SF_08_4  0.2669    0.0976    0.2102
31    SFV2015_SF_02_1  0.1883    0.1299    0.1649
34    SFV2015_SF_06_1  0.2351    0         0.1595
39    SFV2015_KB_10_1  0.1834    0.0952    0.1474
45    SFV2015_KB_09_1  0.0965    0.0791    0.0899
47    SFV2015_SF_13_2  0.1225    0         0.0892
56    SFV2015_SF_07_1  0.0512    0         0.0353
63    SFV2015_KB_11_1  0.019     0         0.0121
64    SFV2015_SF_17_1  0.019     0         0.0121

F1 score ranking of 2014-2015 teams.

Consensus Maximization Fusion Augment stacked ensemble model by adding more meta-classifiers

Consensus Maximization Fusion
Add the runs that cannot be fit into the stacked ensemble method. We treat these runs as 2-class clusters.

Consensus Maximization Fusion
Jing Gao, Feng Liang, Wei Fan, Yizhou Sun, and Jiawei Han. 2009. Graph-based consensus maximization among multiple supervised and unsupervised models. In Advances in Neural Information Processing Systems, pp. 585–593.

Consensus Max. Fusion - Example
● Consider the following queries
  ○ O1 = (Marion Hammer, per:title, president)
  ○ O2 = (Dublin, gpe:headquarters_in_city, trinity college)

Consensus Max. Fusion - Example
Meta-Classifiers: 6 Yes – 0 No; Clusters: 46 Yes – 16 No
Meta-Classifiers: 0 Yes – 6 No; Clusters: 34 Yes – 28 No

Consensus Max. Fusion
● Combine the outputs of multiple supervised and unsupervised models for better classification.
● The predicted labels should agree with the base supervised models while adding unsupervised evidence.
● Model combination at the output level is needed in KBP applications, where there is no access to the individual extractors.
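The combination described above can be sketched with the iterative update from Gao et al. (2009), as best recalled here: every model splits the slot fillers into groups, group estimates are averaged from their members (with supervised groups pulled toward their predicted label by a weight `alpha`), and object estimates are averaged from their groups. This is a minimal sketch under stated assumptions, not the submitted system: the function name, the toy data, `alpha`, and the iteration count are hypothetical, and clusters are left unlabeled as in the paper, since the slides do not say how labels are attached to run clusters.

```python
def bgcm(memberships, group_labels, alpha=2.0, iterations=100):
    """Consensus maximization over a bipartite object/group graph (sketch).

    memberships[i]  : indices of the groups that object i belongs to
    group_labels[j] : class distribution [P(yes), P(no)] for a group produced by
                      a supervised model, or None for an unsupervised cluster
    Returns one class distribution per object.
    """
    n_classes = 2
    uniform = [1.0 / n_classes] * n_classes
    n_groups = len(group_labels)

    members = [[] for _ in range(n_groups)]
    for i, groups in enumerate(memberships):
        for j in groups:
            members[j].append(i)

    U = [uniform[:] for _ in memberships]     # object estimates
    Q = [uniform[:] for _ in range(n_groups)]  # group estimates
    for _ in range(iterations):
        for j in range(n_groups):
            label = group_labels[j]
            k = 0.0 if label is None else 1.0  # only supervised groups are constrained
            denom = len(members[j]) + alpha * k
            if denom == 0:
                continue  # empty unsupervised cluster: nothing to update
            Q[j] = [
                (sum(U[i][c] for i in members[j])
                 + (alpha * label[c] if label is not None else 0.0)) / denom
                for c in range(n_classes)
            ]
        for i, groups in enumerate(memberships):
            U[i] = [sum(Q[j][c] for j in groups) / len(groups) for c in range(n_classes)]
    return U


YES, NO = [1.0, 0.0], [0.0, 1.0]
# Groups 0/1: Yes/No output of one meta-classifier; groups 2-5: two runs as 2-class clusters.
group_labels = [YES, NO, None, None, None, None]
memberships = [
    [0, 2, 4],  # accepted by the meta-classifier
    [1, 2, 4],  # rejected, but clustered with the first object by both runs
    [1, 3, 5],
    [1, 3, 5],
]
consensus = bgcm(memberships, group_labels)
accepted = [u[0] > 0.5 for u in consensus]
```

In this toy input the second object ends up with a higher "yes" estimate than the last two, because it shares clusters with an accepted object, yet it stays below 0.5: the unsupervised evidence shifts the estimate without overriding the supervised label.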

Consensus Maximization Fusion Pipeline

Mapping
● Runs from teams that participated in previous years are mapped together and ranked using the corresponding assessments.
● 2015 runs are ranked based on the small assessment file provided for the task.
● The best run of each mapped team is then passed to the feature extraction module.
● All other runs are passed directly to BGCM (bipartite graph-based consensus maximization).
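The routing described in the mapping step can be sketched as follows, assuming runs are ranked by F1 against an assessment set and that each run is simply a set of slot fillers. The function names, run identifiers, and data are hypothetical; the real ranking uses the task's assessment files.

```python
def f1_score(predicted, gold):
    """F1 of a set of predicted fillers against the assessed-correct fillers."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def route_runs(runs, team_of, gold):
    """Split runs into those sent to feature extraction and those sent to BGCM.

    runs    : run id -> set of fillers returned by that run
    team_of : run id -> team id, only for teams mapped to a previous year
    gold    : fillers assessed as correct
    """
    best = {}  # team id -> (run id, F1) of its best run so far
    for run_id, fillers in runs.items():
        team = team_of.get(run_id)
        if team is None:
            continue  # unmapped team: the run goes straight to BGCM
        score = f1_score(fillers, gold)
        if team not in best or score > best[team][1]:
            best[team] = (run_id, score)
    to_features = {run_id for run_id, _ in best.values()}
    to_bgcm = set(runs) - to_features
    return to_features, to_bgcm


gold = {"a", "b", "c"}
runs = {"run_a1": {"a", "b"}, "run_a2": {"a", "x"}, "run_b1": {"c"}}
team_of = {"run_a1": "team_a", "run_a2": "team_a"}  # team_b has no earlier-year mapping
to_features, to_bgcm = route_runs(runs, team_of, gold)
```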

Feature Extraction
● Same as the SFV Stacked Ensemble System
  ○ Probabilities
  ○ Relation
  ○ Provenance

Post-processing
● Filter ensemble of all 0-hop queries
  ○ Enforce single-valued relations by selecting the one with highest probability
  ○ For every slot filler classified as true, select the provenance of the slot filler with highest probability.
● For every 1-hop query in the ensemble
  ○ Enforce its 0-hop result is in the ensemble
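The post-processing rules above can be sketched as three passes over the accepted candidates. This is a minimal sketch under assumptions: the `Candidate` record, its field names, the placeholder fillers and documents, and the choice of `per:city_of_birth` as the single-valued slot in the example are all illustrative, and a 1-hop candidate is assumed to carry a reference to the 0-hop answer it was derived from.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    query_id: str
    slot: str
    filler: str
    provenance: str
    probability: float
    parent: Optional[tuple] = None  # (query_id, slot, filler) of the 0-hop answer, for 1-hop


def postprocess(candidates, single_valued_slots):
    # 1. Keep the highest-probability provenance for each (query, slot, filler).
    best = {}
    for c in candidates:
        key = (c.query_id, c.slot, c.filler)
        if key not in best or c.probability > best[key].probability:
            best[key] = c

    # 2. For single-valued slots of 0-hop queries, keep only the most probable filler.
    top = {}
    for c in best.values():
        if c.parent is None and c.slot in single_valued_slots:
            key = (c.query_id, c.slot)
            if key not in top or c.probability > top[key].probability:
                top[key] = c
    kept = [
        c for c in best.values()
        if c.parent is not None
        or c.slot not in single_valued_slots
        or top[(c.query_id, c.slot)] is c
    ]

    # 3. Keep a 1-hop answer only if its 0-hop answer survived.
    zero_hop = {(c.query_id, c.slot, c.filler) for c in kept if c.parent is None}
    return [c for c in kept if c.parent is None or c.parent in zero_hop]


candidates = [
    Candidate("q1", "per:city_of_birth", "city_a", "doc1", 0.6),
    Candidate("q1", "per:city_of_birth", "city_b", "doc2", 0.8),
    Candidate("q1", "per:schools_attended", "school_a", "doc3", 0.7),
    Candidate("q1", "per:schools_attended", "school_a", "doc4", 0.9),
    Candidate("q1_1", "org:subsidiaries", "org_a", "doc5", 0.5,
              parent=("q1", "per:schools_attended", "school_a")),
    Candidate("q1_2", "gpe:headquarters_in_city", "org_b", "doc6", 0.5,
              parent=("q1", "per:city_of_birth", "city_a")),
]
result = postprocess(candidates, single_valued_slots={"per:city_of_birth"})
```

Here `org_b` is dropped because its 0-hop answer `city_a` lost to `city_b`, which is the consistency constraint the last bullet describes.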

Submitted Runs
● Run 1
  ○ 2013-2014: Meta-classifiers trained with samples from 7 teams.
  ○ BGCM: 6 meta-classifiers and 62 runs
● Run 2
  ○ 2014: Meta-classifiers trained with samples from 12 teams.
  ○ BGCM: 6 meta-classifiers and 57 runs
● Run 3
  ○ Use all meta-classifiers from Runs 1 and 2
  ○ BGCM: 12 meta-classifiers and 57 runs

Results - 2015 CSSF

Analysis Run 2 The majority of the slot fillers included in our best run come from unsupervised consensus

Analysis Run 2
● Answers come from unsupervised consensus
  ○ All supervised outputs classified them as negative
  ○ Not enough evidence
● As more unsupervised runs reach consensus, there are more correct than incorrect fillers.
● The recall of the system is improved

Analysis Run 2
● At least one stacked ensemble model classified the filler as positive.
● Supervised evidence helps improve precision.
● The higher the consensus with the unsupervised clusters, the better the system filters.

Questions?
