SLIDE 1

Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution

Ting Liu†, Yiming Cui‡, Qingyu Yin†, Weinan Zhang†, Shijin Wang‡ and Guoping Hu‡

†Research Center for Social Computing and Information Retrieval,

Harbin Institute of Technology, Harbin, China

‡iFLYTEK Research, Beijing, China

SLIDE 2

Zero Pronoun (ZP)

CN: 小明 吃 了 一个 苹果 , <ZP> 非常 甜
EN: Xiaoming eats an apple, it is very sweet

(In the English sentence, "it" is an overt pronoun; in the Chinese sentence, <ZP> marks a zero pronoun. "小明"/"Xiaoming" is the candidate antecedent in each.)

SLIDE 3

ZP Proportion

[1] Kim Y J. Subject/object drop in the acquisition of Korean: A cross-linguistic comparison. Journal of East Asian Linguistics, 2000, 9(4): 325-351.
[2] Zhao S, Ng H T. Identification and Resolution of Chinese Zero Pronouns: A Machine Learning Approach. EMNLP-CoNLL 2007.

[Charts: proportions of overt vs. zero pronouns in Chinese and in English]

SLIDE 4

Zero Pronoun Resolution (ZPR)

CN: 小明 吃 了 一个 苹果 , <ZP> 非常 甜
EN: Xiaoming eats an apple, it is very sweet

(Overt pronoun resolution links "it" to "Xiaoming"; zero pronoun resolution links <ZP> to "小明".)

SLIDE 5

Challenges of ZPR

  • No overt pronoun for indication
  • No information for the positions of ZPs
  • No type/surface information of ZPs
  • Feature engineering
    • 19 hand-crafted features for the ZP
    • 18 hand-crafted features for the antecedent

Chen Chen and Vincent Ng. Chinese Zero Pronoun Resolution with Deep Neural Networks. ACL 2016.

SLIDE 6

Solutions

  • No overt pronoun for indication
    • Considering all possible positions for ZP identification
    • Classifying ZPs into anaphoric ZPs (AZPs) and non-AZPs
    • Modeling the semantics of ZPs and antecedents
  • Feature engineering
    • Automatically learning feature representations
    • Deep learning approaches for the modeling
  • More labeled data for training

(The first two groups are addressed by most existing work; obtaining more labeled data for training is the focus of this paper.)

SLIDE 7

How to Obtain Large-scale Training Data?

  • Manual annotation
    • Labor-consuming
    • Hard to call "large-scale"
  • Automatic generation
    • Easy to obtain
    • Large-scale
    • Yields pseudo training data

SLIDE 8

What is Actual Training Data?

  • Sample Training Data in OntoNotes 5.0
  • Single-word (In Chinese) antecedent
  • Multi-word antecedent

CN: [警方] 怀疑 这是 一起 黑枪 案件 , zp1 将 枪械 交送 市里 zp2 以 清理 案情 。
EN: [The police] suspected that this is a criminal case about illegal guns, zp1 brought the guns to the city zp2 to deal with the case.

CN: 这次 [近 50 年 来 印度 发生 的 最 强烈 地震] 震级 强 , zp 波及 范围 广 , 印度 邻国 如 尼泊尔 也 受到 了 影响 。
EN: [The earthquake, the strongest to occur in India in the recent 50 years,] has a high magnitude, zp influences a large range of areas, and India's neighboring countries such as Nepal are also affected.


[Chart: proportion of single-word vs. multi-word antecedents, by number of words in the antecedent]

SLIDE 9

How to Generate Pseudo Training Data?

  • Collecting large-scale news documents that are relevant (or homogeneous in some sense) to the OntoNotes 5.0 data
  • Given a document D, a word is randomly selected as the answer A if
    • it is either a noun or a pronoun, and
    • it appears at least twice in the document
  • The sentence containing A is defined as the query Q, in which the answer A is replaced by a specific symbol "<blank>"
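The generation procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the document arrives pre-tokenized and POS-tagged as a list of sentences of (word, pos) pairs, and it assumes (as in typical cloze-style setups) that the query sentence is excluded from the document context.

```python
import random
from collections import Counter

def generate_pseudo_samples(doc_sents, num_samples=1, seed=0):
    """Generate cloze-style pseudo training samples from one document.

    doc_sents: list of sentences; each sentence is a list of (word, pos)
    pairs, with pos tags like "NOUN" or "PRON" (tag set is an assumption).
    Returns (document, query, answer) triples.
    """
    rng = random.Random(seed)
    freq = Counter(w for sent in doc_sents for (w, _) in sent)
    # Candidate answers: nouns or pronouns appearing at least twice.
    candidates = [
        (si, wi)
        for si, sent in enumerate(doc_sents)
        for wi, (w, pos) in enumerate(sent)
        if pos in ("NOUN", "PRON") and freq[w] >= 2
    ]
    samples = []
    for si, wi in rng.sample(candidates, min(num_samples, len(candidates))):
        answer = doc_sents[si][wi][0]
        # Query: the answer's sentence with the answer blanked out.
        query = [w if j != wi else "<blank>"
                 for j, (w, _) in enumerate(doc_sents[si])]
        # Document: the remaining sentences, flattened into words.
        document = [w for sj, sent in enumerate(doc_sents) if sj != si
                    for (w, _) in sent]
        samples.append((document, query, answer))
    return samples
```

Running it on a toy two-sentence document in which only one word repeats yields a single triple with that word as the answer and "<blank>" in the query.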

SLIDE 10

SLIDE 11

Zero Pronoun Resolution (ZPR)

  • A training sample can be represented as the triple ⟨D, Q, A⟩
  • The zero pronoun resolution task is thus defined as estimating P(A | D, Q)

             D           Q                                A
  Pseudo     Document    Query                            Answer
  Actual     Context     A sentence that contains a ZP    An antecedent
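The correspondence above means an annotated ZPR instance can be cast into the same ⟨D, Q, A⟩ format as the pseudo data. A hypothetical sketch (function and argument names are my own, not from the paper):

```python
def zpr_to_cloze(context_sents, zp_sent, zp_index, antecedent_head):
    """Cast an annotated ZPR instance into the cloze triple <D, Q, A>.

    D: the surrounding context sentences, flattened into one word list.
    Q: the ZP sentence with the zero pronoun slot replaced by <blank>.
    A: the antecedent (its head word, for multi-word antecedents).
    """
    document = [w for sent in context_sents for w in sent]
    query = list(zp_sent)
    query[zp_index] = "<blank>"
    return document, query, antecedent_head
```

With this mapping, the same model can consume pseudo samples and actual OntoNotes samples interchangeably.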

SLIDE 12

Attention-based NN Model for ZPR

  • Single-word antecedent: matching the single word
  • Multi-word antecedent: matching the head word

Two-step Training

Pre-training on pseudo data, then fine-tuning on actual data (i.e., general training followed by domain training).
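The core of the attention-based matching can be illustrated with plain dot-product attention: score each document word against a query representation and normalize with softmax, predicting the highest-scoring word as the answer (the antecedent, or its head word). This is a minimal sketch of the general idea, not the paper's exact architecture, and it assumes word vectors are already given:

```python
import math

def attention_scores(doc_vecs, query_vec):
    """Dot-product attention over document words given a query vector.

    doc_vecs: list of word vectors (lists of floats) for document words.
    query_vec: a single vector representing the query (the ZP sentence
    with a <blank>). Returns a softmax-normalized score per word.
    """
    scores = [sum(d * q for d, q in zip(vec, query_vec)) for vec in doc_vecs]
    # Numerically stable softmax normalization.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In a real system the vectors would come from recurrent encoders and training would proceed in the two steps above; here the snippet only shows how the matching score is computed.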

SLIDE 13

Experimental Data

  • OntoNotes Release 5.0 from CoNLL-2012
  • Domains: Broadcast News (BN), Newswire (NW), Broadcast Conversations (BC), Telephone Conversations (TC), Web Blogs (WB), Magazines (MZ)

SLIDE 14

Overall Performance

  • F-score
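The F-score here is the standard harmonic mean of precision and recall; for ZPR, precision divides correct resolutions by the number of ZPs the system resolved, and recall divides them by the number of annotated anaphoric ZPs. A minimal helper:

```python
def f_score(num_correct, num_predicted, num_gold):
    """F1 = harmonic mean of precision and recall.

    num_correct: ZPs resolved to a correct antecedent.
    num_predicted: ZPs the system attempted to resolve.
    num_gold: annotated anaphoric ZPs in the evaluation data.
    """
    if num_predicted == 0 or num_gold == 0:
        return 0.0
    p = num_correct / num_predicted
    r = num_correct / num_gold
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)
```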

SLIDE 15

Effect of UNK Processing

SLIDE 16

Effect of Domain Adaptation

SLIDE 17

Error Analysis

  • The impact of UNK words
  • Long distance between ZPs and antecedents

CN: zp unk1 unk2 顶 , 将 unk3 和 unk4 的 美景 尽收眼底 。
EN: zp successfully [climbed]unk1 the peak of [Taiping Mountain]unk2, to have a panoramic view of the beauty of [Hong Kong Island]unk3 and [Victoria Harbour]unk4.

CN: [我] 帮 不 了 那个 人 … (more than 30 words) … 那 天 结束 后 , zp 回到 家 中 。
EN: [I] can't help that guy … (more than 30 words) … After that day, zp returned home.

SLIDE 18

Conclusion

  • Generating and exploiting pseudo training data for ZPR
  • Inspired by cloze-style reading comprehension
  • Two-step training of the ZPR model to make use of the large-scale pseudo training data
  • A new state-of-the-art approach on the Chinese ZPR task

SLIDE 19

Thanks! Questions and advice?