Direct Assessment
Yvette Graham
First Conference on Machine Translation (WMT), August 11, 2016
Direct Assessment (DA) I
- Consideration is being given to using DA alone next year
- Reasons:
  - High correlation between RR and DA
  - It seems we could get good clusters with (conservatively) half the annotation time
  - Computed as follows (see the sketch after this list):
    - RR requires 100 HITs per system submission at an average of 5 minutes per HIT, so 500 minutes, roughly 8 hours
    - DA needs around 500 assessed translations per system (maybe more in some cases), which takes about 2.5 hours at half an hour per HIT
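The arithmetic behind these figures, as a minimal sketch. The 100-translations-per-DA-HIT size is inferred from the half-hour-per-HIT and 2.5-hour totals quoted above, not stated on the slide.

```python
# Sketch of the per-system annotation-time comparison quoted above.
# HIT sizes and per-HIT timings are the slide's figures (or inferred from them),
# not measurements made here.

RR_HITS_PER_SYSTEM = 100       # relative-ranking HITs needed per system submission
RR_MINUTES_PER_HIT = 5         # average time to complete one RR HIT

DA_TRANSLATIONS_PER_SYSTEM = 500   # DA assessments we might need per system
DA_TRANSLATIONS_PER_HIT = 100      # assumed size of one DA HIT
DA_MINUTES_PER_HIT = 30            # roughly half an hour per DA HIT

rr_hours = RR_HITS_PER_SYSTEM * RR_MINUTES_PER_HIT / 60
da_hours = (DA_TRANSLATIONS_PER_SYSTEM / DA_TRANSLATIONS_PER_HIT) * DA_MINUTES_PER_HIT / 60

print(f"RR: {rr_hours:.1f} hours per system")  # ~8.3 hours
print(f"DA: {da_hours:.1f} hours per system")  # ~2.5 hours
```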
Direct Assessment (DA) II
- The to-English side can be completely crowdsourced
- This leaves researchers responsible only for tasks where crowdsourced workers cannot be found
Correlation of RR and DA
Language pair   Correlation (RR vs DA)
cs-en           0.997
fi-en           0.996
tr-en           0.988
de-en           0.964
ru-en           0.961
ro-en           0.920
en-ru           0.975
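A minimal sketch of how each of these correlations could be computed: Pearson's r between the RR and DA system-level scores for one language pair. The system names and scores below are placeholders, not WMT'16 data.

```python
# Pearson correlation between RR and DA system-level scores (one language pair).
from scipy.stats import pearsonr

rr_scores = {"sys_a": 0.61, "sys_b": 0.54, "sys_c": 0.37}   # hypothetical RR scores
da_scores = {"sys_a": 71.2, "sys_b": 66.8, "sys_c": 55.1}   # hypothetical DA scores

systems = sorted(rr_scores)
r, p = pearsonr([rr_scores[s] for s in systems],
                [da_scores[s] for s in systems])
print(f"Pearson r = {r:.3f} (p = {p:.3f})")
```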
# Human Assessments vs Significant Differences

[Plots for CS-EN, DE-EN, FI-EN, RO-EN, RU-EN, TR-EN and EN-RU: x-axis = Assessed Translations per System; y-axis = % Significant Differences; one curve for DA and one for RR (with de-dup.)]
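A minimal sketch of how a "% significant differences" figure like those plotted above could be computed: test every pair of systems on their segment-level DA scores (here with a Wilcoxon rank-sum test, a common choice for DA significance testing) and report the fraction of pairs that differ at p < 0.05. The scores are randomly generated placeholders, not WMT'16 assessments.

```python
# Fraction of system pairs whose DA score distributions differ significantly.
from itertools import combinations

import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
# hypothetical standardized DA scores for 500 assessed translations per system
scores = {f"sys_{i}": rng.normal(loc=0.05 * i, scale=1.0, size=500) for i in range(8)}

pairs = list(combinations(scores, 2))
n_significant = sum(1 for a, b in pairs if ranksums(scores[a], scores[b]).pvalue < 0.05)
print(f"{100 * n_significant / len(pairs):.1f}% of system pairs significantly different")
```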
Conclusions
- Trial of DA was successful overall
- No problems crowd-sourcing all to-English language pairs
- Not enough workers for the out-of-English news LPs except English to Russian; those LPs unfortunately must remain the task of participants
- Correlation with RR high across the board
- DA (without deduplication) achieves almost as many significant differences as RR (with deduplication)

More to come:
- WMT'16 included RR with deduplication and DA without it, which makes comparing the numbers of judgments difficult
- Future: compare the unexpanded (undeduped) versions to see what effect deduplication has