Natural Language Processing Syntactic Models Machine Translation III
Dan Klein, UC Berkeley

Syntactic Decoding

Soft Syntactic MT: From Chiang 2010

Flexible Syntax

Hiero Rules
From [Chiang et al, 2005]

Exploiting GPUs

Lots to Parse
≈ 2.6 billion words

Lots to Parse
≈ 6 months (CPU)
≈ 3.6 days (GPU)

CPU Parsing [Petrov & Klein, 2007]
• NLP algorithms achieve speed by exploiting sparsity (>98% sparsity).
[Figure: the grammar (S -> NP VP) applied on the CPU, with pruned entries marked ×; labels: Skip Spans, Skip Rules. Slide credit: Slav Petrov]

The Future of Hardware
[Figure: CPU hardware compared with future hardware; labels: 16384, 32, Threads]

Warps

    add.s32        %r1, %r631, %r0;
    ld.global.f32  %f81, [%r1];
    ld.global.f32  %f82, [%r34];
    mul.ftz.f32    %f94, %f82, %f81;
    mov.f32        %f95, 0f3E002E23;
    mov.f32        %f96, 0f00000000;
    mad.f32        %f93, %f94, %f95, %f96;
    shl.b32        %r2, %r646, 8;
    add.s32        %r3, %r658, %r2;
    shl.b32        %r4, %r3, 2;
    add.s32        %r5, %r631, %r4;
    mul.lo.s32     %r6, %r646, 588;
    shl.b32        %r7, %r6, 1;
    add.s32        %r8, %r5, %r7;
    ld.global.f32  %f83, [%r8];
    mul.ftz.f32    %f98, %f82, %f83;

[Figure: the instruction stream above executed by a warp; panel labels: Warp, Warps, Warp Divergence]

Warps
[Figure: memory access patterns marked ✔ and ✗; labels: Coalescence, Warp Divergence]

Designing GPU Algorithms
    CPU: Irregular, Sparse
    GPU: Regular, Dense
Goals on the GPU: Warp Coalescence; Dense, Uniform Computation
[Figure: sparse (CPU) versus dense (GPU) grammar application, with skipped entries marked ×; label: CKY Algorithm] [Canny, Hall, and Klein, 2013]
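The two costs named on these slides, warp divergence and uncoalesced memory access, can be illustrated with a toy cost model. The sketch below is not from the slides: the function names, the lockstep model, and the 128-byte segment size are assumptions chosen for illustration, and real hardware behavior is more nuanced.

```python
def divergence_passes(branch_of_thread, warp_size=32):
    """Toy model: threads in a warp run in lockstep, so a warp needs one
    serialized pass per distinct branch taken by its threads."""
    passes = 0
    for start in range(0, len(branch_of_thread), warp_size):
        warp = branch_of_thread[start:start + warp_size]
        passes += len(set(warp))
    return passes


def memory_transactions(addresses, warp_size=32, segment_bytes=128):
    """Toy model: a warp's loads are served one memory segment at a time,
    so it needs one transaction per distinct segment it touches.
    segment_bytes=128 is an illustrative assumption."""
    transactions = 0
    for start in range(0, len(addresses), warp_size):
        warp = addresses[start:start + warp_size]
        transactions += len({addr // segment_bytes for addr in warp})
    return transactions


# Uniform control flow versus threads alternating between two branches.
uniform = divergence_passes([0] * 32)
alternating = divergence_passes([i % 2 for i in range(32)])

# Adjacent 4-byte floats versus loads strided 128 bytes apart.
coalesced = memory_transactions([4 * i for i in range(32)])
strided = memory_transactions([128 * i for i in range(32)])
```

Under this model, dense and uniform computation keeps both counts at one per warp, which is the design goal the slides state for GPU algorithms.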

CKY Parsing

    for each sentence:
      for each span (begin, end):
        for each split:
          for each rule (P -> L R):
            score[begin, end, P] += ruleScore[P -> L R]
                                    * score[begin, split, L]
                                    * score[split, end, R]

CKY Parsing (Item Queue and Grammar Application)

    for each sentence:
      for each span (begin, end):      <- Item Queue
        for each split:
          applyGrammar(begin, split, end)      <- Grammar Application

CKY Parsing (CPU and GPU)

    for each parse item in sentence:      <- Item Queue (CPU)
      applyGrammar(item)                  <- Grammar Application (GPU)

GPU Parsing Pipeline
[Figure: the CPU fills a queue of items (i, k, j): (0, 1, 3), (0, 2, 3), (1, 2, 4), (1, 3, 4), …; the GPU applies the grammar (S -> NP VP) to each item]

Parsing Speed [Canny, Hall, and Klein, 2013]
    CPU: 10 s/sec
    GPU: 190 s/sec
[Figure: bar chart; axis: Sentences per second, 0 to 500]
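The CKY pseudocode above, split into an item queue and a grammar application step, can be sketched in Python as follows. This is a minimal CPU-only sketch under stated assumptions: the grammar is binarized, scores combine by sum-product as in the `+=` update, and the names `parse_items`, `apply_grammar`, `cky`, `TOY_LEXICON`, and `TOY_RULES` are hypothetical, not taken from the slides.

```python
def parse_items(n):
    """Yield items (begin, split, end), shorter spans first, so that both
    children of an item are fully scored before the item is applied."""
    for length in range(2, n + 1):
        for begin in range(0, n - length + 1):
            end = begin + length
            for split in range(begin + 1, end):
                yield (begin, split, end)


def apply_grammar(score, rules, item):
    """Dense grammar application: try every rule P -> L R on one item."""
    begin, split, end = item
    for parent, left, right, rule_score in rules:
        contribution = (rule_score
                        * score.get((begin, split, left), 0.0)
                        * score.get((split, end, right), 0.0))
        key = (begin, end, parent)
        score[key] = score.get(key, 0.0) + contribution


def cky(words, lexicon, rules):
    """Return a dict mapping (begin, end, symbol) to its score."""
    score = {}
    for i, word in enumerate(words):
        for symbol, s in lexicon.get(word, {}).items():
            score[(i, i + 1, symbol)] = s
    for item in parse_items(len(words)):   # item queue
        apply_grammar(score, rules, item)  # grammar application
    return score


# Toy grammar, invented purely for illustration.
TOY_LEXICON = {"she": {"NP": 0.5}, "eats": {"V": 1.0}, "fish": {"NP": 0.25}}
TOY_RULES = [("S", "NP", "VP", 1.0), ("VP", "V", "NP", 1.0)]

chart = cky(["she", "eats", "fish"], TOY_LEXICON, TOY_RULES)
```

Because `apply_grammar` never branches on which symbols are present, every item does the same work. That regularity is what makes the step a candidate for the GPU, at the price of multiplying by many zeros.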

Exploiting Sparsity
[Figure: CPU Queuing followed by GPU Application of the grammar (S -> NP VP), with skipped entries marked ×]

Exploiting Sparsity
[Figure: each queued item (i, k, j) is listed with the symbols S NP VP PP …, and a warp is highlighted:
    (0, 1, 3)  S NP VP PP …
    (0, 2, 3)  S NP VP PP …
    (1, 2, 4)  S NP VP PP …
    (1, 3, 4)  S NP VP PP …
    (2, 3, 5)  S NP VP PP …
    (2, 4, 5)  S NP VP PP …
    (3, 4, 6)  S NP VP PP …
    …]

Exploiting Sparsity
[Figure: GPU Application of the grammar (S -> NP VP); label: Warp Divergence]

Exploiting Sparsity
[Figure: instead of a single CPU queue of items (i, k, j), the CPU keeps a separate queue of items such as (0, 1, 3), (0, 2, 3), … for each symbol (NP, VP, PP, S); the GPU applies the matching part of the grammar to each queue]

Parsing Speed
    CPU: 10 s/sec
    GPU Viterbi: 405 s/sec
    GPU Min Risk: 190 s/sec
[Figure: bar chart; axis: sentences per second, 0 to 500]
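One reading of the per-symbol queues on this slide is that the CPU enqueues an item for a symbol only when pruning allows that symbol over the item's span, so each queue can then be processed densely. The sketch below follows that reading. The function name `build_symbol_queues` and the `allowed` pruning mask are hypothetical, and this is not claimed to be the authors' implementation.

```python
from collections import defaultdict


def build_symbol_queues(n, allowed):
    """Group parse items by the parent symbols that survive pruning.

    n:       sentence length in words.
    allowed: dict mapping a span (begin, end) to the set of parent
             symbols permitted over that span (an assumed pruning mask).
    Returns a dict mapping each symbol to its list of items
    (begin, split, end), with shorter spans first.
    """
    queues = defaultdict(list)
    for length in range(2, n + 1):
        for begin in range(0, n - length + 1):
            end = begin + length
            symbols = sorted(allowed.get((begin, end), ()))
            for split in range(begin + 1, end):
                for symbol in symbols:
                    queues[symbol].append((begin, split, end))
    return dict(queues)


# Pruning mask for a 3-word sentence, invented for illustration.
queues = build_symbol_queues(3, {(1, 3): {"VP"}, (0, 3): {"S"}})
```

Here the span (0, 2) has no allowed symbols, so it contributes no items at all, and each remaining queue only ever needs the rules for its own parent symbol.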
