

Dagstuhl C2NLU: Working Groups Mo/Tu
Hinrich Schütze
January 23, 2017

1 Working Group MORPH: Morphology

This WG is concerned with morphology, one of the core areas of computational linguistics and theoretical linguistics, especially once we've overcome English-centric myopia. A lot (everything?) changes when morphology is modeled on the character level, in End2End systems and in the framework of deep learning.

• character-level models for morphological analysis
• character-level models for morphological generation (a minimal sketch follows this list)
• in character-level models: What happens to prefixes, suffixes, stems, roots?
• subword units: morphologically motivated vs non-morphological – properties, strengths, weaknesses etc.
• morphological induction / paradigm completion / discovery of morphological rules: supervised, semi-supervised, unsupervised
• Do certain types of morphology lend themselves better to character-level models?
• inflectional vs. derivational morphology
• non-concatenative morphologies
• segmentation
• language modeling – how to incorporate morphology: input, output, at which level?
• insights into human morphology from analyzing neural models?
• use character-level representations as a research methodology for morphology: e.g., compositional ("dearly") vs noncompositional ("early") forms
• efficiency (character-based worse than word-based?)
• inspection, interpretation, analysis, beyond black-box models
• evaluation
• applications
• come up with 1-5 new research directions
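As a concrete starting point for the generation and paradigm-completion topics above, here is a minimal character-level encoder-decoder sketch for inflection (lemma plus morphological tag in, inflected form out). It is an illustrative skeleton only: the class name, hyperparameters, the shared character embedding, and the idea of feeding the tag as extra source symbols are assumptions for illustration, not the setup of any particular system; training and decoding are omitted.

```python
import torch
import torch.nn as nn

class CharInflector(nn.Module):
    """Minimal character-level encoder-decoder for inflection (lemma + tag -> form)."""

    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)      # shared char/tag embeddings
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # src_ids: characters of the lemma plus tag symbols, e.g. "w a l k <V;PST>"
        # tgt_ids: characters of the target form under teacher forcing, e.g. "w a l k e d"
        _, state = self.encoder(self.embed(src_ids))         # final (h, c) summarizes the source
        dec_out, _ = self.decoder(self.embed(tgt_ids), state)
        return self.out(dec_out)                             # logits over the character vocabulary

model = CharInflector(vocab_size=60)
src = torch.randint(0, 60, (1, 10))   # dummy source sequence
tgt = torch.randint(0, 60, (1, 8))    # dummy target prefix
print(model(src, tgt).shape)          # torch.Size([1, 8, 60])
```

Attention over the encoder states is the obvious next addition; the point here is only that the whole analysis/generation pipeline can be stated over characters, with no word-level feature engineering.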

2 Working Group MT: Machine Translation

Machine translation is perhaps the biggest success of deep learning in NLP. This WG will be concerned with research questions and challenges for character-level MT.

• character NMT
• linear NMT
• dealing with OOVs and cross-token dependencies (e.g., hierarchy)
• localization
• beyond LSTM dependencies
• transliteration
• character-level alignment
• multilingual NMT
• multi-task NMT for multiple modalities
• document-level NMT
• What are the units: characters, BPEs, subwords, words, phrases? (see the BPE sketch after this list)
• Are there still units?
• What happens with syntax?
• efficiency (character-based worse than word-based?)
• inspection, interpretation, analysis, beyond black-box models
• evaluation
• applications (e.g., specific (low/high) resource settings / text types / language pairs?)
• come up with 1-5 new research directions
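For the question of units (characters vs. BPEs vs. words), the byte-pair-encoding procedure itself fits in a few lines. The sketch below is the standard greedy merge loop on a toy vocabulary; the corpus, the end-of-word marker, and the number of merges are illustrative choices, not a recommendation.

```python
import collections
import re

def get_stats(vocab):
    """Count frequencies of adjacent symbol pairs in the current vocabulary."""
    pairs = collections.Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[(symbols[i], symbols[i + 1])] += freq
    return pairs

def merge_vocab(pair, vocab):
    """Merge the most frequent pair into a single symbol wherever it occurs."""
    bigram = re.escape(' '.join(pair))
    pattern = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
    return {pattern.sub(''.join(pair), word): freq for word, freq in vocab.items()}

# toy corpus: words as space-separated character sequences with an end-of-word marker
vocab = {'l o w </w>': 5, 'l o w e r </w>': 2, 'n e w e s t </w>': 6, 'w i d e s t </w>': 3}
for _ in range(10):                      # number of merges is the only hyperparameter
    pairs = get_stats(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_vocab(best, vocab)
    print(best)                          # each merge adds one subword symbol
```

Each merge adds one symbol to the subword vocabulary, so the merge count directly controls where the units fall on the spectrum between pure characters and whole words.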

3 Working Group RepLearn: Character-Level Representation Learning

"Unsupervised representation learning techniques capitalize on unlabeled data ... The goal ... is to learn a representation that reveals intrinsic low-dimensional structure in data, disentangles underlying factors of variation by incorporating universal AI priors such as smoothness and sparsity, and is useful across multiple tasks and domains." (Raman Arora)

Embeddings and representation learning in general have been critical to the success of deep learning. Can we learn embeddings / representations without feature engineering (e.g., tokenization) and, if so, how?

• OOV representations
• beyond word embeddings
• RNN/GRU/LSTM-based embeddings
• CNN-based embeddings
• multilingual embeddings, universal embeddings
• noise
• noncanonical language
• characters vs bytes vs radicals vs bits
• learning algorithms, segmentation
• linking it back up to traditional linguistic units (e.g., words)
• how is ambiguity represented?
• numbers, named entities, multiwords and other nontypical units
• form-function regularities: which form regularities (e.g., "add s at the end") correspond to function regularities
• cross-token modeling
• char2vec (FastText?) (see the sketch after this list)
• non-morphological character-level productivity
• typoglycemia
• efficiency (character-based worse than word-based?)
• inspection, interpretation, analysis, beyond black-box models
• evaluation
• applications
• come up with 1-5 new research directions
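As a small illustration of the char2vec / FastText line of work mentioned above, the sketch below enumerates the character n-grams of a word; in such models the word vector is the sum of the n-gram vectors, which is what gives OOV words a representation at all. The boundary markers and the 3-6 n-gram range follow the FastText defaults; the function itself is a hypothetical helper for illustration.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams with boundary markers, FastText-style."""
    padded = '<' + word + '>'
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    grams.add(padded)   # the full word itself is also a feature
    return grams

# an unseen word still decomposes into mostly familiar n-grams,
# so it can be embedded as the sum of their vectors
print(sorted(char_ngrams('where')))
```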

4 Working Group End2End: End2End Architectures

This WG will be concerned with the challenges that character-level models pose for machine learning. How can dependencies over long distances be learned? How can such models be made efficient in training and application? In an approach without feature-engineered preprocessing, how can domain knowledge and priors be incorporated into machine learning architectures?

• CNNs vs RNNs: tradeoff speed/accuracy, parallel/sequential (see the CNN sketch after this list)
• hierarchical, multi-speed, multi-scale architectures – fixed small depth (2?) vs unbounded hierarchy (paragraph, document, book)
• context: attention, memory, convolution etc.
• which point in input to focus on
• interface between character-level and higher-level (traditional?) processing layers (syntax, semantics)
• multimodal / crossmodal End2End architectures
• End2End learning of long-distance relationships: corresponding phrases in sentence pairs (or document pairs)
• generation of OOVs
• End2End segmentation learning (i.e., learn the right way to segment for an application)
• how to put in domain / linguistic knowledge?
• Bayesian models
• in our big machine learning toolbox T, what are interesting t ∈ T to explore in combinations of the form "character-level + t"
• (add hot deep learning architecture of the day here)
• efficiency (character-based worse than word-based?)
• inspection, interpretation, analysis, beyond black-box models
• evaluation
• applications
• come up with 1-5 new research directions
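To make the CNN-vs-RNN trade-off concrete, here is a minimal character-level CNN encoder with max-over-time pooling, the parallel-friendly end of that trade-off. The class name, filter widths, filter counts, and embedding size are illustrative assumptions, not recommendations from the working group.

```python
import torch
import torch.nn as nn

class CharCNNEncoder(nn.Module):
    """Character-level CNN: convolutions over character embeddings + max-over-time pooling."""

    def __init__(self, vocab_size, emb_dim=16, widths=(2, 3, 4), filters=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, filters, kernel_size=w) for w in widths)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) character indices
        x = self.embed(char_ids).transpose(1, 2)            # (batch, emb_dim, seq_len)
        pooled = [conv(x).max(dim=2).values for conv in self.convs]
        return torch.cat(pooled, dim=1)                     # fixed-size representation

enc = CharCNNEncoder(vocab_size=100)
print(enc(torch.randint(0, 100, (4, 20))).shape)            # torch.Size([4, 96])
```

Because every position is processed in parallel, such an encoder is typically faster than a character RNN, at the cost of a receptive field bounded by the filter widths and depth.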
