FOUND IN TRANSLATION: Reconstructing Phylogenetic Language Trees from Translations



slide-1
SLIDE 1

ACL 2017, Vancouver

FOUND IN TRANSLATION:

Reconstructing Phylogenetic Language Trees from Translations

Ella Rabinovich (1,2), Noam Ordan (3), Shuly Wintner (2)

1 IBM Research – Haifa, Israel

2 Department of Computer Science, University of Haifa, Israel

3 The Arab College for Education, Haifa, Israel

slide-2
SLIDE 2

FOUND IN TRANSLATION: RECONSTRUCTING PHYLOGENETIC LANGUAGE TREES FROM TRANSLATIONS AUG 2017 2

STARTING FROM THE END (spoiler)

[Figure: two trees shown together]

the Indo-European phylogenetic tree (the “ground truth”)
Leaf order: English, Swedish, Danish, German, Dutch, Romanian, French, Italian, Spanish, Portuguese, Latvian, Lithuanian, Polish, Slovak, Czech, Slovenian, Bulgarian

phylogenetic tree reconstructed from monolingual English texts translated from 17 IE languages
Leaf order: Italian, French, Spanish, German, Dutch, English, Swedish, Danish, Romanian, Lithuanian, Portuguese, Czech, Slovak, Bulgarian, Latvian, Polish, Slovenian

slide-3
SLIDE 3

BACKGROUND – THE FEATURES OF TRANSLATIONESE

  • Translators (almost) always tried to remain invisible
  • Translations have unique characteristics that set them apart from originals
  • Universals (simplification, standardization, explicitation)
  • Interference (the “fingerprints” of a source language on the translation product)

HYPOTHESIS

  • Languages closer to each other are likely to share more features in the target language of translation
  • The distance between languages is retained and can be recovered when assessed through these features in translated texts

slide-4
SLIDE 4

DATASET

  • Europarl (the proceedings of the European Parliament)
  • Members are allowed to speak in any of the EU languages
  • All parliament speeches were translated from the original language into other EU languages using English as a pivot
  • Direct translations into English, indirect translations into all other languages
  • We explore indirect translations into French in this work
  • We focus on 17 source languages, grouped into 3 language families
  • Germanic, Romance, and Balto-Slavic

slide-5
SLIDE 5

RECONSTRUCTION OF LANGUAGE TREES

FEATURES USED

  • POS-trigrams, reflecting shallow syntactic structures (strongly associated with interference)
  • Function words, reflecting grammar (associated with interference)
  • Cohesive markers (associated with translation universals)
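As a rough illustration of the first feature set, the sketch below turns POS-tagged sentences into a relative-frequency vector over POS-trigrams. The tag names, the toy sentences, and the helper names `pos_trigram_counts` and `feature_vector` are assumptions made for the example; the slides do not specify the tagger, the tagset, or the normalization.

```python
from collections import Counter


def pos_trigram_counts(tagged_sentences):
    """Count POS-trigrams over a list of sentences, each a list of POS tags."""
    counts = Counter()
    for tags in tagged_sentences:
        # Slide a window of three consecutive tags over the sentence.
        for i in range(len(tags) - 2):
            counts[tuple(tags[i:i + 3])] += 1
    return counts


def feature_vector(counts, vocabulary):
    """Relative frequency of each trigram in `vocabulary` (a fixed, ordered list)."""
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[trigram] / total for trigram in vocabulary]


# Toy input: two already-tagged sentences (hypothetical tags).
sentences = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "VERB", "ADV"],
]
counts = pos_trigram_counts(sentences)
vocabulary = sorted(counts)
vector = feature_vector(counts, vocabulary)
```

In a setup like the one described, one such vector would be computed per source language, over a vocabulary shared by all languages, so that the vectors are comparable.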

AGGLOMERATIVE (HIERARCHICAL) CLUSTERING OF FEATURE VECTORS

  • Using the variance minimization algorithm (Ward, 1963), with Euclidean distance
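To make the variance minimization criterion concrete, here is a minimal plain-Python sketch of Ward-style agglomerative clustering: at each step it merges the two clusters whose union increases the within-cluster sum of squared Euclidean distances the least. The function name `ward_linkage`, the toy vectors, and the `lang_*` labels are hypothetical; a real pipeline would more likely call a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"`.

```python
def ward_linkage(vectors):
    """Agglomerative clustering with Ward's minimum-variance criterion.

    Returns the merges in order as (cluster_id_a, cluster_id_b, cost) tuples.
    Leaves get ids 0..n-1; each merge creates the next free id.
    """
    # Each active cluster is stored as (centroid, size).
    clusters = {i: (list(v), 1) for i, v in enumerate(vectors)}
    next_id = len(vectors)
    merges = []

    def merge_cost(a, b):
        (ca, na), (cb, nb) = clusters[a], clusters[b]
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        # Increase in within-cluster sum of squares if a and b are merged.
        return na * nb / (na + nb) * sq_dist

    while len(clusters) > 1:
        ids = sorted(clusters)
        # Pick the pair of active clusters with the smallest merge cost.
        a, b = min(
            ((p, q) for i, p in enumerate(ids) for q in ids[i + 1:]),
            key=lambda pair: merge_cost(*pair),
        )
        cost = merge_cost(a, b)
        (ca, na), (cb, nb) = clusters.pop(a), clusters.pop(b)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[next_id] = (centroid, na + nb)
        merges.append((a, b, cost))
        next_id += 1
    return merges


# Toy feature vectors for four hypothetical source languages.
toy_vectors = {
    "lang_a": [0.0, 0.0],
    "lang_b": [0.0, 1.0],
    "lang_c": [5.0, 0.0],
    "lang_d": [5.0, 1.0],
}
names = list(toy_vectors)
merges = ward_linkage([toy_vectors[n] for n in names])
```

The sequence of merges defines the tree: the two nearby pairs are joined first, and the two resulting clusters are joined last at a much higher cost, which is what gives the tree its branch lengths.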

slide-6
SLIDE 6

IDENTIFICATION OF TRANSLATIONESE AND ITS SOURCE LANGUAGE

ORIGINAL VS. TRANSLATED binary classification (accuracy, %)

Feature             English translations   French translations
POS-trigrams        97.60                  98.40
function words      96.45                  95.15
cohesive markers    86.50                  85.25

CONFUSION MATRIX: source-language classification (POS-trigrams)

[Figure: two confusion matrices, ENGLISH translations (76.5%) and FRENCH translations (48.9%)]

slide-7
SLIDE 7

RECONSTRUCTION OF LANGUAGE TREES

Phylogenetic language trees generated with translated text (POS-trigrams)

[Figure: two reconstructed trees]

ENGLISH translations
Leaf order: Italian, French, Spanish, German, Dutch, English, Swedish, Danish, Romanian, Lithuanian, Portuguese, Czech, Slovak, Bulgarian, Latvian, Polish, Slovenian

FRENCH translations
Leaf order: Italian, Spanish, French, German, Swedish, Dutch, Danish, English, Slovak, Lithuanian, Latvian, Bulgarian, Romanian, Slovenian, Portuguese, Polish, Czech

slide-8
SLIDE 8

EVALUATION METHODOLOGY

MEASURE SIMILARITY TO THE GOLD STANDARD

UNWEIGHTED EVALUATION (CLADOGRAM)

assessing only structural (topological) similarity

WEIGHTED EVALUATION (PHYLOGRAM)

assessing similarity based on both structure and branching length

[Figure: a CLADOGRAM and a PHYLOGRAM, each with leaves A, B, C, D]

slide-9
SLIDE 9

EVALUATION METHODOLOGY – CONT.

  • Adaptation of the L2-norm to leaf-pair distance
  • Suitable for both weighted and unweighted evaluation

g – the gold tree
t – a tree subject to evaluation
D_t(l_i, l_j) – distance between two leaves in a tree

\[
Dist(t, g) = \sum_{i, j \in [1..N];\ i \neq j} \left( D_t(l_i, l_j) - D_g(l_i, l_j) \right)^2
\]

where N is the number of leaves.

[Figure: example tree with leaves A, B, C, D; labels 1, 2, 3, 4]
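The leaf-pair distance can be sketched in a few lines of Python. The sketch assumes a tree is stored as a mapping from each node to its parent and branch length, takes the distance between two leaves to be the length of the path joining them, and treats the unweighted case as the same tree with every branch set to 1. The representation, the helper names, and the toy trees are assumptions for illustration; any normalization applied to the reported scores is not shown on the slide and is left out here.

```python
def path_to_root(tree, node):
    """Return {ancestor: distance from node} for node and all its ancestors."""
    dist, path = 0.0, {node: 0.0}
    while node in tree:
        parent, length = tree[node]
        dist += length
        path[parent] = dist
        node = parent
    return path


def leaf_distance(tree, a, b):
    """Length of the path between leaves a and b via their lowest common ancestor."""
    up_a, up_b = path_to_root(tree, a), path_to_root(tree, b)
    # Among shared ancestors, the lowest one minimizes the summed distance.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def tree_distance(t, g, leaves):
    """Sum of squared leaf-pair distance differences over ordered pairs i != j."""
    return sum(
        (leaf_distance(t, a, b) - leaf_distance(g, a, b)) ** 2
        for a in leaves for b in leaves if a != b
    )


def unweighted(tree):
    """Cladogram view of a tree: every branch counts as length 1."""
    return {child: (parent, 1.0) for child, (parent, _) in tree.items()}


# Toy trees as {node: (parent, branch_length)}; the root has no entry.
gold = {
    "A": ("x", 1.0), "B": ("x", 1.0),
    "C": ("y", 2.0), "D": ("y", 2.0),
    "x": ("root", 1.0), "y": ("root", 1.0),
}
candidate = {
    "A": ("x", 1.0), "C": ("x", 1.0),
    "B": ("y", 2.0), "D": ("y", 2.0),
    "x": ("root", 1.0), "y": ("root", 1.0),
}
leaves = ["A", "B", "C", "D"]

weighted_score = tree_distance(candidate, gold, leaves)
unweighted_score = tree_distance(unweighted(candidate), unweighted(gold), leaves)
```

Because the sum runs over ordered pairs, each pair of leaves contributes twice, as in the formula. A tree identical to the gold tree scores 0, and larger values mean the tree is further from the gold standard.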

slide-10
SLIDE 10

EVALUATION RESULTS

DISTANCE OF A RECONSTRUCTED TREE FROM THE GOLD STANDARD (using various feature sets)

UNWEIGHTED EVALUATION

target language       English          French
feature               AVG     STD      AVG     STD
POS-trigrams + FW     .362    .07      .367    .06
POS-trigrams          .353    .06      .399    .08
Function words        .429    .07      .450    .08
Cohesive markers      .626    .16      .678    .14
Random tree           .724    .07      .724    .07

WEIGHTED EVALUATION

target language       English          French
feature               AVG     STD      AVG     STD
POS-trigrams + FW     .278    .03      .348    .02
POS-trigrams          .301    .03      .351    .03
Function words        .304    .03      .376    .05
Cohesive markers      .598    .12      .636    .07
Random tree           .676    .10      .676    .10

  • trees built from English translations are systematically closer to the gold standard than trees built from translations into French (done via a third language)
  • the quality of trees increases for feature sets associated with interference
  • the worst tree is generated using cohesive markers

slide-14
SLIDE 14

ANALYSIS

Articles

  • Indefinite (“a”, “an”) and definite (“the”)

Possessive constructions

  • With clitic ’s (“the guest’s room”)
  • With a prepositional phrase containing “of” (“the room of the guest”)
  • With noun compounds (“guest room”)

Verb-particle constructions

  • Verbs that combine with a particle to create a new meaning (MWEs), e.g., “turn down”, “get over”

Tense and aspect

  • With the auxiliary verbs “have” (perfect) or “be” (progressive), e.g., “have done”, “was going”

slide-15
SLIDE 15

ANALYSIS – CONT.

FREQUENCIES reflecting various linguistic phenomena in English translations

[Figure: chart of Frequency (axis ticks 0.1 to 0.8) by source-language family (Germanic, Romance, Balto-Slavic) for five phenomena: definite articles (per 10 tokens), 'of' constructions (per 25 tokens), verb-particle (per 250 tokens), perfect (per 100 tokens), progressive (per 500 tokens)]
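The "per N tokens" scaling used for these frequencies is simple arithmetic, sketched below for definite articles. The helper name `rate_per` and the example sentence are hypothetical, and matching the surface form "the" is only a stand-in: phenomena such as verb-particle constructions or the perfect would need tagging or parsing to identify.

```python
def rate_per(tokens, is_match, per):
    """Occurrences of matching tokens, normalized to a window of `per` tokens."""
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if is_match(tok))
    return hits * per / len(tokens)


# Toy example: 3 definite articles in 8 tokens, scaled to "per 10 tokens".
tokens = "the committee adopted the report of the council".split()
definite_rate = rate_per(tokens, lambda t: t.lower() == "the", per=10)
```

Scaling each phenomenon by a different window (10, 25, 100, 250 or 500 tokens) puts rare and frequent phenomena on a comparable range, so they can share one axis.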

slide-16
SLIDE 16

SUMMARY

  • Translation does not distort the original text randomly
  • A phylogenetic language tree can be reconstructed from monolingual texts translated from various languages
  • Translations impact the evolution of languages
  • It is estimated that for certain languages up to 30% of published texts are mediated through translations (Pym and Chrupała, 2005)
  • Are translations likely to play a role in language change?
  • Features associated with interference (POS-ngrams, FWs) yield more accurate phylogenetic language trees

slide-17
SLIDE 17

STARTING FROM THE END (spoiler)

[Figure: two reconstructed trees]

phylogenetic tree reconstructed from monolingual English texts translated from 17 IE languages
Leaf order: Italian, French, Spanish, German, Dutch, English, Swedish, Danish, Romanian, Lithuanian, Portuguese, Czech, Slovak, Bulgarian, Latvian, Polish, Slovenian

phylogenetic tree reconstructed from monolingual French texts translated indirectly from 17 IE languages via English pivot
Leaf order: Italian, Spanish, French, German, Swedish, Dutch, Danish, English, Slovak, Lithuanian, Latvian, Bulgarian, Romanian, Slovenian, Portuguese, Polish, Czech