News from dialektkarten.ch


slide-1
SLIDE 1

News from dialektkarten.ch

Yves Scherrer Department of Digital Humanities, University of Helsinki

2. VerbaAlpina-Arbeitstagung, Munich, 18 June 2019

1

slide-2
SLIDE 2

2008 – 2012: Where it all started

slide-3
SLIDE 3

2008 – 2012: Standard → dialect machine translation

Generative dialectology (Veith 1970, 1982)

  • Transformation rules derive a multitude of dialect systems D_i from a single reference system B:
  • #Töpfer#B → #Häfner#D33333−46999

My proposal:

  • D: Swiss German dialects
  • B: Modern High German (“Standard German”)
  • Most practical, but not historically correct
  • Dialects are not represented as discrete numbered entities, but as probability maps
  • Example: immer (StdG) → geng
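The idea of attaching a probability map to each transformation rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual system: the `RULES` table, the use of place names as map cells, and all probability values are invented for the example. Only the pair immer → geng comes from the slide.

```python
# Hypothetical rule table: each Standard German word maps to dialect variants,
# and each variant carries a probability per location (a tiny "probability map").
# All numbers are made up for illustration; they are not SDS data.
RULES = {
    "immer": {
        "geng": {"Bern": 0.9, "Zürich": 0.1},
        "immer": {"Bern": 0.1, "Zürich": 0.9},
    },
}


def translate_word(word, place):
    """Return the most probable dialect variant of `word` at `place`."""
    variants = RULES.get(word)
    if variants is None:
        return word  # no rule applies: keep the Standard German form
    return max(variants, key=lambda v: variants[v].get(place, 0.0))


print(translate_word("immer", "Bern"))    # geng
print(translate_word("immer", "Zürich"))  # immer
```

Because every variant has its own map, the same rule set can serve any target location without enumerating discrete dialect systems.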

slide-5
SLIDE 5

2008 – 2012: Standard → dialect machine translation

Probability maps are extracted from the Sprachatlas der deutschen Schweiz (SDS):

  1. Scan original maps
  2. Digitize
  3. Interpolate (kernel density estimation)
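The interpolation step can be sketched as below. The slides only name kernel density estimation; the Gaussian kernel, the bandwidth value, the planar coordinates, and the function names are assumptions made for this example. For each variant, a density is computed from the inquiry points where it is attested, and the densities are normalised across variants to give a probability at any location.

```python
import math


def kernel_density(x, y, points, bandwidth):
    """Sum of Gaussian kernels centred on `points`, evaluated at (x, y).

    The normalising constant of the Gaussian is omitted because it cancels
    when densities of different variants are compared with one bandwidth.
    """
    total = 0.0
    for px, py in points:
        squared_distance = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    return total


def variant_probabilities(x, y, attestations, bandwidth=10.0):
    """Turn per-variant densities at (x, y) into probabilities summing to 1."""
    densities = {
        variant: kernel_density(x, y, points, bandwidth)
        for variant, points in attestations.items()
    }
    norm = sum(densities.values())
    if norm == 0.0:
        return {variant: 0.0 for variant in densities}
    return {variant: d / norm for variant, d in densities.items()}


# Hypothetical inquiry points (arbitrary planar coordinates, e.g. km).
attestations = {
    "geng": [(0.0, 0.0), (1.0, 0.0)],
    "immer": [(50.0, 0.0)],
}
print(variant_probabilities(0.0, 0.0, attestations))
```

Evaluating `variant_probabilities` on a regular grid yields one probability map per variant.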

slide-9
SLIDE 9

2010: Interactive online maps

Workflow: Digitization, Export, Interpolation, Export. All operations are carried out with the ArcMap geographic information system.

  • Y. Scherrer (2010): Des cartes dialectologiques numérisées pour le TALN. Proceedings of TALN, Montréal.

slide-10
SLIDE 10

2010: Interactive online demonstrators

Machine translation

  • Input: Standard German text and coordinates of the target dialect
  • Lemmatization / tagging / syntactic analysis, yielding the analysed Standard German text
  • Phonetic / lexical / syntactic transfer rules, yielding the dialect text

Dialect identification

  • Standard German lexicon with morphological annotations
  • Phonetic / lexical transfer rules, yielding a dialect lexicon with morphological annotations; each entry is associated with a map
  • Input: dialect text; lexicon lookup and combination of the maps, yielding a probabilistic map

  • Y. Scherrer (2010): Des cartes dialectologiques numérisées pour le TALN. Proceedings of TALN, Montréal.
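The identification step (look up each word, then combine the associated maps) can be sketched as follows. The slide does not say how the maps are combined; averaging is used here as one simple possibility. The lexicon contents, place names, and probabilities are invented for the example.

```python
def identify(tokens, lexicon, places):
    """Average the probability maps of all tokens found in the lexicon.

    `lexicon` maps a dialect word to a dict {place: probability}.
    Unknown tokens are skipped. Returns a dict {place: combined score}.
    """
    scores = {place: 0.0 for place in places}
    matched = 0
    for token in tokens:
        word_map = lexicon.get(token)
        if word_map is None:
            continue
        matched += 1
        for place in places:
            scores[place] += word_map.get(place, 0.0)
    if matched == 0:
        return scores  # nothing recognised: all scores stay at 0.0
    return {place: score / matched for place, score in scores.items()}


# Hypothetical dialect lexicon with one (tiny) map per entry.
lexicon = {
    "geng": {"Bern": 0.9, "Zürich": 0.1},
    "nid": {"Bern": 0.6, "Zürich": 0.4},
}
result = identify(["geng", "nid", "xyz"], lexicon, ["Bern", "Zürich"])
print(max(result, key=result.get))  # Bern
```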

slide-11
SLIDE 11

2010: Interactive online maps

Map rendering engine: Google Maps

6

slide-12
SLIDE 12

2012 – 2016: What happened later

slide-13
SLIDE 13

2012 – 2016: Dialectometry

2011–2012 First experiments with dialectometry

  • Y. Scherrer (2014): Computerlinguistische Experimente für die schweizerdeutsche Dialektlandschaft: Maschinelle Übersetzung und Dialektometrie. In D. Huck (Ed.) Alemannische Dialektologie: Dialekte im Kontakt (Beiträge zur 17. Arbeitstagung für alemannische Dialektologie in Strassburg), ZDL Beihefte 155, 261-278. Stuttgart: Franz Steiner Verlag.

2012–2013 Collaboration with H. Goebl (Salzburg) to provide dialectometric computations and visualisations

  • H. Goebl, Y. Scherrer & P. Smečka (2013): Kurzbericht über die Dialektometrisierung des Gesamtnetzes des „Sprachatlasses der deutschen Schweiz“ (SDS). In K. Schneider-Wiejowski, B. Kellermeier-Rehbein & J. Haselhuber (Eds.) Vielfalt, Variation und Stellung der deutschen Sprache. Berlin, Boston: De Gruyter Mouton, 153-176. V1
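As an illustration of what a dialectometric computation involves, the sketch below computes a relative identity value between two inquiry points: the percentage of maps on which both points show the same variant, counting only maps where both have data. The slides do not state which measure was used in these experiments, so this is an assumption chosen for the example, and the sample data are invented.

```python
def relative_identity_value(variants_a, variants_b):
    """Percentage of maps on which two locations share the same variant.

    Each argument is a list with one entry per atlas map; None marks
    missing data, and such maps are left out of the comparison.
    Returns None if the two locations have no map in common.
    """
    shared = [
        (a, b)
        for a, b in zip(variants_a, variants_b)
        if a is not None and b is not None
    ]
    if not shared:
        return None
    identical = sum(1 for a, b in shared if a == b)
    return 100.0 * identical / len(shared)


# Hypothetical variants at two locations on three maps.
place_1 = ["geng", "Anke", None]
place_2 = ["geng", "Butter", "Hafner"]
print(relative_identity_value(place_1, place_2))  # 50.0
```

Computing this value for every pair of locations gives a similarity matrix, which is the usual starting point for dialectometric visualisations.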

slide-15
SLIDE 15

2012 – 2016: Dialectometry

2013–2014 MA thesis at University of Zurich: main focus on dialectometry, includes additional digitized SDS maps and syntax data from the SADS project

  • S. Kellerhals (2014): Dialektometrische Analyse und Visualisierung von schweizerdeutschen Dialekten auf verschiedenen linguistischen Ebenen. Masterarbeit, Geographisches Institut der Universität Zürich. V2

2013 Interactive online interface for dialectometrical visualisations

2014 Domain www.dialektkarten.ch goes live

slide-17
SLIDE 17

2012 – 2016: Dialectometry

9

slide-18
SLIDE 18

2012 – 2016: Dialectometry

2015–2016 Continuation of dialectometry experiments with P. Stoeckle: additional SDS maps, new selection of SADS material

  • Y. Scherrer & P. Stoeckle (2016): A quantitative approach to Swiss German – Dialectometric analyses and comparisons of linguistic levels. Dialectologia et Geolinguistica, 24(1), 92-125. V3

[Bar chart: V1 (2013), V2 (2014) and V3 (2016) compared for Phonology (SDS I–II), Morphology (SDS III) and Lexicon (SDS IV–VIII); bar labels: 27, 2, 28, 9, 11, 64, 105, 36]

The V3 maps did not make it to the online version…

2017 First contacts with H. Goebl to replace the Java-based visualisation solution with the Dialektkarten backend

slide-22
SLIDE 22

2018 – 2019: The real news

slide-23
SLIDE 23

2018 – 2019: Layout and backend changes

July 2018 www.dialektkarten.ch breaks due to changes in the Google Maps terms of service

Jan. 2019 Comprehensive update with mostly under-the-hood changes:

  • Three applications with same layout: map viewer, translation demonstrator, dialectometry viewer
  • Change from Google Maps API to Leaflet API with Stamen maps (based on OSM)
  • Change from KML (containing style) to GeoJSON (no style)
  • New architecture to accommodate different projects using the same scripts
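The practical consequence of the KML to GeoJSON change is that the data file carries only geometry and properties, and the viewer decides how to draw them. The sketch below shows that separation in Python; the property name `value`, the sample feature, and the grey-scale mapping are assumptions for the example, not the site's actual scheme.

```python
import json

# A style-free GeoJSON document: geometry and data only.
# GeoJSON coordinates are given as [longitude, latitude].
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [7.45, 46.95]},
            "properties": {"place": "Bern", "value": 1.0},  # hypothetical data
        }
    ],
}


def style_for(feature):
    """Derive display style from data: higher values are drawn darker."""
    value = feature["properties"]["value"]  # expected in the range 0..1
    grey = round(255 * (1.0 - value))
    colour = "#{0:02x}{0:02x}{0:02x}".format(grey)
    return {"fillColor": colour, "fillOpacity": 1.0}


print(json.dumps(feature_collection["features"][0]["properties"]))
print(style_for(feature_collection["features"][0]))
```

Keeping style out of the data means the same file can be rendered with different colour schemes without being regenerated.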

slide-27
SLIDE 27

2018 – 2019: Layout and backend changes

12

slide-28
SLIDE 28

2018 – 2019: Layout and backend changes

13

slide-29
SLIDE 29

2018 – 2019: Layout and backend changes

14

slide-30
SLIDE 30

2018 – 2019: Addressing the backlog

June 2019 Addition of dialectometry maps from the Scherrer & Stoeckle (2016) paper (V3)

  • New import procedure as these experiments were carried out without Goebl’s VDM application

June 2019 New subprojects for ALF, AIS and SED, based on H. Goebl’s material

slide-32
SLIDE 32

2018 – 2019: New AIS subproject

16

slide-33
SLIDE 33

2018 – 2019: New ALF subproject

17

slide-34
SLIDE 34

2018 – 2019: New SED subproject

18

slide-35
SLIDE 35

2018 – 2019: Coming up

General:

  • Server change (hopefully transparent)
  • Decision on the future of the translation and identification demonstrator pages
  • Long-term goal: add crowd-sourcing and corpus projects

Swiss German:

  • Add single feature maps of V3 material, update download links (only dialectometrical visualisations are online now)

Italian / French / English:

  • Move to dedicated server
  • Multilingual versions of interface (DE, EN, FR, IT)
  • Additional visualisations: correlation maps, single feature maps (?)

slide-38
SLIDE 38

Technical details

  • All online visualisations are based on static GeoJSON and dynamically generated JSON files
  • The browser console lists all files as they are loaded
  • Source code is on a private GitHub repository, but can be made available
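One way such a split between static and dynamic files can work is sketched below: the static GeoJSON holds the geometries, a separately generated JSON file holds one value per location, and the two are joined by feature id. The ids, the file contents, and the property name `value` are hypothetical; the actual file layout of the site is not described in the slides.

```python
import json

# Static part: geometries only (hypothetical ids and coordinates).
static_geojson = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "id": "p1",
   "geometry": {"type": "Point", "coordinates": [7.45, 46.95]},
   "properties": {}},
  {"type": "Feature", "id": "p2",
   "geometry": {"type": "Point", "coordinates": [8.54, 47.37]},
   "properties": {}}
]}
""")

# Dynamic part: values for one visualisation, keyed by feature id.
dynamic_values = json.loads('{"p1": 0.9, "p2": 0.1}')


def attach_values(geojson, values, key="value"):
    """Copy each feature's value into its properties (None if missing)."""
    for feature in geojson["features"]:
        feature["properties"][key] = values.get(feature["id"])
    return geojson


merged = attach_values(static_geojson, dynamic_values)
print([f["properties"]["value"] for f in merged["features"]])  # [0.9, 0.1]
```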

slide-39
SLIDE 39

Conclusions

Have a look at www.dialektkarten.ch!

And let me know if there is anything that does not meet your expectations: yves.scherrer@helsinki.fi
