SLIDE 1 News from dialektkarten.ch
Yves Scherrer Department of Digital Humanities, University of Helsinki
- 2. VerbaAlpina-Arbeitstagung, Munich, 18 June 2019
SLIDE 2
2008 – 2012: Where it all started
SLIDE 3 2008 – 2012: Standard → dialect machine translation
Generative dialectology (Veith 1970, 1982)
- Transformation rules derive a multitude of dialect systems D_i from a single reference system B:
- #Töpfer#_B → #Häfner#_{D 33333−46999}
My proposal:
- D: Swiss German dialects
- B: Modern High German (“Standard German”)
- Most practical, but not historically correct
- Dialects are not represented as discrete numbered entities, but as probability maps
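The proposal can be illustrated with a minimal sketch in which the dialect side of a transformation rule is a set of probability maps instead of a numbered dialect range. The class name `TransferRule`, the site ids and all probability values are invented for illustration and are not taken from the SDS or from the original system.

```python
from dataclasses import dataclass

# A probability map assigns each location (here: a hypothetical site id)
# the probability that a given variant is used there.
ProbabilityMap = dict[str, float]


@dataclass
class TransferRule:
    source: str                          # form in the reference system B
    variants: dict[str, ProbabilityMap]  # dialect form -> probability map

    def apply(self, site: str) -> str:
        """Return the most probable dialect form at the given site."""
        return max(
            self.variants,
            key=lambda form: self.variants[form].get(site, 0.0),
        )


# Invented values for two made-up sites (assumption, not SDS data).
rule = TransferRule(
    source="Töpfer",
    variants={
        "Häfner": {"site_A": 0.9, "site_B": 0.2},
        "Töpfer": {"site_A": 0.1, "site_B": 0.8},
    },
)

form_at_a = rule.apply("site_A")
form_at_b = rule.apply("site_B")
print(form_at_a, form_at_b)
```

Choosing the most probable variant is only one possible policy; sampling a variant in proportion to its probability would be an equally plausible reading of "probability maps".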
SLIDE 5 2008 – 2012: Standard → dialect machine translation
Probability maps are extracted from the Sprachatlas der deutschen Schweiz (SDS):
1. Scan original maps
2. Digitize
3. Interpolate (kernel density estimation)
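The interpolation step can be sketched as follows, assuming one Gaussian kernel density estimate per variant, normalised across variants so that the values at each point sum to 1. The slides do not specify the kernel, the bandwidth or the normalisation, so all three are assumptions here, and the coordinates and variant labels are invented.

```python
import math


def gaussian_kernel(distance_km, bandwidth_km):
    """Unnormalised Gaussian kernel weight for a given distance."""
    return math.exp(-0.5 * (distance_km / bandwidth_km) ** 2)


def variant_probabilities(point, observations, bandwidth_km=10.0):
    """Estimate variant probabilities at `point`.

    observations: list of ((x_km, y_km), variant) pairs, one per
    digitized inquiry point. The bandwidth is a hypothetical choice.
    """
    density = {}
    for (x, y), variant in observations:
        distance = math.hypot(point[0] - x, point[1] - y)
        density[variant] = density.get(variant, 0.0) + gaussian_kernel(
            distance, bandwidth_km
        )
    total = sum(density.values())
    if total == 0.0:
        # All kernel weights underflowed: fall back to a uniform guess.
        return {variant: 1.0 / len(density) for variant in density}
    return {variant: weight / total for variant, weight in density.items()}


# Invented inquiry points on a line, with two variants "a" and "b".
observations = [((0.0, 0.0), "a"), ((10.0, 0.0), "a"), ((40.0, 0.0), "b")]
near_a = variant_probabilities((5.0, 0.0), observations)
near_b = variant_probabilities((40.0, 0.0), observations)
print(near_a, near_b)
```

Evaluating `variant_probabilities` on a regular grid of points would yield one probability surface per variant.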
SLIDE 9 2010: Interactive online maps
[Figure labels: Digitization, Export, Interpolation, Export]
All operations were carried out with the ArcMap geographic information system.
- Y. Scherrer (2010): Des cartes dialectologiques numérisées pour le TALN. Proceedings of TALN, Montréal.
SLIDE 10 2010: Interactive online demonstrators
Machine translation:
- Standard German text; coordinates of the target dialect
- Lemmatization / tagging / syntactic analysis
- Analysed Standard German text
- Phonetic / lexical / syntactic transfer rules
- Dialect text
Dialect identification:
- Standard German lexicon with morphological annotations
- Phonetic / lexical transfer rules
- Dialect lexicon with morphological annotations. Each entry is associated with a map.
- Dialect text
- Lexicon lookup and combination of maps
- Probability map
Source: Y. Scherrer (2010): Des cartes dialectologiques numérisées pour le TALN. Proceedings of TALN, Montréal.
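One way to picture the map combination step of the dialect identification demonstrator is to multiply, site by site, the probability maps of all words found in the lexicon and then renormalise. The slides only state that maps are combined; the product rule, the smoothing floor and the function name `combine_maps` are assumptions made for this sketch, and the values are invented.

```python
import math


def combine_maps(word_maps, floor=1e-6):
    """Combine per-word probability maps into one map over sites.

    word_maps: non-empty list of dicts mapping site id -> probability.
    Probabilities are multiplied in log space; `floor` avoids log(0)
    for sites where a word variant is unattested.
    """
    sites = set().union(*(m.keys() for m in word_maps))
    log_scores = {
        site: sum(math.log(max(m.get(site, 0.0), floor)) for m in word_maps)
        for site in sites
    }
    peak = max(log_scores.values())
    weights = {site: math.exp(score - peak) for site, score in log_scores.items()}
    total = sum(weights.values())
    return {site: weight / total for site, weight in weights.items()}


# Two invented word maps over two made-up sites.
combined = combine_maps([
    {"site_A": 0.9, "site_B": 0.2},
    {"site_A": 0.7, "site_B": 0.4},
])
print(combined)
```

The site with the highest combined value would then be the most plausible origin of the dialect text under these assumptions.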
SLIDE 11
2010: Interactive online maps
Map rendering engine: Google Maps
SLIDE 12
2012 – 2016: What happened later
SLIDE 13 2012 – 2016: Dialectometry
2011–2012 First experiments with dialectometry
- Y. Scherrer (2014): Computerlinguistische Experimente für die schweizerdeutsche Dialektlandschaft: Maschinelle Übersetzung und Dialektometrie. In D. Huck (Ed.) Alemannische Dialektologie: Dialekte im Kontakt (Beiträge zur 17. Arbeitstagung für alemannische Dialektologie in Strassburg), ZDL Beihefte 155, 261-278. Stuttgart: Franz Steiner Verlag.
2012–2013 Collaboration with H. Goebl (Salzburg) to provide dialectometric computations and visualisations
- H. Goebl, Y. Scherrer & P. Smečka (2013): Kurzbericht über die Dialektometrisierung des Gesamtnetzes des „Sprachatlasses der deutschen Schweiz“ (SDS). In K. Schneider-Wiejowski, B. Kellermeier-Rehbein & J. Haselhuber (Eds.) Vielfalt, Variation und Stellung der deutschen Sprache. Berlin, Boston: De Gruyter Mouton, 153-176. (V1)
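For readers unfamiliar with dialectometry, the core computation can be sketched as a site-by-site similarity matrix. The measure below is a simple percentage of shared variants, in the spirit of a relative identity value; whether it matches the exact measure used in these experiments is an assumption, and the site names, map names and variants are invented.

```python
from itertools import combinations


def relative_identity(site_a, site_b):
    """Percentage of maps on which two sites show the same variant.

    Maps with missing data (None) at either site are skipped.
    Returns None if the two sites share no usable map.
    """
    usable = [
        m for m in site_a
        if m in site_b and site_a[m] is not None and site_b[m] is not None
    ]
    if not usable:
        return None
    same = sum(1 for m in usable if site_a[m] == site_b[m])
    return 100.0 * same / len(usable)


# Invented data: variant recorded per map at three made-up sites.
sites = {
    "A": {"m1": "x", "m2": "y", "m3": "z", "m4": "w", "m5": None},
    "B": {"m1": "x", "m2": "q", "m3": "z", "m4": "v", "m5": "u"},
    "C": {"m1": "x", "m2": "y", "m3": "z", "m4": "w", "m5": "u"},
}

similarity = {
    (a, b): relative_identity(sites[a], sites[b])
    for a, b in combinations(sorted(sites), 2)
}
print(similarity)
```

Each row of such a matrix can be drawn as one similarity map with a chosen reference site.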
SLIDE 15 2012 – 2016: Dialectometry
2013–2014 MA thesis at University of Zurich: main focus on dialectometry, includes additional digitized SDS maps and syntax data from the SADS project
- S. Kellerhals (2014): Dialektometrische Analyse und Visualisierung von schweizerdeutschen Dialekten auf verschiedenen linguistischen Ebenen. Masterarbeit, Geographisches Institut der Universität Zürich. (V2)
2013 Interactive online interface for dialectometrical visualisations
2014 Domain www.dialektkarten.ch goes live
SLIDE 17
2012 – 2016: Dialectometry
SLIDE 18 2012 – 2016: Dialectometry
2015–2016 Continuation of dialectometry experiments with P. Stoeckle: additional SDS maps, new selection of SADS material
- Y. Scherrer & P. Stoeckle (2016): A quantitative approach to Swiss German – Dialectometric analyses and comparisons of linguistic levels. Dialectologia et Geolinguistica, 24(1), 92-125. (V3)
[Figure: bar chart for V1 (2013), V2 (2014) and V3 (2016) over Phonology (SDS I–II), Morphology (SDS III) and Lexicon (SDS IV–VIII); axis ticks 20–120; bar labels as extracted: 27, 2, 28, 9, 11, 64, 105, 36]
The V3 maps did not make it to the online version…
2017 First contacts with H. Goebl to replace the Java-based visualisation solution with the Dialektkarten backend
SLIDE 22
2018 – 2019: The real news
SLIDE 23 2018 – 2019: Layout and backend changes
July 2018 www.dialektkarten.ch breaks due to changes in the Google Maps terms of service
Jan. 2019 Comprehensive update with mostly under-the-hood changes:
- Three applications with same layout: map viewer, translation demonstrator, dialectometry viewer
- Change from Google Maps API to Leaflet API with Stamen maps (based on OSM)
- Change from KML (containing style) to GeoJSON (no style)
- New architecture to accommodate different projects using the same scripts
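The move from KML to GeoJSON can be illustrated with a minimal conversion sketch that keeps the geometry and name of each placemark and drops the style information. This is not the site's actual conversion script: the sample KML, the function name `placemark_to_feature` and the restriction to point placemarks are assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Invented sample: one styled point placemark.
KML_TEXT = """
<kml xmlns="http://www.opengis.net/kml/2.2"><Document>
  <Style id="red"><IconStyle><color>ff0000ff</color></IconStyle></Style>
  <Placemark>
    <name>Bern</name>
    <styleUrl>#red</styleUrl>
    <Point><coordinates>7.4474,46.9480,0</coordinates></Point>
  </Placemark>
</Document></kml>
"""


def placemark_to_feature(placemark):
    """Convert a KML point placemark to a GeoJSON Feature, ignoring style."""
    name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
    coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
    # KML stores "lon,lat[,alt]"; GeoJSON positions are [lon, lat].
    lon, lat = (float(value) for value in coords.strip().split(",")[:2])
    return {
        "type": "Feature",
        "properties": {"name": name},
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
    }


root = ET.fromstring(KML_TEXT)
collection = {
    "type": "FeatureCollection",
    "features": [
        placemark_to_feature(p)
        for p in root.iterfind(".//kml:Placemark", KML_NS)
    ],
}
print(json.dumps(collection, indent=2))
```

With style removed from the data, colours and symbols have to be decided at rendering time, which is what allows the same files to serve several viewers.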
SLIDE 27
2018 – 2019: Layout and backend changes
SLIDE 28
2018 – 2019: Layout and backend changes
SLIDE 29
2018 – 2019: Layout and backend changes
SLIDE 30 2018 – 2019: Addressing the backlog
June 2019 Addition of dialectometry maps from the Scherrer & Stoeckle (2016) paper (V3)
- New import procedure as these experiments were carried out without Goebl’s VDM application
June 2019 New subprojects for ALF, AIS and SED, based on
SLIDE 32
2018 – 2019: New AIS subproject
SLIDE 33
2018 – 2019: New ALF subproject
SLIDE 34
2018 – 2019: New SED subproject
SLIDE 35 2018 – 2019: Coming up
General:
- Server change (hopefully transparent)
- Decision on the future of the translation and identification demonstrator pages
- Long-term goal: add crowd-sourcing and corpus projects
Swiss German:
- Add single feature maps of V3 material, update download links (only dialectometrical visualisations are online now)
Italian / French / English:
- Move to dedicated server
- Multilingual versions of interface (DE, EN, FR, IT)
- Additional visualisations: correlation maps, single feature maps (?)
SLIDE 38 Technical details
- All online visualisations are based on static GeoJSON and dynamically generated JSON files
- The browser console lists all files as they are loaded
- Source code is on a private GitHub repository, but can be made available
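A rough sketch of how a static GeoJSON file and a dynamically generated JSON file might be brought together at display time: the geometry stays fixed, and per-site values decide the fill colour. The field names (`id`, `values`, `fillColor`), the colour ramp and the sample data are all hypothetical; the actual file layout of the site is not described in the slides.

```python
import json

# Static part: geometry only, no style (invented sample).
STATIC_GEOJSON = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"id": 1},
   "geometry": {"type": "Point", "coordinates": [7.45, 46.95]}},
  {"type": "Feature", "properties": {"id": 2},
   "geometry": {"type": "Point", "coordinates": [8.54, 47.37]}},
  {"type": "Feature", "properties": {"id": 3},
   "geometry": {"type": "Point", "coordinates": [7.59, 47.56]}}
]}
""")

# Dynamic part: one value per site id, e.g. a similarity score (invented).
DYNAMIC_JSON = json.loads('{"values": {"1": 82.5, "2": 40.0}}')


def colour_for(value, lo, hi):
    """Linear blue-to-red ramp between the lowest and highest value."""
    t = 0.0 if hi == lo else (value - lo) / (hi - lo)
    red = round(255 * t)
    blue = 255 - red
    return f"#{red:02x}00{blue:02x}"


def styles_by_site(geojson, values, missing="#cccccc"):
    """Map each feature id to a style dict; sites without a value are grey."""
    lo, hi = min(values.values()), max(values.values())
    styles = {}
    for feature in geojson["features"]:
        site_id = str(feature["properties"]["id"])
        if site_id in values:
            styles[site_id] = {"fillColor": colour_for(values[site_id], lo, hi)}
        else:
            styles[site_id] = {"fillColor": missing}
    return styles


styles = styles_by_site(STATIC_GEOJSON, DYNAMIC_JSON["values"])
print(styles)
```

In a browser the same join would typically happen in the style callback of the map library rather than in Python.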
SLIDE 39
Conclusions
Have a look at www.dialektkarten.ch!
And let me know if there is anything that does not meet your expectations: yves.scherrer@helsinki.fi