Visual Analytics for Linguists
Miriam Butt & Chris Culy, ESSLLI 2014, Introductory Course, Tübingen
Course Overview
Day 1: LingVis; First Look at Possible Visualizations for Linguistics; Basics of Visualization (Theory)
[Figure: human abilities vs. abilities of the computer. Labels: General Knowledge, Creativity, Logic, Data Storage, Numerical Computation, Planning, Prediction, Diagnosis, Searching, Perception]
The 8 visual variables (Bertin 1982)
detect distributional patterns in language data.
analysis and visualization (e.g., Python, R).
task modelling, algorithmic processing, statistical analyses
investigate interactively mapping to visual variables, design, layout algorithms
(within words — again an approximation)
vowels: normalized association strength value ϕ.
Languages like Spanish or German pattern differently.
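For reference, if ϕ here is the standard phi coefficient of a 2×2 contingency table (an assumption; the slide does not give the formula), it can be written for an ordered vowel pair (v1, v2) as below. The names a, b, c, d are illustrative: a counts successions of v1 followed by v2, b successions of v1 followed by some other vowel, c successions of some other vowel followed by v2, and d all remaining successions.

```latex
\phi(v_1, v_2) = \frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}
```

Under this reading, values near +1 mean the pair occurs more often than chance would predict, values near -1 mean it is avoided, and 0 means no association.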
[Figure panels: Turkish, Spanish, German, Hungarian]
Pipeline (example: Finnish): Counting Vowel Successions in all Bible Types; Statistics & Visualization; Sorting
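The three steps can be sketched in Python as follows. This is a minimal illustration, not the implementation behind the slides: the vowel inventory, the function names, and the toy word list are assumptions, ϕ is taken to be the standard phi coefficient of a 2×2 table, and the sorting step is a simple nearest-neighbour ordering of the ϕ row profiles standing in for whatever sorting the original tool uses.

```python
from collections import Counter
from itertools import product
from math import sqrt

VOWELS = "aeiouyäö"  # assumption: Finnish orthographic vowels


def vowel_successions(word_types):
    """Count ordered pairs of vowels that follow each other within a word,
    skipping any consonants in between."""
    counts = Counter()
    for word in word_types:
        vs = [ch for ch in word.lower() if ch in VOWELS]
        counts.update(zip(vs, vs[1:]))
    return counts


def phi_matrix(counts):
    """Return (vowels, phi), where phi[(v1, v2)] is the phi coefficient
    of the 2x2 table 'first vowel is v1' x 'second vowel is v2'."""
    n = sum(counts.values())
    first, second = Counter(), Counter()
    for (v1, v2), c in counts.items():
        first[v1] += c
        second[v2] += c
    vowels = sorted(set(first) | set(second))
    phi = {}
    for v1, v2 in product(vowels, repeat=2):
        a = counts[(v1, v2)]   # v1 followed by v2
        b = first[v1] - a      # v1 followed by another vowel
        c = second[v2] - a     # another vowel followed by v2
        d = n - a - b - c      # everything else
        denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
        phi[(v1, v2)] = (a * d - b * c) / denom if denom else 0.0
    return vowels, phi


def sort_by_profile(vowels, phi):
    """Greedy ordering: start with the first vowel, then repeatedly append
    the remaining vowel whose phi row is closest to the last one placed."""
    def row(v):
        return [phi[(v, w)] for w in vowels]

    def distance(u, v):
        return sqrt(sum((x - y) ** 2 for x, y in zip(row(u), row(v))))

    order, rest = [vowels[0]], set(vowels[1:])
    while rest:
        nxt = min(sorted(rest), key=lambda v: distance(order[-1], v))
        order.append(nxt)
        rest.remove(nxt)
    return order


# Toy input (hypothetical word types, not the Bible data from the slides)
counts = vowel_successions(["talo", "kala", "katu", "pöly", "kylä", "syvä"])
vowels, phi = phi_matrix(counts)
order = sort_by_profile(vowels, phi)
```

With real data, the sorted order would be used for both the rows and the columns of the matrix, so that vowels with similar succession behaviour end up next to each other.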
[Figure panels: Hungarian, Breton, Ukrainian, Tagalog, Finnish, Indonesian, Turkish, Maori, Warlpiri]
[Plot: Average Deviation of Matrix Entries from Gold Standard (y-axis, 0.00 to 0.10) against Number of Different Types (x-axis, 500 to 1500)]
Mayer, Thomas and Christian Rohrdantz. 2013. PhonMatrix: Visualizing co-occurrence constraints in sounds. In Proceedings of the ACL 2013 System Demonstration.
X,kar,ho,hu,rakh
hAsil,0.771,0.222,0.0070,0.0
bAt,0.853,0.147,0.0,0.0
istamAl,0.873,0.121,0.0060,0.0
kOSiS,0.823,0.177,0.0,0.0
band,0.695,0.261,0.0,0.045
hamlah,0.79,0.064,0.146,0.0
zAhir,0.699,0.289,0.012,0.0
sAmnA,0.686,0.301,0.013,0.0
....
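A small Python sketch of how such a table could be read and compared row by row. It rests on assumptions: each row is read as one noun's relative frequencies with the four light verbs (the rows sum to roughly 1), and the nearest-neighbour lookup is only a stand-in for the clustering in the system cited on these slides. The function names `load` and `nearest` are hypothetical.

```python
import csv
from io import StringIO
from math import dist  # Euclidean distance, Python 3.8+

# The eight rows shown on the slide
RAW = """X,kar,ho,hu,rakh
hAsil,0.771,0.222,0.0070,0.0
bAt,0.853,0.147,0.0,0.0
istamAl,0.873,0.121,0.0060,0.0
kOSiS,0.823,0.177,0.0,0.0
band,0.695,0.261,0.0,0.045
hamlah,0.79,0.064,0.146,0.0
zAhir,0.699,0.289,0.012,0.0
sAmnA,0.686,0.301,0.013,0.0
"""


def load(raw):
    """Parse the CSV text into (verbs, {noun: [value per verb]})."""
    reader = csv.reader(StringIO(raw))
    header = next(reader)
    verbs = [h for h in header[1:] if h]  # tolerate a trailing empty column
    table = {
        row[0]: [float(x) for x in row[1:len(verbs) + 1]]
        for row in reader
        if row
    }
    return verbs, table


def nearest(noun, table):
    """Return the other noun whose light-verb profile is closest."""
    return min(
        (n for n in table if n != noun),
        key=lambda n: dist(table[noun], table[n]),
    )


verbs, table = load(RAW)
closest_to_zahir = nearest("zAhir", table)
```

Nouns with similar profiles, such as zAhir and sAmnA here, are the ones a clustering step would be expected to group together.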
[Figure labels, glosses: become, put, do, be; achievement, announcement, talk, beginning]
[Figure axis labels: do, be, become, put]
Andreas Lamprecht, Annette Hautli, Christian Rohrdantz, Tina Bögel. 2013. A Visual Analytics System for Cluster Exploration. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, System Demo, 109–114, Sofia, Bulgaria.
Norwegian shows language change a → e in comparison to Swedish.
Christian Rohrdantz, Michael Hund, Thomas Mayer, Bernhard Wälchli and Daniel A. Keim. 2012. The World’s Languages Explorer: Visual Analysis of Language Features in Genealogical and Areal Contexts. Computer Graphics Forum 31(3), 935–944.
Each circle segment represents one language; each ring shows the values of one feature across all languages. Comparing 126 languages of Papua New Guinea based on the New Testament.
Bringing genealogy (left) and areal distributions (right) interactively into context: The values of a selected feature ring are color-coded on a map for exploration.
Thomas Mayer, Bernhard Wälchli, Christian Rohrdantz and Michael Hund.
analytics of heterogeneous areal-typological datasets. In B. Nolan and C. Periñán-Pascual (eds.), Language Processing and Grammars: The role of functionally oriented computational models, 13–38. John Benjamins.
Daniel Angus, Andrew E. Smith, Janet Wiles: Conceptual Recurrence Plots: Revealing Patterns in Human Discourse. IEEE Trans. Vis.
Saturation shows how much b relates to a (content-wise)
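The idea of encoding relatedness as saturation can be sketched as a pairwise similarity matrix over utterances. The sketch below is an illustration under assumptions: it uses plain bag-of-words cosine similarity as a stand-in for the concept-based similarity of the cited paper, and the function names and example utterances are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Bag-of-words cosine similarity between two utterances, in [0, 1]."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def recurrence_matrix(utterances):
    """Cell [i][j] holds how strongly utterance j relates to utterance i;
    a plot would map this value to the saturation of the cell's colour."""
    return [[cosine(u, v) for v in utterances] for u in utterances]


# Hypothetical three-turn exchange
utterances = [
    "the vowels harmonize",
    "vowels harmonize in Turkish",
    "maps show areal patterns",
]
matrix = recurrence_matrix(utterances)
```

Cells with a value of 0 would be drawn fully unsaturated, and cells close to 1 fully saturated.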