Automatic Itinerary Reconstruction from Texts
L. Moncla (LIUPPA, IAAA), M. Gaio (LIUPPA), S. Mustière (COGIT)
ludovic.moncla@univ-pau.fr
GIScience 2014
1 Introduction
2 Solution adopted
3 Implementation
4 Conclusion
1 Introduction
Objectives
Traverser Champagny-le-Haut et contourner le hameau de Friburge. Vous apercevrez le Lac de la Plagne puis marcher jusqu’au refuge au sud du lac de Grattaleu.
Who understands?
Cross Champagny-le-Haut and get around from the north of the hamlet of Friburge. You will see the Lac de la Plagne, then walk to the refuge south of lake Grattaleu.
Better? Where is it?
And now?
From text...
Traverser Champagny-le-Haut et contourner le hameau de Friburge. Vous apercevrez le Lac de la Plagne puis marcher jusqu’au refuge au sud du lac de Grattaleu.
...to map.
Automatic itinerary reconstruction from texts
Main problems
1 Expression of space and motion in language [Talmy, 1985]
2 Automatic information extraction
3 Assigning location to spatial information
4 Route calculation
Corpus
Experimental corpus
Reference corpus
2 Solution adopted
Geoparsing
Automatic annotation of spatial expressions
Expanded spatial named entities (ESNE)
1: left side of the Danube river
2: the refuge south of lake Grattaleu
Expression of motion
VTo structures
1: Reach the left side of the Danube river
2: Walk to the refuge south of lake Grattaleu
Polarity of motion verbs
Initial verbs:
Final verbs:
Median verbs:
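One possible in-memory representation of these annotations is sketched below. The class and field names (`ESNE`, `VTo`, `Polarity`, `feature_type`, `reference`, ...) are illustrative assumptions and not the authors' data model. The glosses on the three polarities follow the usual reading of initial, median and final motion verbs, and classing "walk to" as final is likewise an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Polarity(Enum):
    """Polarity of a motion verb (hypothetical encoding)."""
    INITIAL = "initial"  # focus on the place that is left
    MEDIAN = "median"    # focus on the place that is passed through
    FINAL = "final"      # focus on the place that is reached


@dataclass
class ESNE:
    """Expanded spatial named entity: a place, optionally located relative to another."""
    feature_type: Optional[str] = None      # e.g. "refuge", "lake"
    name: Optional[str] = None              # proper name, if any
    spatial_relation: Optional[str] = None  # e.g. "south of"
    reference: Optional["ESNE"] = None      # entity the relation points to


@dataclass
class VTo:
    """A motion verb together with the ESNE it applies to."""
    verb: str
    polarity: Polarity
    target: ESNE


# Example 2: "Walk to the refuge south of lake Grattaleu"
lake = ESNE(feature_type="lake", name="Grattaleu")
refuge = ESNE(feature_type="refuge", spatial_relation="south of", reference=lake)
step = VTo(verb="walk", polarity=Polarity.FINAL, target=refuge)  # FINAL is assumed
```

Nesting one ESNE inside another keeps the reference place ("lake Grattaleu") available for geocoding even when the target itself ("the refuge") has no proper name.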
Itinerary calculation
Combine language and spatial analysis to order places and reconstruct the path
destination)
Minimum weight spanning tree
FIGURE: Minimum weight spanning tree (source: Wikipedia)
3 Implementation
FIGURE: Block diagram of our processing chain
Geoparsing
Finite-state transducers cascade
Our linguistic rules are implemented as a cascade of finite-state transducers
Input example:
{Walk,.V} {to,.PREP} {the,.ART} {refuge,.N} {south,.N} {of,.PREP} {lake,.N} {Grattaleu,.NPr} {.,.PUN}
Main transducers
1 Indirections (spatial relations)
2 Candidate toponyms (+ sub-type)
3 ESNE
4 Motion, perception (VTo Structures)
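The idea of a cascade, where each stage works on the output of the previous one, can be illustrated with ordered regular-expression rewrites. This is only a minimal sketch: regular expressions stand in for the actual transducers, the four patterns are drastically simplified, and the markup labels (`rel`, `topo`, `esne`, `vto`) and function names are made up for the example.

```python
import re

# '{Walk,.V}' -> ('Walk', 'V'); the tagged input format is the one shown above.
TOKEN = re.compile(r"\{([^,{}]+),\.(\w+)\}")


def to_tokens(tagged: str) -> str:
    """Rewrite '{word,.TAG} ...' as 'word/TAG ...' to simplify the patterns."""
    return " ".join(f"{word}/{tag}" for word, tag in TOKEN.findall(tagged))


def _wrap(label):
    """Build a replacement function that encloses the whole match in <label>."""
    return lambda m: f"<{label}>{m.group(0)}</{label}>"


def _toponym(m):
    """Merge an optional feature noun and a proper noun into one candidate toponym."""
    feature, name = m.group(1), m.group(2)
    return f"<topo>{feature + ' ' if feature else ''}{name}</topo>"


# Order matters: stage 3 matches the markup of stages 1 and 2, stage 4 that of stage 3.
CASCADE = [
    # 1. spatial relations ("south of")
    (re.compile(r"\b(north|south|east|west)/N of/PREP"), r"<rel>\1 of</rel>"),
    # 2. candidate toponyms, with an optional sub-type noun ("lake Grattaleu")
    (re.compile(r"(?:(\w+)/N )?([^\s/]+)/NPr"), _toponym),
    # 3. ESNE: [article] [noun + relation] toponym
    (re.compile(r"(?:\w+/ART )?(?:\w+/N <rel>[^<]*</rel> )?<topo>[^<]*</topo>"),
     _wrap("esne")),
    # 4. VTo: verb [preposition] ESNE
    (re.compile(r"\w+/V (?:\w+/PREP )?<esne>.*?</esne>"), _wrap("vto")),
]


def annotate(tagged: str) -> str:
    """Run every stage of the cascade in order and return the annotated text."""
    text = to_tokens(tagged)
    for pattern, replacement in CASCADE:
        text = pattern.sub(replacement, text)
    return text


SENTENCE = ("{Walk,.V} {to,.PREP} {the,.ART} {refuge,.N} {south,.N} "
            "{of,.PREP} {lake,.N} {Grattaleu,.NPr} {.,.PUN}")
print(annotate(SENTENCE))
```

On the input example, the four stages successively mark "south of", "lake Grattaleu", "the refuge south of lake Grattaleu" and finally the whole "Walk to ..." structure, which mirrors the order of the main transducers listed above.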
Example of transducer
FIGURE: Example of transducer in the Unitex platform
Result of geoparsing
Cross Champagny-le-Haut and get around from the north of the hamlet of Friburge. You will see the Lac de la Plagne, then walk to the refuge south of lake Grattaleu.
Execution of the transducer cascade
1 Indirections (spatial relations)
2 Candidate toponyms (+ sub-type)
3 ESNE
4 Motion, perception (VTo Structures)
Preprocessing            (a)   (b)   (c)   Precision   Recall
Automatic POS            583   581   595   90.02%      94.68%
POS manually corrected   583   559   636   98.29%      99.42%
TABLE: Evaluation of our geoparsing process
(a) number of toponyms manually annotated
(b) number of relevant toponyms automatically annotated
(c) number of toponyms automatically annotated
Itinerary calculation
Geocoding
Querying of gazetteers
Toponym ambiguities
fine-grain toponyms)
Referent ambiguity: the same name is used for several places
Structural ambiguity: ambiguity about which words make up the name
FR         EN        Frequency
col        pass      20
village    town      20
hameau     hamlet    20
rue        road      17
chemin     path      15
chalet     cottage   13
refuge     refuge    11
pont       bridge    11
lac        lake      8
chapelle   chapel    8
TABLE: Most frequent terms associated with toponyms
Toponym disambiguation
Result of geocoding + real GPS track
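To make the referent ambiguity problem concrete, here is a minimal sketch of one simple proximity heuristic: toponyms with a single gazetteer candidate serve as anchors, and each ambiguous toponym keeps the candidate closest to the anchors' centroid. This is not necessarily the method used in this work; the function name `disambiguate` is hypothetical and the coordinates are invented planar values, not real locations.

```python
from math import dist
from statistics import mean


def disambiguate(candidates):
    """Pick one location per toponym.

    `candidates` maps each toponym to the list of (x, y) positions returned
    by a gazetteer. Toponyms with a single candidate act as anchors; every
    toponym takes its candidate closest to the centroid of the anchors.
    """
    anchors = [points[0] for points in candidates.values() if len(points) == 1]
    if not anchors:
        raise ValueError("at least one unambiguous toponym is required")
    centroid = (mean(x for x, _ in anchors), mean(y for _, y in anchors))
    return {
        name: min(points, key=lambda p: dist(p, centroid))
        for name, points in candidates.items()
    }


# Invented planar coordinates (km), for illustration only.
candidates = {
    "Champagny-le-Haut": [(0.0, 0.0)],
    "Friburge": [(1.0, 1.0)],
    "Lac de la Plagne": [(2.0, 0.0), (300.0, 200.0)],  # two places share the name
}
resolved = disambiguate(candidates)
```

The heuristic relies on the assumption that all places in one itinerary description lie close to each other, which is reasonable for a hiking route but would need revisiting for long-distance travel.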
Automatic itinerary reconstruction
Minimum spanning tree (Euclidean distances)
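As a reminder of what this step computes, the sketch below builds a minimum spanning tree over a set of geocoded points with Prim's algorithm and Euclidean distances. It assumes the points are given in a projected, planar coordinate system; the function name and the sample points are illustrative, and the choice of Prim's algorithm is ours rather than something stated in the slides.

```python
from math import dist


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over `points`.

    Returns the tree as a list of (i, j) index pairs, in the order in which
    the edges are added, starting from point 0.
    """
    if not points:
        return []
    # best[j] = (distance, i): cheapest known link from tree node i to outside node j
    best = {j: (dist(points[0], points[j]), 0) for j in range(1, len(points))}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])  # closest node not yet in the tree
        _, i = best.pop(j)
        edges.append((i, j))
        for k in best:  # node j may now offer a cheaper link to the remaining nodes
            d = dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


# Invented planar coordinates, for illustration only.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (1.0, 1.0)]
tree = minimum_spanning_tree(points)
total_length = sum(dist(points[i], points[j]) for i, j in tree)
```

A tree over n points always has n - 1 edges, so every geocoded place is connected while the total length stays minimal.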
Minimum spanning tree using information extracted from the text
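One way textual information could be combined with the tree is sketched here, under the assumption that the text analysis has identified a departure and a destination (for instance through initial and final motion verbs): the itinerary is then the unique path between them in the tree. The function name `path_in_tree` and the tree below are hypothetical and serve only as an illustration.

```python
from collections import defaultdict, deque


def path_in_tree(edges, start, goal):
    """Return the unique path from `start` to `goal` in a tree given as edge pairs."""
    adjacency = defaultdict(list)
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    parent = {start: None}
    queue = deque([start])
    while queue:  # breadth-first search from the departure
        node = queue.popleft()
        if node == goal:
            break
        for neighbour in adjacency[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    if goal not in parent:
        raise ValueError("start and goal are not connected")
    path = []
    node = goal
    while node is not None:  # walk back from the destination to the departure
        path.append(node)
        node = parent[node]
    return path[::-1]


# Hypothetical tree over the places of the running example.
edges = [
    ("Champagny-le-Haut", "Friburge"),
    ("Friburge", "Lac de la Plagne"),
    ("Friburge", "refuge south of lake Grattaleu"),
]
itinerary = path_in_tree(edges, "Champagny-le-Haut", "refuge south of lake Grattaleu")
```

In this made-up tree the lake ends up on a side branch rather than on the path, which is consistent with a place that is only seen from the route and not walked through.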
FIGURE: Automatic itinerary vs. real GPS trace
Online geoparsing and geocoding tool
http://erig.univ-pau.fr/PERDIDO/
FIGURE: Screenshot of the online geoparsing and geocoding tool
4 Conclusion
Expanded geoparsing process
displacement, perception, etc.
Geocoding and itinerary reconstruction
reached, etc)
Mixing spatial and textual analysis
Work in progress
Multilingual
Toponym disambiguation
Applications with Noise)
Full paper accepted at ACM SIGSPATIAL 2014
Outlook
Improve the route calculation using other information
The use of temporal information
Preliminary experiments
Thank you for your attention
CONTACT
ludovic.moncla@univ-pau.fr
http://erig.univ-pau.fr/PERDIDO/