SLIDE 1

AUTOMATIC IMPROVEMENT OF POINT-OF-INTEREST TAGS FOR OPENSTREETMAP DATA

Stefan Funke and Sabine Storandt

<node id="2955673661">
  <tag k="amenity" v="restaurant"/>
  <tag k="name" v="소백산양재별관"/>
</node>

SLIDE 21

A LITTLE TAGGING GAME

<tag k="amenity" v="restaurant"/>
<tag k="name" v="..."/>
<tag k="cuisine" v="???"/>

name                     cuisine
izumi sushi bar          sushi; japanese
toro blanco tapas bar    tapas; spanish
pizzaria bella italia    pizza; italian
50’s diner               burger; american
subway                   sandwich; american
chau’s wok               chinese
fresh fish buns          seafood
mykonos restaurant       greek
taj mahal                indian

⇒ Machine Learning to deduce new tags from the name tag automatically

SLIDE 25

EXTRAPOLATABLE TAGS

The OSM Wiki provides an overview of reasonable tags.
We only consider tags which occur at least 200 times in our data set.
25 out of over 1500 cuisine classes remained.

Not considered, e.g.:
home made cake (too specific)
german-bohemian (home-brewed)
bürgerliche küche (not in English)
music (wrong usage)
israelian (indeed rare)
chineese (wrong spelling)
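The frequency filter above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `frequent_values` and the toy data are assumptions, and only the threshold of 200 comes from the slide.

```python
from collections import Counter

def frequent_values(values, min_count=200):
    """Return the tag values that occur at least min_count times."""
    counts = Counter(values)
    return {value for value, n in counts.items() if n >= min_count}

# Toy data for illustration only (not taken from the talk).
cuisine_values = ["italian"] * 250 + ["greek"] * 200 + ["chineese"] * 3 + ["music"]
kept = frequent_values(cuisine_values)
print(sorted(kept))
```

Rare values such as misspellings fall below the threshold and are never offered as classes to the learner.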

SLIDE 30

FEATURE EXTRACTION

punjab moghul mahal indian palace simran shezan maharani taj palace rama aanjal namaskar bombay tandoori ganesha flavours of india satyam palace of india satluj safran saaz shaan taj mahal anmol carmen mehefil india haus maharani chai ji amaltas krishna (indisch) amrit surya maharadscha taste of india kashmir yogi haus jaipur ghandi badsha maharadscha shahi rama sitar maharaja indian place el sol badshah india house indian mango shalimar shivalik goa dhaba indira gandhi krishna delhi palace express shere punjab jai pur kashmirhaus dehli palace swagat shiva zum ratskeller indian garden the rambagh palace sher e punjab namaste shalimar bombay maharadscha maharani natraj radha shere punjab himalaya goa indian palace swagatam bella punjabi shanti shop curry king swagat shiva’s thali gasthaus adler shan lahori

N: the list of names above.

For every name in N we construct all k-grams with k = 3, ..., 10.
k-gram: substring of length k

example 'taj mahal', k=4: 'taj ', 'aj m', 'j ma', ' mah', 'maha', 'ahal'

For all k-grams we count their occurrences in N.

example: taj 2, maha 9

Significant k-grams are contained in at least 2% of names in N.
Prune k-grams that are substrings of other significant k-grams if the frequency is the same.

example:
onald 753
mc donald’s 753
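The extraction steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, each k-gram is counted once per name (the slide does not say whether repeated occurrences within one name count separately), and the pruning is a simple quadratic scan.

```python
from collections import Counter

def kgrams(name, k_min=3, k_max=10):
    """All distinct substrings of `name` with length k_min..k_max."""
    return {
        name[i:i + k]
        for k in range(k_min, k_max + 1)
        for i in range(len(name) - k + 1)
    }

def significant_kgrams(names, min_share=0.02):
    """Count in how many names each k-gram occurs; keep those
    contained in at least min_share of all names."""
    counts = Counter()
    for name in names:
        counts.update(kgrams(name))  # a set, so one count per name
    threshold = min_share * len(names)
    return {g: c for g, c in counts.items() if c >= threshold}

def prune(significant):
    """Drop a k-gram if it is a proper substring of another
    significant k-gram that has the same count."""
    return {
        g: c
        for g, c in significant.items()
        if not any(g != h and g in h and c == significant[h] for h in significant)
    }

# Toy name list for illustration only.
names = ["mc donald's", "mc donald's", "taj mahal"]
pruned = prune(significant_kgrams(names))
print(sorted(kgrams("taj mahal", 4, 4)))
print(sorted(pruned))
```

Note that with k capped at 10, an 11-character name such as "mc donald's" is represented by its two 10-grams rather than by the full string.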

SLIDE 34

MACHINE LEARNING

For each class, we have a list of indicator phrases with percentages.

class     phrase      percentage
indian    indian      14.35
          maha        12.69
          palace       6.50
italian   ria         25.50
          pizz        21.19
greek     gri         11.95
          tavern       9.02
          akropolis    4.56
chinese   china       26.30
          asia        10.98
          ing          8.49
          ang          7.79

For each name we construct a feature vector with one entry per phrase:
⇒ 0 if the phrase is not contained in the name.
⇒ Otherwise phrase length multiplied by percentage.

name              nonzero entries
indian mango      indian: 6 · 14.35 = 86.1, ang: 3 · 7.79 ≈ 23.4
pizzaria trulli   ria: 3 · 25.50 = 76.5, pizz: 4 · 21.19 ≈ 84.8

Machine Learning on tagged data.
⇒ Random Forest
⇒ returns probability distribution over all possible classes

name              indian   italian   greek   chinese
indian mango      90%      0%        0%      10%
pizzaria trulli   0%       100%      0%      0%
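The feature-vector rule can be checked with a short sketch. The percentages are the ones shown on the slide; the grouping of phrases into classes, the dictionary layout, and the name `feature_vector` are assumptions made for illustration. The classifier itself (the Random Forest) is not shown here.

```python
# Indicator phrases with percentages, grouped by class (grouping assumed).
INDICATORS = {
    "indian": {"indian": 14.35, "maha": 12.69, "palace": 6.50},
    "italian": {"ria": 25.50, "pizz": 21.19},
    "greek": {"gri": 11.95, "tavern": 9.02, "akropolis": 4.56},
    "chinese": {"china": 26.30, "asia": 10.98, "ing": 8.49, "ang": 7.79},
}

def feature_vector(name, indicators=INDICATORS):
    """One entry per (class, phrase): phrase length times percentage
    if the phrase occurs in the name, otherwise 0."""
    return {
        (cls, phrase): (len(phrase) * pct if phrase in name else 0.0)
        for cls, phrases in indicators.items()
        for phrase, pct in phrases.items()
    }

def nonzero(name):
    """Nonzero feature entries, rounded to one decimal for display."""
    return {k: round(v, 1) for k, v in feature_vector(name).items() if v}

print(nonzero("indian mango"))
print(nonzero("pizzaria trulli"))
```

Multiplying by the phrase length gives longer, more specific phrases more weight than short ones with a similar percentage.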

SLIDE 38

EXPERIMENTAL RESULTS

28,128 restaurants without cuisine tag.
Assigned ethnicity tag when probability > 75%, food type when probability 100%.
19,671 new ethnicity cuisine tags and 1,460 new food type cuisine tags.
Manual checks (500) showed an accuracy of 98%.

fischerklause                  c = seafood
la stella                      c = pizza
eiscafé rialto                 c = ice_cream
pizzeria italia                c = pizza
50’s diner                     c = burger
block house                    c = steak_house
calimero                       c = ice_cream
nordsee                        c = seafood
pizzeria marino                c = pizza
rosenburger hof                c = burger
nazar kebap stube              c = kebab
chilli peppers rock cafe       c = coffee_shop
eiscafé dolce vita             c = ice_cream
baguetterie filou              c = sandwich
classic western steakhouse     c = steak_house
shaki sushi                    c = sushi
cafe kamps                     c = coffee_shop
sakura sushi & grill           c = sushi
speisekammer                   c = ice_cream
döner haus                     c = kebab

schwaben-bräu                  c = german
ginnheimer wirtshaus           c = german
china imbiss drache            c = chinese
gameiro pizza-express          c = italian
taverna ilios                  c = greek
zur feurigen bratwurst         c = german
pizzeria italia                c = italian
kartoffelhaus                  c = german
pizzeria venezia               c = italian
einkehr                        c = german
sushi for friends              c = japanese
il capriccio                   c = italian
deutscher hof                  c = german
mykonos                        c = greek
sausalitos                     c = mexican
my thai                        c = thai
mr. kebab                      c = turkish
dschingis khan                 c = chinese
el paso                        c = mexican
café mallorca                  c = spanish
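The assignment rule (ethnicity when probability > 75%, food type when probability 100%) can be sketched as below. This is an illustration under assumptions: the function name `confident_class` and the `kind` parameter are hypothetical, and the talk does not say whether ethnicity and food type come from one classifier or two, so the sketch simply takes one probability distribution per call.

```python
def confident_class(probabilities, kind):
    """Return the most probable class if it passes the threshold
    for `kind` ("ethnicity" or "food_type"), otherwise None."""
    cls, p = max(probabilities.items(), key=lambda item: item[1])
    if kind == "ethnicity":
        return cls if p > 0.75 else None
    if kind == "food_type":
        return cls if p >= 1.0 else None
    raise ValueError(f"unknown kind: {kind!r}")

# Illustrative probability distributions, as a classifier might return them.
print(confident_class({"indian": 0.9, "chinese": 0.1}, "ethnicity"))
print(confident_class({"indian": 0.6, "chinese": 0.4}, "ethnicity"))
print(confident_class({"pizza": 1.0, "kebab": 0.0}, "food_type"))
```

Names that do not pass the threshold are left untagged rather than given a doubtful value.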

SLIDE 40

OTHER RESULTS

Consider POIs which only have a name tag.
461 new food related tags (restaurant, bar, biergarten, cafe)
4,212 new amenity and shop tags (supermarket, bakery, hairdresser, etc.)
3,452 new tourism and leisure tags (hotel, playground, sports centre)
Overall accuracy 85%.

Could be integrated in a dialogue system.

<node>
  <tag k="name" v="Walmart"/>
</node>

Would you like to add a shop=supermarket tag? [yes] [no]

<node>
  <tag k="name" v="Walmart"/>
  <tag k="shop" v="supermarket"/>
</node>
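One way such a dialogue step might look is sketched below. Everything here is hypothetical: the function `suggest_tag`, the dictionary representation of a node's tags, and the `ask` callback are illustrative choices, not part of any OSM editor API.

```python
def suggest_tag(tags, key, value, ask=input):
    """Ask whether to add key=value; return a new tag dict that
    includes the suggestion only if the answer is "yes"."""
    answer = ask(f"Would you like to add a {key}={value} tag? (yes/no) ")
    if answer.strip().lower() == "yes":
        return {**tags, key: value}
    return dict(tags)

# Non-interactive demonstration: the lambdas stand in for the user's answer.
node = {"name": "Walmart"}
accepted = suggest_tag(node, "shop", "supermarket", ask=lambda prompt: "yes")
declined = suggest_tag(node, "shop", "supermarket", ask=lambda prompt: "no")
print(accepted)
print(declined)
```

Keeping the user in the loop matters more here than for the cuisine tags, since the reported accuracy for name-only POIs is lower (85%).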

SLIDE 43

CONCLUSIONS AND FUTURE WORK

A large portion of POI names already contains information about e.g. the amenity or cuisine.

Future work
Consider additional tags besides the name tag for extrapolation, e.g. the opening hours tag, the brand tag, or free-text tags such as note or description.
Perform experiments on other countries.

Thank you for your attention!