Apache: Big Data 2015
Combining Solr and Elasticsearch to Improve Autosuggestion on Mobile Local Search
Toan Vinh Luu, PhD Senior Search Engineer local.ch AG
In this talk
– Requirements of an autosuggestion feature
– Autosuggestion
– > 4 million unique users
– > 8 million queries on mobile (iOS, Android, …)
– Services (e.g. “restaurant zurich”)
– Resident information (e.g. “toan luu”)
– Phone number (e.g. 079574xxyy)
– Addresses, weather
– ...
User taps on the phone 8 times instead of 34 times to get to the result list when searching for “Electric installation Wallisellen”
– >2000 queries/month for “cablecom”, which has only 1 entry
– “mc donalds” has fewer entries than “muller” but is queried >10x as often
>700’000 mistakes per month on mobile (9%)
Architecture (diagram):
– Autosuggest API / Search API
– SuggestData component, Query history component, Popular query component, Spellchecker component
– Indexes (five shown), query log, popular query processor, local.ch database
Possible suggested queries:
– Search field: used for matching; analyzers, tokenizer, … are applied
– Facet field: used for display and for computing frequency
– q=restaurant zu* => suggest “Restaurant Zürich”
– q=zurich restau* => suggest “Restaurant Zürich”
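The two-field idea can be emulated outside Solr to see why both example queries lead to the same suggestion. The following is a minimal pure-Python sketch, not the Solr configuration from the talk: `normalize` stands in for the analyzer chain on the search field, the `Counter` plays the role of the facet count, and the sample entries are made up.

```python
import unicodedata
from collections import Counter


def normalize(text):
    """Stand-in for the analyzer chain: lowercase, strip accents, split on whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.split()


def suggest(query, entries, limit=5):
    """Return display strings whose search tokens match every query token.

    All query tokens except the last must equal a token of the entry; the
    last one is treated as a prefix (the trailing `*` in the examples).
    """
    tokens = normalize(query.rstrip("*"))
    if not tokens:
        return []
    *complete, prefix = tokens
    counts = Counter()
    for display in entries:
        search_tokens = normalize(display)
        if all(t in search_tokens for t in complete) and any(
            s.startswith(prefix) for s in search_tokens
        ):
            counts[display] += 1  # facet count: frequency of the display value
    return [display for display, _ in counts.most_common(limit)]


# Hypothetical entries; each string is the untouched "facet" value.
ENTRIES = [
    "Restaurant Zürich",
    "Restaurant Zürich",
    "Restaurant Zug",
    "Bäckerei Zürich",
]

print(suggest("restaurant zu*", ENTRIES))
print(suggest("zurich restau*", ENTRIES))
```

Matching happens on the normalized tokens, so word order and accents do not matter, while the suggestion shown to the user is always the original display string.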
– 4 languages are used in Switzerland. Fail if we suggest “bäckerei” to a French-speaking user
– Fail if we suggest a hospital in Zurich to a user in Geneva
– Fail if we suggest both “zürich” and the misspelled “züruch”
– Fail if we suggest “toan” just because I searched my name thousands of times
– Fail if we suggest “f**k”, “pe**is”
– Text normalization, stopword removal, blacklist, keep only queries that return results…
{
  "q": "restaurant",
  "language": "de",
  "lon": 8.50646,
  "lat": 47.4192,
  "datetime": "2014-06-02 11:10:07",
  "user": "eeaad0c09abc41676c1c99530693"
}
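The cleaning steps listed above can be sketched as a small filter over log entries shaped like this sample. It is only an illustration under assumptions: the stopword list, the blacklist, and the `has_results` callback are hypothetical placeholders, not the actual rules used at local.ch.

```python
import re

STOPWORDS = {"in", "the"}  # hypothetical stopword list
BLACKLIST = {"badword"}    # hypothetical blacklist of offensive terms


def clean_query(raw):
    """Lowercase, collapse whitespace, and drop stopwords."""
    words = re.sub(r"\s+", " ", raw.strip().lower()).split(" ")
    return " ".join(w for w in words if w and w not in STOPWORDS)


def clean_log(entries, has_results):
    """Yield the log entries worth indexing, with a normalized "q".

    `has_results` is a caller-supplied function (an assumption here) that
    reports whether a query returns at least one search result.
    """
    for entry in entries:
        q = clean_query(entry["q"])
        if not q:
            continue  # nothing left after normalization
        if any(word in BLACKLIST for word in q.split(" ")):
            continue  # never suggest blacklisted terms
        if not has_results(q):
            continue  # suggesting a query with no results helps nobody
        yield {**entry, "q": q}


log = [
    {"q": "  Restaurant  in Zurich ", "language": "de", "user": "u1"},
    {"q": "badword", "language": "de", "user": "u2"},
    {"q": "qwertz", "language": "de", "user": "u3"},
]
known = {"restaurant zurich"}  # stand-in for a real search backend
cleaned = list(clean_log(log, has_results=lambda q: q in known))
print(cleaned)
```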
{
  "query" : { "query_string" : { "query" : "language:" + language } },
  "facets" : {
    "q" : { "terms" : { "field" : "q.untouched", "size" : TOP_POPULAR } }
  }
}
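This request uses the `facets` API, which Elasticsearch removed in version 2.0 in favour of aggregations. A rough equivalent with a `terms` aggregation can be built as a plain Python dict, as sketched below. The function name, the `top_popular` default, and the `"size": 0` setting (which skips returning the matching documents) are assumptions for illustration; the field names come from the slide.

```python
import json


def popular_queries_body(language, top_popular=100):
    """Build a search body that counts the most frequent queries for one language."""
    return {
        "size": 0,  # only the aggregation is needed, not the matching documents
        "query": {"query_string": {"query": "language:" + language}},
        "aggs": {
            "q": {"terms": {"field": "q.untouched", "size": top_popular}}
        },
    }


body = popular_queries_body("de", top_popular=10)
print(json.dumps(body, indent=2))
```

The resulting dict can be serialized and sent to the index's `_search` endpoint with whichever HTTP or Elasticsearch client is in use.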
{
  "query" : { "query_string" : { "query" : "q.untouched:" + query } },
  "aggs" : {
    "num_users" : { "cardinality" : { "field" : "user" } }
  }
}
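What the `cardinality` aggregation measures can be illustrated locally: count the distinct users behind each query and drop queries that come from too few of them, so that one person repeating a search does not make it look popular. A minimal sketch follows; the `min_users` threshold and the sample log are made up, and Elasticsearch's `cardinality` returns an approximate count whereas this one is exact.

```python
from collections import defaultdict


def distinct_users_per_query(entries):
    """Map each query string to the set of users who issued it."""
    users = defaultdict(set)
    for entry in entries:
        users[entry["q"]].add(entry["user"])
    return users


def filter_by_users(entries, min_users=2):
    """Keep queries issued by at least `min_users` different users."""
    users = distinct_users_per_query(entries)
    return sorted(q for q, who in users.items() if len(who) >= min_users)


# Hypothetical log: "toan" is searched often, but always by the same user.
LOG = (
    [{"q": "toan", "user": "u1"}] * 1000
    + [{"q": "restaurant", "user": "u1"}, {"q": "restaurant", "user": "u2"}]
)

print(filter_by_users(LOG))
```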
[Chart: distribution over longitude; x-axis from 5.95 to 10.45, y-axis from 50 to 300]
90% Popular query: Chuv (Centre Hospitalier Universitaire Vaudois)
[Chart: latitude (45.81 to 47.77) against longitude (5.95 to 10.45)]
46.5243,6.6397 46.52,6.63 46.53,6.64
"query" : { "match" : {"q" : {"query" :”chuv”}} }, "aggs" : { "lat_outlier" : { "percentiles" : { "field" : "lat", "percents" : [5, 95] } }, "lon_outlier" : { "percentiles" : { "field" : "lon", "percents" : [5, 95] } } }
Release date
[Chart: axis ticks from 0.5 to 2.5]