
How Elasticsearch powers the Guardian’s newsroom


  1. How Elasticsearch powers the Guardian’s newsroom. Shay Banon (@kimchy), creator, co-founder and CTO, Elasticsearch; Phil Wills (@philwills), senior software architect, Guardian News and Media

  2. “created in 1936 ... to secure the financial and editorial independence of the Guardian in perpetuity”

  3. our in-house real-time traffic tool

  4. [Diagram labels: production Apaches, desktop workstation, "something?", "htmly"]

  5. ssh $SERVER "nice tail -f /apache2/logs/guardian-access_log"
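
The tailed access log on slide 5 only becomes a traffic tool once something counts the lines. The slides do not show that step, so the following is a minimal sketch under assumptions: it assumes the Apache common/combined log format, and `count_paths` and `REQUEST_RE` are hypothetical names, not part of the Guardian's tool.

```python
import re
from collections import Counter

# Assumption: Apache common/combined log format, where the request line
# appears in quotes, e.g. "GET /football HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (?P<path>\S+) HTTP/[\d.]+"')


def count_paths(lines):
    """Count requests per path, dropping query strings and unparseable lines."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            path = match.group("path").split("?", 1)[0]
            counts[path] += 1
    return counts


sample = [
    '1.2.3.4 - - [13/Jun/2014:20:01:48 +0000] "GET /football?x=1 HTTP/1.1" 200 123',
    '1.2.3.4 - - [13/Jun/2014:20:01:49 +0000] "GET /football HTTP/1.1" 200 456',
    "not a log line",
]
for path, hits in count_paths(sample).most_common():
    print(hits, path)
```

In a live setting the `lines` argument could be any iterable of text, for example the output of the `ssh ... tail -f` command read line by line.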

  6. [Diagram labels: 2 x production Apaches, desktop workstation, ssh “tail”, SEO, zeromq publisher, dashboard, x]

  7. x desktop workstation

  8. [Diagram labels: JavaScript in browser, hidden pixel, Tracker, SNS, SQS, Dashboard]

  9. Elasticsearch “you know, for search”

  10. [Diagram labels: JavaScript in browser, image pixel, Tracker, SNS, SQS, SQS, Serf, Dashboard, Elasticsearch, Dashboard]

  11. 6 × c3.4xlarge, instance store (SSD), in an autoscaling group (with manual scaling). https://github.com/guardian/status-app

  12. {
        "dt": "2014-06-13T20:01:48.026Z",   ⇠ count per minute
        "url": "http://www.theguardian.com/football/2014/jun/13/spain-v-holland-world-cup-2014-live-report",
        "queryString": "",
        "host": "www.theguardian.com",
        "path": "/football/2014/jun/13/spain-v-holland-world-cup-2014-live-report",   ⇠ filter
        "section": "football",
        "platform": "r2",
        "userAgent": {
          "type": "Browser",
          "family": "Safari 5.1.9",
          "os": "OS X 10.6.8",
          "device": "Personal computer"
        },
        "documentReferrer": "http://www.theguardian.com/football",
        "browser": { "id": "gA6RUFLhWNQvWdt0rW4r78Fg", "isNew": false },
        "referringHost": "theguardian.com",   ⇠ filter
        "referringPath": "/football",
        "isContent": true,
        "contentPublicationDate": "2014-03-03",
        "countryCode": "US",
        "countryName": "United States",
        "location": { "lonlat": [-73.4409, 41.2094] }
      }
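
The annotations on slide 12 mark `dt` as the field counted per minute and `path` and `referringHost` as filter fields. A minimal local sketch of that idea, with no Elasticsearch involved, might look like this; `minute_bucket` and `count_per_minute` are hypothetical helper names introduced for illustration only.

```python
from collections import Counter
from datetime import datetime


def minute_bucket(dt):
    """Truncate a timestamp like the event's "dt" field to its minute."""
    parsed = datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(second=0, microsecond=0).strftime("%Y-%m-%dT%H:%MZ")


def count_per_minute(events, **filters):
    """Count events per minute, keeping only events whose fields equal the filters."""
    counts = Counter()
    for event in events:
        if all(event.get(key) == value for key, value in filters.items()):
            counts[minute_bucket(event["dt"])] += 1
    return counts


events = [
    {"dt": "2014-06-13T20:01:48.026Z", "referringHost": "reddit.com"},
    {"dt": "2014-06-13T20:01:59.999Z", "referringHost": "reddit.com"},
    {"dt": "2014-06-13T20:02:00.000Z", "referringHost": "facebook.com"},
]
print(count_per_minute(events, referringHost="reddit.com"))
```

This does in memory what a `date_histogram` with a one-minute interval plus a term filter does inside the cluster, which is why those two field roles are the ones called out on the slide.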

  13. {
        "query": {
          "filtered": {
            "query": { "match_all": {} },
            "filter": {
              "term": { "path": "/football/2014/jun/13/spain-v-holland-world-cup-2014-live-report" }
            }
          }
        },
        …

  14. …
        "facets": {
          "Reddit": {
            "date_histogram": { "field": "dt", "interval": "1m" },
            "facet_filter": { "term": { "referringHost": "reddit.com" } }
          },
          "Facebook": {
            "date_histogram": { "field": "dt", "interval": "1m" },
            "facet_filter": { "term": { "referringHost": "facebook.com" } }
          },
          "Google": {
            "date_histogram": { "field": "dt", "interval": "1m" },
            "facet_filter": {
              "or": {
                "filters": [
                  { "prefix": { "referringHost": "www.google." } },
                  { "prefix": { "referringHost": "news.google." } }
                ]
              }
            }
          }
        }
      }
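
Slides 13 and 14 together form one request: a path filter plus one per-minute facet per referrer. Because the three facets differ only in their `facet_filter`, the body lends itself to being generated. The sketch below builds the same JSON as a Python dict; `per_minute_facet` and `referrer_query` are hypothetical names, and the `filtered` query and `facets` syntax is the 2014-era form shown on the slides, which later Elasticsearch releases replaced with `bool` queries and aggregations.

```python
import json


def per_minute_facet(facet_filter):
    """One date_histogram facet over "dt", restricted by the given filter."""
    return {
        "date_histogram": {"field": "dt", "interval": "1m"},
        "facet_filter": facet_filter,
    }


def referrer_query(path):
    """Request body matching slides 13-14 for a single article path."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {"path": path}},
            }
        },
        "facets": {
            "Reddit": per_minute_facet({"term": {"referringHost": "reddit.com"}}),
            "Facebook": per_minute_facet({"term": {"referringHost": "facebook.com"}}),
            "Google": per_minute_facet({
                "or": {
                    "filters": [
                        {"prefix": {"referringHost": "www.google."}},
                        {"prefix": {"referringHost": "news.google."}},
                    ]
                }
            }),
        },
    }


body = referrer_query(
    "/football/2014/jun/13/spain-v-holland-world-cup-2014-live-report"
)
print(json.dumps(body, indent=2))
```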

  15. "aggregations": {
        "dns": {
          "date_histogram": { "field": "dt", "interval": "1m" },
          "aggregations": {
            "dns": {
              "percentiles": {
                "field": "dns",
                "percents": [ 50.0 ],
                "estimator": "tdigest",
                "compression": 10.0
              }
            }
          }
        }
      }
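
The aggregation on slide 15 asks for the 50th percentile of a `dns` field inside each one-minute bucket, in other words a per-minute median. To make that concrete, here is a small in-memory sketch. It assumes `dns` is a numeric timing value on each event (the sample document on slide 12 does not show it), and it computes an exact median, whereas the `tdigest` estimator on the slide is approximate. `median_per_minute` is a hypothetical name.

```python
from collections import defaultdict
from statistics import median


def median_per_minute(events, field="dns"):
    """Group events by the minute of "dt" and take the median of `field`."""
    buckets = defaultdict(list)
    for event in events:
        if field in event:
            # "2014-06-13T20:01:48.026Z"[:16] is "2014-06-13T20:01"
            buckets[event["dt"][:16]].append(event[field])
    return {minute: median(values) for minute, values in sorted(buckets.items())}


events = [
    {"dt": "2014-06-13T20:01:48.026Z", "dns": 10},
    {"dt": "2014-06-13T20:01:50.000Z", "dns": 30},
    {"dt": "2014-06-13T20:02:01.000Z", "dns": 5},
    {"dt": "2014-06-13T20:02:02.000Z"},  # no timing recorded, skipped
]
print(median_per_minute(events))
```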

  16. /graph/breakdown?section=commentisfree

  17. ?section=commentisfree ophan.StandardFilters ophan.StandardFiltersToElasticsearch org.elasticsearch.index. query.FilterBuilder

  18. {
        "query": {
          "filtered": {
            "query": { "match_all": {} },
            "filter": {
              "term": { "path": "/football/2014/jun/13/spain-v-holland-world-cup-2014-live-report" }
            }
          }
        },
        …

  19. "filter": {
        "and": {
          "filters": [
            {
              "range": {
                "dt": {
                  "from": "2014-03-03T00:00:00.000Z",
                  "to": "2014-03-03T22:30:59.999Z",
                  "include_lower": true,
                  "include_upper": false
                }
              }
            },
            { "not": { "filter": { "term": { "countryCode": "GNM" } } } },
            { "not": { "filter": { "term": { "userAgent.type": "Robot" } } } },
            { "filter": { "terms": { "section": [ "commentisfree" ] } } }
          ]
        }
      }
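
Slides 16 to 19 trace a dashboard URL such as `/graph/breakdown?section=commentisfree` through a set of standard filters to an Elasticsearch filter. The real code is Scala (`ophan.StandardFiltersToElasticsearch`) and is not shown, so the following is only a rough Python sketch of the same translation. `standard_filters` is a hypothetical name, the time range is passed in explicitly, and the section clause is emitted as a bare `terms` filter, a simplification of the wrapped form on slide 19.

```python
from urllib.parse import parse_qs


def standard_filters(query_string, start, end):
    """Translate dashboard query parameters into an "and" filter (sketch)."""
    params = parse_qs(query_string.lstrip("?"))
    filters = [
        # Time window for the graph.
        {"range": {"dt": {"from": start, "to": end,
                          "include_lower": True, "include_upper": False}}},
        # Exclusions applied to every query on slide 19.
        {"not": {"filter": {"term": {"countryCode": "GNM"}}}},
        {"not": {"filter": {"term": {"userAgent.type": "Robot"}}}},
    ]
    if "section" in params:
        # parse_qs returns a list per key, which maps directly onto "terms".
        filters.append({"terms": {"section": params["section"]}})
    return {"and": {"filters": filters}}


print(standard_filters(
    "?section=commentisfree",
    "2014-03-03T00:00:00.000Z",
    "2014-03-03T22:30:59.999Z",
))
```

Keeping the exclusions in one place means every graph applies the same baseline filtering and the URL only has to carry what varies.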

  20. Thank you. Shay Banon (@kimchy), creator, co-founder and CTO, Elasticsearch; Phil Wills (@philwills), senior software architect, Guardian News and Media
