A centralised ElasticSearch service: Design ideas and status
ES service ideas
- Feb. 2 2016
A centralised ElasticSearch service: Design ideas and status
- Motivation for a centralised ES service
- Project
– Mandate
– Organisation
– Team
- Strategy and design
– Status and resources
– Issues
– PaaS
- Conclusions/summary
Motivation
- Strategy until 12/2015:
– Go and do it yourself
– Basic setup description provided
– No support for user-owned instances
- Reasoning:
– Often conflicting user requirements
– Risk of labor-intensive installations that cannot be covered with the available manpower
Motivation
Current status:
– O(20) clusters on the radar from checking Foreman
– Monitoring has 3 clusters for meter, timber and development hosted on O(50) machines
– Many clusters use physical hardware (because of I/O)
Significant amount of resources with unknown security setup
Project mandate
Set up a centralised ES service and consolidate existing installations.
- Offer something that can be used out of the box
- Attractive service for newcomers
- Cover as many use cases as possible
- Offer plugins like Kibana “as is”
Project organisation
- Twiki as entry point for service managers:
https://twiki.cern.ch/twiki/bin/view/IT/ElasticSearchWeb
- Agile project management
– JIRA ITES project
– 4-6 week sprints (the first just ended)
– Daily scrums (except Mondays) at 9:30 in 31-S-27
– End-user documentation on gitbook:
http://esdocs.web.cern.ch/esdocs/
Team
- Team spans across sections and groups
- Part-time participation from
– Compute and monitoring (4 people)
– Databases (2 people)
Strategy and design
- Want to profit from existing experience as much as possible
- Get user feedback from the beginning
Strategy and design
- Lots of interest from the outside
– Rumors spread quickly
– Official announcement went out to experiment lists asking for their requirements
– Announcement in ITUM
- Already received quite detailed responses, e.g. from ATLAS
- User meetings have already taken place with ATLAS, DB, CDA
Strategy and design
- 2 options for the implementation
– Fully Puppet-managed and monitored
– PaaS-like approach
- Going for the first option for now
– PaaS in parallel
– Existing experience in the monitoring team with Heat
– Playing with the upcoming OpenShift service
– Involve DB on Demand staff
Status and resources
- Plan for an as-big-as-possible shared instance
– Big in terms of applications/number of users
– Public data
– Different QoS offerings:
- Local (SSD based)
- Network attached with tunable IOPS
- Dedicated instances only for specific requirements, e.g. specific security constraints or redundancy
Status and resources
- Default setup
– Support ES 2.x only
– Kibana 4 given “as is”
– 3 different host types:
- Search nodes
- Master nodes (redundant setup)
- Data nodes (2 different types, depending on QoS)
- Puppet setup
– Currently being developed
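The three host types correspond to node roles in Elasticsearch. A minimal sketch of that mapping, assuming the ES 2.x `node.master` and `node.data` settings from `elasticsearch.yml`. The host-type keys and the `render_settings` helper are hypothetical names for illustration only; the service's actual Puppet configuration is not shown in these slides.

```python
# Illustrative mapping from host type to ES 2.x node role settings.
# The host-type keys and this helper are hypothetical, not the service's Puppet code.
NODE_ROLES = {
    "search": {"node.master": False, "node.data": False},  # routes queries only
    "master": {"node.master": True, "node.data": False},   # master-eligible, holds no data
    "data": {"node.master": False, "node.data": True},     # holds shards
}


def render_settings(host_type):
    """Return elasticsearch.yml lines for the given host type."""
    roles = NODE_ROLES[host_type]
    return "\n".join(
        f"{key}: {str(value).lower()}" for key, value in sorted(roles.items())
    )


print(render_settings("master"))
```

Keeping master nodes free of data is a common way to make the redundant master setup less sensitive to indexing and search load.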
Status and resources
- Resources:
– Virtual machines only
- No containers
- No physical hardware
– Limited amount of resources to play with for now
– Pending request for more resources: RQF0536509
Issues to address
- Main issue: ACLs
– Used by SDC, required by others as well (e.g. DB)
– Security module available from ES: commercial, license fees depend on the number of nodes
– Free security module in Alpha version
- Started to test this
- Need to have some amount of resources available for serious testing
– Current quota of 25 VMs and 50 cores is not enough
– Need for m1.xlarge flavor (8 cores)
– Core quota already used up
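A rough calculation shows why the core quota is the limiting factor. This is only a sketch, assuming every test node uses the 8-core m1.xlarge flavor; the `max_vms` helper is an illustrative name.

```python
def max_vms(core_quota, vm_quota, cores_per_vm):
    """Number of VMs of a single flavor that fit within both quotas."""
    return min(vm_quota, core_quota // cores_per_vm)


# 50 cores at 8 cores per VM allows only 6 VMs, well below the 25-VM quota.
print(max_vms(core_quota=50, vm_quota=25, cores_per_vm=8))  # 6
```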
Issues to address
- ES and Kibana versions
– Many customers rely on Kibana 3; we may have to support this somehow
- Flume does not yet work with ES 2.x
– Patch available on GitHub
– Start testing this once the test instance is on ES 2.x
- Implementation of QoS on shared instance
– To be done
– Waiting for SSD based resources
Timelines
- Aiming at initial test instance by March
- Several customers are keen to start testing
– ATLAS needs to move out of their ES resources within 2 months
– Agreed to give them access to a non-prod testing instance by March
– SDC is keen to move over asap
– DB and CDA are ready to test
– Batch monitoring is a good candidate as well
PaaS
- Checked out how Amazon does it
- Users would own their instances but IT owns the resources
– Create instances giving only a few needed parameters
– Shared instance could be run as PaaS by us
- Several candidates on the market
– Started to play with OpenShift
- Upcoming service which will be used for GitLab, Jenkins, ...
- Very few steps needed to get an ES instance up
– Other options on the market
- Heat, Cloudify, ...
Collaboration with external people
- ES meetup at CERN
– Took place on 8/2/2016 at CERN
– Close collaboration with ATLAS in organising this event
- Next: ElasticON
– 17-19 Feb. 2016, San Francisco
– Pablo will represent us there
Conclusions
- A centralised ElasticSearch service is being set up
– Consolidation of existing instances is overdue
– Lots of interest from IT and the experiments
- Aiming at first test instances in Q2 2016
- Aiming at a production-ready service by Q4 2016
- Also looking into PaaS options