
Hadoop Infrastructure @Uber: Past, Present and Future - Mayank Bansal


  1. Hadoop Infrastructure @Uber: Past, Present and Future. Mayank Bansal. U B E R | Data

  2. Uber’s Mission: “Transportation as reliable as running water, everywhere, for everyone.” 75+ countries, 500+ cities, and growing…

  3. How Uber works

  4. How Uber works

  5. How Uber works

  6. Data Driven Decisions

  7. Data Infra Once Upon a Time... (2014). Diagram labels: Applications ETL EMR S3 Business Ops Kafka Logs A/B Experiments … Adhoc Analytics Vertica Key-Val DB Data Warehouse City Ops Data Science RDBMS DBs

  8. Data Infrastructure Today. Diagram labels: Service Accounts ETL Machine Learning Kafka8 Logs Experimentation HDFS Data Science … Spark | Presto Hive Adhoc Analytics Schemaless DB Ops/Data Science City Ops Data Science SOA DBs

  9. Few Takeaways …
      ● Strict Schema Management
          ○ Because our largest data audience is SQL savvy! (1000s of Uber Ops!)
          ○ SQL = Strict Schema
      ● Big Data Processing Tools Unlocked - Hive, Presto and Spark
          ○ Migrate SQL-savvy users from Vertica to Hive & Presto (1000s of Ops & 100s of data scientists & analysts)
          ○ Spark for more advanced users - 100s of data scientists

  10. Hadoop Evolution @ Uber
      • 2014: 1X nodes, 1X PB
      • 2015: 10X nodes, 4X PB data
      • 2016: 90X nodes, 40X PB data
      • 3000+ nodes, 30,000+ cores, 50+ PB

  11. Hadoop Cluster Utilization
      • Over-provisioning for peak loads
      • Over-capacity in anticipation of future growth

  12. Mesos Evolution @ Uber
      • 2014: 0 nodes
      • 2015: X nodes
      • 2016: 300X nodes

  13. Mesos Cluster Utilization
      • Over-provisioning for peak loads
      • Over-capacity in anticipation of future growth

  14. End Goal Online Presto

  15. What do we need? GLOBAL VIEW OF RESOURCES

  16. Available Resource Managers

  17. Mesos vs YARN
      • YARN: single-level scheduler. Mesos: two-level scheduler.
      • YARN: uses cgroups for isolation. Mesos: uses cgroups for isolation.
      • YARN: CPU and memory as resources. Mesos: CPU, memory and disk as resources.
      • YARN: works well with Hadoop workloads. Mesos: works well with longer-running services.
      • YARN: supports time-based reservations. Mesos: does not support reservations.
      • YARN: dominant resource scheduling. Mesos: scheduling is done by frameworks and varies case by case.
      • Slide callouts: Scales better; Similar isolation; Disk is better; This is important; Better for batch; Important for batch SLAs.
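
Slide 17 lists "dominant resource scheduling" on the YARN side. As a rough illustration of the idea, here is a minimal sketch of Dominant Resource Fairness: each round, the user whose largest resource share is smallest gets one more task. This is not YARN's scheduler code; the function name `drf_allocate`, the input shapes, and the example cluster are all assumptions made for illustration, and the sketch drops a user once their next task no longer fits.

```python
from typing import Dict


def drf_allocate(capacity: Dict[str, float],
                 demands: Dict[str, Dict[str, float]]) -> Dict[str, int]:
    """Toy Dominant Resource Fairness loop (hypothetical helper, not a YARN API).

    capacity: total amount of each resource in the cluster.
    demands:  per-user resource needs of a single task.
    Returns the number of tasks granted to each user.
    """
    used = {r: 0.0 for r in capacity}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    def dominant_share(user: str) -> float:
        # A user's dominant share is their largest fractional use of any resource.
        return max(tasks[user] * demands[user][r] / capacity[r] for r in capacity)

    while active:
        # Pick the active user with the smallest dominant share (ties: by name).
        user = min(sorted(active), key=dominant_share)
        need = demands[user]
        if all(used[r] + need[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += need[r]
            tasks[user] += 1
        else:
            # The next task does not fit; stop considering this user.
            active.remove(user)
    return tasks


if __name__ == "__main__":
    # Made-up cluster: 9 CPUs and 18 GB of memory shared by two users.
    cluster = {"cpu": 9, "mem": 18}
    per_task = {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}}
    print(drf_allocate(cluster, per_task))  # {'A': 3, 'B': 2}
```

In this example user A is memory-bound and user B is CPU-bound, and both end up with a dominant share of two thirds.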

  18. Let’s tie them together. In a nutshell: YARN is good for Hadoop; Mesos is good for longer-running services

  19. [Image-only slide]

  20. • Myriad is a Mesos framework for Apache YARN
      • Mesos manages data center resources
      • YARN manages Hadoop workloads
      • Myriad
          • Gets resources from Mesos
          • Launches Node Managers
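
The flow on slide 20 (resources come from Mesos, Node Managers get launched on them) can be sketched as a toy offer-acceptance loop. This is only an illustration of the two-level idea under simplifying assumptions: `Offer` and `launch_node_managers` are hypothetical names, not part of the Mesos or Myriad APIs, and a real framework would also decline offers, handle failures, and track running containers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Offer:
    """A simplified resource offer from the cluster manager (hypothetical type)."""
    host: str
    cpus: float
    mem_gb: float


def launch_node_managers(offers: List[Offer], nm_cpus: float,
                         nm_mem_gb: float, wanted: int) -> List[str]:
    """Accept offers big enough for one Node Manager until `wanted` are placed.

    Returns the hosts on which a Node Manager would be launched.
    """
    launched: List[str] = []
    for offer in offers:
        if len(launched) == wanted:
            break
        # The framework, not the cluster manager, decides whether an offer is usable.
        if offer.cpus >= nm_cpus and offer.mem_gb >= nm_mem_gb:
            launched.append(offer.host)
    return launched


if __name__ == "__main__":
    # Made-up offers for illustration only.
    offers = [Offer("h1", 2, 4), Offer("h2", 8, 32),
              Offer("h3", 16, 64), Offer("h4", 8, 32)]
    print(launch_node_managers(offers, nm_cpus=4, nm_mem_gb=16, wanted=2))
```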

  21. Myriad’s Limitations: Static Resource Partitioning
      • YARN will handle resources handed over to it
      • Mesos will work on the rest of the resources
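
To see why static partitioning costs capacity, compare provisioning each partition for its own peak with provisioning one shared pool for the combined peak. The sketch below uses made-up hourly demand numbers; the function names and the data are assumptions for illustration and do not come from the talk.

```python
from typing import Sequence


def static_capacity(batch: Sequence[int], online: Sequence[int]) -> int:
    """Capacity needed when each workload gets its own fixed partition."""
    return max(batch) + max(online)


def shared_capacity(batch: Sequence[int], online: Sequence[int]) -> int:
    """Capacity needed when both workloads draw from one pool."""
    return max(b + o for b, o in zip(batch, online))


if __name__ == "__main__":
    # Hypothetical demand (in nodes) over four time slots; the peaks do not overlap.
    batch_demand = [80, 60, 20, 10]
    online_demand = [10, 30, 70, 90]
    static = static_capacity(batch_demand, online_demand)   # 80 + 90 = 170
    shared = shared_capacity(batch_demand, online_demand)   # max(90, 90, 90, 100) = 100
    print(static, shared)
```

The gap between the two numbers depends entirely on how much the peaks overlap; when both workloads peak at the same time, the shared pool saves nothing.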

  22. Myriad’s Limitations: Resource Over Subscription
      • YARN will never be able to do over subscription
      • Node Manager will go away
      • Fragmentation of resources
      • Mesos over subscription can kill YARN too

  23. Myriad’s Limitations
      • No Global Quota Enforcement
      • No Global Priorities

  24. Myriad’s Limitations
      • Elastic Resource Management
      • Bin Packing
      • Stability
      • Long List …

  25. Unified Scheduler

  26. High Level Characteristics
      • Global Quota Management
      • Central Scheduling policies
      • Over subscription for both Online and Batch
      • Isolation and bin packing
      • SLA guarantees at Global Level
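
Bin packing, listed among the characteristics above, means placing tasks so that as few nodes as possible are left partly used. A minimal sketch using the first-fit-decreasing heuristic on a single resource dimension follows; the talk does not say which packing strategy the unified scheduler uses, so the heuristic, the function name, and the sample sizes are assumptions.

```python
from typing import List


def first_fit_decreasing(task_sizes: List[int], node_capacity: int) -> List[List[int]]:
    """Pack tasks onto identical nodes, largest task first.

    Returns one list of task sizes per node that was opened.
    """
    free: List[int] = []             # remaining capacity of each opened node
    placement: List[List[int]] = []  # tasks assigned to each opened node
    for size in sorted(task_sizes, reverse=True):
        if size > node_capacity:
            raise ValueError(f"task of size {size} exceeds node capacity")
        for i, remaining in enumerate(free):
            if size <= remaining:
                # Reuse the first node that still has room.
                free[i] -= size
                placement[i].append(size)
                break
        else:
            # No existing node fits; open a new one.
            free.append(node_capacity - size)
            placement.append([size])
    return placement


if __name__ == "__main__":
    # Made-up task sizes (for example, CPU cores) and a node capacity of 10.
    print(first_fit_decreasing([4, 8, 1, 4, 2, 1], node_capacity=10))
    # [[8, 2], [4, 4, 1, 1]]
```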

  27. Unified Scheduler

  28. Few Takeaways …
      • We need one scheduling layer across all workloads
      • Partitioning resources is not good
      • Can save at least 30% of resources
      • Stability and simplicity win in production
      • Multi-level resource management and scheduling will not scale

  29. [Image-only slide]

  30. Questions? mabansal@uber.com mayank@apache.org

  31. Thank You!
