keeping master green at scale


  1. Keeping Master Green at Scale. Sundaram Ananthanarayanan, Masoud Saeida Ardekani, Denis Haenikel, Balaji Varadarajan, Simon Soriano, Dhaval Patel, Ali-Reza Adl-Tabatabai (https://eng.uber.com/research/keeping-master-green-at-scale/)

  2. Monorepo is popular! A monorepo is a single, shared repo hosting a company’s software assets, in contrast to a multirepo. Advantages of a monorepo [Ciera et al. @ICSE’18]: ✔ simplified dependency management; ✔ improved code visibility.

  3. Always-green master considered hard
      • Monorepos handle a huge volume of commits every day.
      • Existing CI workflows do not guarantee an always-green master; it is too hard at scale.
      • Submit Queue guarantees an always-green master at scale.

  4. Outline: (1) Why green master is hard; (2) Probabilistic Speculation; (3) Conflict Analyzer; (4) Evaluation.

  5. Lifecycle of a change in a monorepo. [Diagram labels: Developer, Change, Revision, Peer Review, CI Server (BUILD, TEST, RESULT), Monorepo.]

  6. Challenge: Concurrent conflicting changes. [Diagram: Alice and Bob submit changes C1 and C2 to master.]

  7. Challenge: Concurrent conflicting changes. [Diagram: with both C1 and C2 on master, build steps fail.]

  8. Example of a real conflict

  9. How often do conflicts happen?

  10. How often do conflicts happen? Observation: the chance of a conflict rises from 5% to 40% as the number of concurrent, potentially conflicting changes increases.

  11. Drawbacks of a red master: delayed rollouts, hampered productivity, complex rollbacks.

  12. Keeping master green: Queue. [Diagram: changes C1, C2, C3 are queued ahead of master, whose head is H.] Alice, Bob, and Carol enqueue changes they want to commit.

  13. Keeping master green: Queue. C1 is built and tested against the mainline head (H).

  14. Keeping master green: Queue. Build steps for H ⊕ C1 succeed.

  15. Keeping master green: Queue. C1 is committed and becomes the head. C2 is tested against it.

  16. Keeping master green: Queue. Build steps for H ⊕ C2 fail and C2 is rejected.

  17. Keeping master green: Queue
      ✔ Guarantees an always-green master by serializing changes
      ✖ Does not scale to 1000s of changes/day
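
The serialized queue walked through on slides 12 to 17 can be sketched as a simple loop. This is a minimal illustration, not the Submit Queue implementation: `run_serial_queue` and the `build_passes` oracle are hypothetical names, and the outcome for C3 is an assumption, since the slides only show C1 committing and C2 being rejected.

```python
from typing import Callable, List

def run_serial_queue(head: List[str], queue: List[str],
                     build_passes: Callable[[List[str]], bool]) -> List[str]:
    """Test each queued change against the current head, one at a time."""
    for change in queue:
        candidate = head + [change]      # models H ⊕ C
        if build_passes(candidate):
            head = candidate             # the change commits and becomes the head
        # otherwise the change is rejected and the head stays as it was
    return head

# Hypothetical build oracle: the build fails whenever C2 is present.
def passes(state: List[str]) -> bool:
    return "C2" not in state

final_head = run_serial_queue(["H"], ["C1", "C2", "C3"], passes)
print(final_head)  # ['H', 'C1', 'C3']
```

Every change waits for the full build of the one before it, which is why this approach keeps master green but cannot keep up with thousands of changes per day.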

  18. Keeping master green: Batching changes. [Diagram: changes C1, C2, C3 are queued ahead of master head H.] C1 and C2 are batched and build steps are run.

  19. Keeping master green: Batching changes
      ✔ Improves throughput if batches succeed more often than not
      ✖ Testing batches masks intermediate changes that fail
      ✖ Batches fail more often as the batch size increases
      What happens when batches fail?

  20. Keeping master green: Goals
      Guarantee serializability
      • Illusion of a single queue when committing changes
      • Git only offers serializability of patches
      Provide reasonable SLAs
      • Overheads should be short enough for developers to trade speed for correctness!
      Challenge: how to do this at scale? (1000s of commits/day)

  21. Submit Queue: Overview
      Speculation Engine
      • Speculates on success/failure of changes
      • Builds speculation graph
      Conflict Analyzer
      • Determines independent changes
      • Constructs conflict graph
      Planner Engine
      • Selects most valuable builds from speculation engine
      • Executes builds and commits changes

  22. Speculation Tree. C1, C2, C3: pending changes.

  23. Speculation Tree. B1: build steps for H ⊕ C1.

  24. Speculation Tree
      • B1 fails → C1 rejected; B1 succeeds → C1 commits.
      • B2: build steps for H ⊕ C2. B1.2: build C2 against (H ⊕ C1).
      (1) Precompute the outcome of committing C2 under different realities.
      (2) Commit or reject C2 based on the outcome of B1 and one of {B2, B1.2}.

  25. Speculation Tree
      • B1 fails → C1 rejected; B1 succeeds → C1 commits.
      • B2 fails → C2 rejected; B2 succeeds → C2 commits.
      • B1.2 fails → C2 rejected; B1.2 succeeds → C2 commits.
      • Builds for C3: B3, B2.3, B1.3, B1.2.3.
      Challenge: which builds to run?
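
To make the commit/reject rule concrete, the sketch below walks one path through the speculation tree: each change is decided by the build that matches the set of earlier changes that actually committed. The function name `resolve` and the sample outcomes are hypothetical; build names follow the dotted slide notation (B1.2 is C2 built on H ⊕ C1).

```python
from typing import Dict, List

def resolve(n: int, results: Dict[str, bool]) -> List[int]:
    """Return the indices of the changes that commit, given build outcomes."""
    committed: List[int] = []
    for k in range(1, n + 1):
        # The build that matters assumes exactly the earlier committed changes.
        build = "B" + ".".join(str(i) for i in committed + [k])
        if results[build]:
            committed.append(k)   # build succeeded: Ck commits
        # build failed: Ck is rejected, later changes build without it
    return committed

# Hypothetical outcomes: C1 succeeds, C2 fails on top of C1, C3 succeeds on top of C1.
outcomes = {"B1": True, "B1.2": False, "B1.3": True}
landed = resolve(3, outcomes)
print(landed)  # [1, 3]
```

Only three of the seven builds in the tree were consulted here, which is the observation that motivates choosing builds selectively.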

  26. Approach #1: Speculate Them All. Speculate on all possible outcomes equally.
      • Selects builds in breadth-first order: B1; B2, B1.2; B3, B2.3, B1.3, B1.2.3.
      • Need to run 2^n builds in parallel to commit n changes.
      Does not scale to 1000s of changes/day; leads to substantial waste of resources.
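
The exponential growth can be checked by enumerating the tree. In this sketch (the helper name `speculation_builds` is an assumption), a change Ck needs one build for every subset of earlier changes that might have committed, so the tree as drawn has 2^n - 1 nodes, on the order of the 2^n quoted on the slide.

```python
from itertools import combinations
from typing import List

def speculation_builds(n: int) -> List[str]:
    """List every build in the full speculation tree for n pending changes."""
    builds = []
    for k in range(1, n + 1):
        earlier = range(1, k)
        # One build per subset of earlier changes assumed to have committed.
        for size in range(k):
            for subset in combinations(earlier, size):
                builds.append("B" + ".".join(str(i) for i in subset + (k,)))
    return builds

tree = speculation_builds(3)
print(tree)                          # ['B1', 'B2', 'B1.2', 'B3', 'B1.3', 'B2.3', 'B1.2.3']
print(len(speculation_builds(10)))   # 1023
```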

  27. Speculate Them All: Resource Wastage. [Diagram: speculation tree over C1, C2, C3 with builds B1; B2, B1.2; B3, B2.3, B1.3, B1.2.3.]

  28. Speculate Them All: Observation. If we select and execute only the builds whose outcomes are most likely to be needed, then we require only n (out of 2^n) builds. Challenge: which n builds are likely to be needed?

  29. Probabilistic Speculation. [Speculation tree: B1; B2, B1.2; B3, B2.3, B1.3, B1.2.3.] An annotation on each build B_C represents the probability that the result of B_C is used to decide whether to commit or reject C.

  30. Probabilistic Speculation. The root B1 is always needed, as it is used to determine whether C1 can be committed.

  31. Probabilistic Speculation. An annotation on the tree represents the probability that change C1 succeeds individually.

  32. Probabilistic Speculation. An annotation on the tree represents the probability that change C1 succeeds individually.

  33. Probabilistic Speculation. B1.2: build C2 against (H ⊕ C1). An annotation on the tree represents the probability that C2 conflicts with C1.

  34. Probabilistic Speculation. [Speculation tree: B1; B2, B1.2; B3, B2.3, B1.3, B1.2.3.]

  35. Probabilistic Speculation: Summary. Choose the most valuable builds by determining:
      • the probability of success of a change
      • the probability of a conflict between changes
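
A simplified sketch of how such probabilities could rank builds follows. It assumes each change succeeds independently with a known probability and ignores the conflict probability that the slides also mention, so it is a deliberate simplification rather than the model from the paper. The names `build_values`, `pick_builds`, and the sample probabilities are hypothetical.

```python
from itertools import combinations
from typing import Dict, List

def build_values(p_success: Dict[int, float]) -> Dict[str, float]:
    """Probability that each build's result is needed, assuming independence."""
    values = {}
    n = len(p_success)
    for k in range(1, n + 1):
        earlier = list(range(1, k))
        for size in range(k):
            for subset in combinations(earlier, size):
                prob = 1.0
                for i in earlier:
                    # The build is needed only if exactly `subset` committed.
                    prob *= p_success[i] if i in subset else 1.0 - p_success[i]
                name = "B" + ".".join(str(i) for i in subset + (k,))
                values[name] = prob
    return values

def pick_builds(p_success: Dict[int, float], budget: int) -> List[str]:
    """Select the `budget` builds most likely to be needed."""
    values = build_values(p_success)
    return sorted(values, key=values.get, reverse=True)[:budget]

# Made-up success probabilities for C1, C2, C3.
chosen = pick_builds({1: 0.9, 2: 0.8, 3: 0.7}, budget=3)
print(chosen)  # ['B1', 'B1.2', 'B1.2.3']
```

With high success probabilities the selection collapses to the optimistic path through the tree, which is how n builds can stand in for the full 2^n.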

  36. Evaluating the probabilities of success and conflict
      • Logistic regression to train prediction models
        ○ Feature set includes 100+ hand-picked features
        ○ Prediction accuracy of 97%
      Change: # affected targets; # git commits; # files changed; status of pre-submit checks
      Developer: developer name; employment proficiencies
      Speculation: dynamic features to re-adjust weights based on initial predictions; # speculations succeeded; # speculations failed

  37. Features for Training ML Models
      Change: # affected targets; # git commits; # files changed; status of pre-submit checks
      Revision: revision is a container for changes; # changes submitted; revert and test plans; # submit attempts made
      Developer: developer name; employment proficiencies
      Speculation: dynamic features to re-adjust weights based on initial predictions; # speculations succeeded; # speculations failed
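
As a rough illustration of the modelling step, here is a tiny logistic regression trained with plain stochastic gradient descent. Everything in it is an assumption made for illustration: the two features (a scaled count of files changed and a pre-submit pass flag) are picked from the list above, the six training rows are made-up toy data, and the real models use 100+ features.

```python
import math
from typing import List, Tuple

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict(w: List[float], b: float, x: List[float]) -> float:
    """Estimated probability that a change with features x succeeds."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train(rows: List[List[float]], labels: List[int],
          lr: float = 0.5, epochs: int = 500) -> Tuple[List[float], float]:
    """Fit weights by stochastic gradient descent on the log loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = predict(w, b, x) - y          # gradient of log loss w.r.t. z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy data (made up): [scaled # files changed, pre-submit checks passed]
rows = [[0.1, 1.0], [0.2, 1.0], [0.3, 1.0], [0.7, 0.0], [0.8, 0.0], [0.9, 0.0]]
labels = [1, 1, 1, 0, 0, 0]                      # 1 = change succeeded
w, b = train(rows, labels)

p_small_passing = predict(w, b, [0.2, 1.0])
p_large_failing = predict(w, b, [0.8, 0.0])
print(p_small_passing > p_large_failing)  # True
```

The predicted probabilities are what a speculation engine would feed into its ranking of builds.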

  38. Conflict Analyzer
      • So far, we assumed all changes potentially conflict with each other.
        ○ Cannot commit in parallel
      • What if changes can be proved to be independent?
        ○ Commit changes in parallel
        ○ Trim speculation space
      • We use the Conflict Analyzer to find independent changes.

  39. Conflict Analyzer: Commit Changes in Parallel. Conflict graph for changes C1, C2, C3, where C1 and C2 are independent of each other and both conflict with C3.

  40. Conflict Analyzer: Commit Changes in Parallel. [Speculation graph: B1 for C1 and B2 for C2 at the top level; the builds for C3 are B3, B1.3, B2.3, B1.2.3, selected by whether B1 and B2 succeed or fail.] Insight: changes C1 and C2 can be committed in parallel.

  41. Conflict Analyzer: Trim Speculation Space. Conflict graph for C1, C2, C3, where C1 conflicts with both C2 and C3, and C2 and C3 are independent of each other.

  42. Conflict Analyzer: Trim Speculation Space. [Speculation graph: B1 for C1; if B1 fails, B2 and B3 apply; if B1 succeeds, B1.2 and B1.3 apply.] Insight: because C3 does not speculate on C2, the number of possible builds for C3 reduces to 2.
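
The effect of the conflict graph on the speculation space can be sketched by counting builds per change. The sketch assumes a change only speculates on earlier changes it directly conflicts with, which matches the two slide examples but is a simplification; `builds_per_change` is a hypothetical helper.

```python
from typing import Dict, FrozenSet, List, Set

def builds_per_change(order: List[str],
                      conflicts: Set[FrozenSet[str]]) -> Dict[str, int]:
    """Number of possible builds for each change, given a conflict graph."""
    counts = {}
    for idx, change in enumerate(order):
        # Only earlier changes that conflict with this one need speculation.
        earlier = [c for c in order[:idx] if frozenset((c, change)) in conflicts]
        counts[change] = 2 ** len(earlier)
    return counts

order = ["C1", "C2", "C3"]

# Slide 41/42: C1 conflicts with C2 and C3; C2 and C3 are independent.
trimmed = builds_per_change(order, {frozenset(("C1", "C2")), frozenset(("C1", "C3"))})
print(trimmed)   # {'C1': 1, 'C2': 2, 'C3': 2}

# Slide 39/40: C1 and C2 are independent; both conflict with C3.
parallel = builds_per_change(order, {frozenset(("C1", "C3")), frozenset(("C2", "C3"))})
print(parallel)  # {'C1': 1, 'C2': 1, 'C3': 4}
```

In the second case C1 and C2 each need a single build against H, so they can be built and committed in parallel.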

  43. Conflict Analyzer: Detecting conflicts at scale
      • The build system is used to detect whether changes are independent.
      • Code is partitioned into smaller entities called targets.
      • Every change affects a set of targets.
      Example build graph: T1 = main.exe; T2 = util.o; T3 = main.o; source files util.c, util.h, main.c.
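
A minimal sketch of how a build graph maps changed files to affected targets is shown below. The edges are an assumption: the slide names the nodes but the flattened text does not preserve which file feeds which target, so the sketch guesses that util.o is built from util.c and util.h, main.o from main.c, and main.exe from both object files.

```python
from typing import Dict, List, Set

# Assumed edges: target -> direct inputs (files or other targets).
BUILD_GRAPH: Dict[str, List[str]] = {
    "main.exe": ["util.o", "main.o"],
    "util.o": ["util.c", "util.h"],
    "main.o": ["main.c"],
}

def affected_targets(changed_files: Set[str],
                     graph: Dict[str, List[str]]) -> Set[str]:
    """Targets that depend, directly or transitively, on any changed file."""
    affected: Set[str] = set()
    progress = True
    while progress:                      # repeat until no new target is found
        progress = False
        dirty = changed_files | affected
        for target, inputs in graph.items():
            if target not in affected and dirty & set(inputs):
                affected.add(target)
                progress = True
    return affected

from_main = affected_targets({"main.c"}, BUILD_GRAPH)
from_util = affected_targets({"util.h"}, BUILD_GRAPH)
print(sorted(from_main))  # ['main.exe', 'main.o']
print(sorted(from_util))  # ['main.exe', 'util.o']
```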

  44. Detecting Conflicts: Intuition. Two changes are independent if they affect disjoint sets of targets.
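
The intuition translates directly into a set-disjointness check. The target names and the helper `independent` below are illustrative assumptions.

```python
from typing import Set

def independent(targets_a: Set[str], targets_b: Set[str]) -> bool:
    """Two changes are treated as independent if they share no affected target."""
    return targets_a.isdisjoint(targets_b)

# Hypothetical affected-target sets for three changes.
c1_targets = {"//app:main", "//lib:util"}
c2_targets = {"//docs:site"}
c3_targets = {"//lib:util"}

print(independent(c1_targets, c2_targets))  # True
print(independent(c1_targets, c3_targets))  # False
```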

  45. Detecting Conflicts: Example. [Diagram: the original build graph for H over Target X, Target Y, Target Z; applying C1 gives the build graph for H ⊕ C1; applying C2 gives the build graph for H ⊕ C2.]

  46. Detecting Conflicts: Puzzle. [Diagram: build graphs for H, H ⊕ C1, and H ⊕ C2 over Target X, Target Y, Target Z.]
      • C1 and C2 are conflicting.
      • But the intersection of affected targets is empty!

  47. Detecting Conflicts: Composition. Each target in a build graph carries a numeric label:
      • Original build graph for H: Target X = 1, Target Y = 2, Target Z = 3
      • Build graph for H ⊕ C1: Target X = 4, Target Y = 5, Target Z = 3; affected targets {(x, 4), (y, 5)}
      • Build graph for H ⊕ C2: Target X = 1, Target Y = 2, Target Z = 6; affected targets {(z, 6)}
      • Build graph for H ⊕ C1 ⊕ C2: Target X = 4, Target Y = 5, Target Z = 7; affected targets {(x, 4), (y, 5), (z, 7)}
      {(x, 4), (y, 5)} ∪ {(z, 6)} ≠ {(x, 4), (y, 5), (z, 7)}. Thus, C1 and C2 are conflicting!
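
The composition check on this slide can be reproduced with the slide's own numbers. The sketch treats the number on each target as an opaque per-target fingerprint, which is an assumption about what the labels mean; `affected` and `conflicting` are hypothetical helper names.

```python
from typing import Dict, Set, Tuple

Graph = Dict[str, int]   # target name -> label shown on the slide

def affected(base: Graph, changed: Graph) -> Set[Tuple[str, int]]:
    """(target, new label) pairs for targets whose label differs from the base."""
    return {(t, v) for t, v in changed.items() if base.get(t) != v}

def conflicting(base: Graph, g1: Graph, g2: Graph, g12: Graph) -> bool:
    """Changes conflict if applying both is not the union of applying each."""
    return affected(base, g1) | affected(base, g2) != affected(base, g12)

H       = {"x": 1, "y": 2, "z": 3}
H_C1    = {"x": 4, "y": 5, "z": 3}
H_C2    = {"x": 1, "y": 2, "z": 6}
H_C1_C2 = {"x": 4, "y": 5, "z": 7}

names_c1 = {t for t, _ in affected(H, H_C1)}
names_c2 = {t for t, _ in affected(H, H_C2)}
print(names_c1 & names_c2)                     # set(): the intersection test sees no conflict
print(conflicting(H, H_C1, H_C2, H_C1_C2))     # True: the composition test does
```

Target Z ends up with label 7 rather than 6 once both changes are applied, which is exactly what the intersection approach misses.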

  48. Detecting Conflicts: Summary
      • Intersection Approach: ✖ does not detect all kinds of conflicts
      • Union Approach: ✖ determining conflicts for n changes requires n^2 build graphs!
      • Hybrid Approach: ✔ only 7.9% of changes cause a change to the build graph
      • Union Graph Approach (details in paper)
