Keeping Master Green at Scale
Sundaram Ananthanarayanan, Masoud Saeida Ardekani, Denis Haenikel, Balaji Varadarajan, Simon Soriano, Dhaval Patel, Ali-Reza Adl-Tabatabai
(https://eng.uber.com/research/keeping-master-green-at-scale/)

Monorepo is popular!
• Single, shared repo hosting companies’ software assets
[Diagram: Monorepo vs. Multirepo]
Advantages of a Monorepo [Ciera et al. @ICSE’18]
✔ Simplified Dependency Management
✔ Improved Code Visibility

Always green master considered hard
• Monorepos handle a huge volume of commits every day
• Existing CI workflows do not guarantee an always green master
  ‐ Too hard at scale
• Submit Queue guarantees an always-green master at scale

Outline
01 Why green master is hard
02 Probabilistic Speculation
03 Conflict Analyzer
04 Evaluation

Lifecycle of a change in monorepo
[Diagram: a developer submits a change to the monorepo as a revision for peer review; a CI server builds and tests the change and reports the result.]

Challenge: Concurrent conflicting changes
[Diagram: Alice and Bob concurrently commit changes C1 and C2 to master; with both C1 and C2 on master, build steps fail.]

Example of a real conflict

How often do conflicts happen?
Observation: the chance of a conflict rises from 5% to 40% as the number of concurrent and potentially conflicting changes increases.

Drawbacks of a red master
• Delayed rollouts
• Hampered productivity
• Complex rollbacks

Keeping master green: Queue
[Diagram: Alice, Bob, and Carol place changes C1, C2, and C3 in a queue in front of master, whose head is H.]
1. Alice, Bob, and Carol enqueue the changes they want to commit.
2. C1 is built and tested against the mainline head (H).
3. Build steps for H ⊕ C1 succeed.
4. C1 is committed and becomes the new head. C2 is tested against it.
5. Build steps for H ⊕ C2 (with H now the new head) fail and C2 is rejected.
✔ Guarantees an always-green master by serializing changes
✖ Does not scale to 1000s of changes/day
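The queue walkthrough above can be sketched as a short loop. This is only an illustration under stated assumptions: `run_serial_queue` and `build_passes` are hypothetical names, the build is modeled as a plain predicate, and the outcome for C3 (which the slides do not show) is assumed to be a pass.

```python
def run_serial_queue(head, queue, build_passes):
    """Test each queued change against the current head, one at a time.

    head: list of changes already on master (oldest first).
    queue: pending changes in submission order.
    build_passes(head, change): hypothetical stand-in for running the
    build steps for head + change; returns True on success.
    """
    committed, rejected = [], []
    for change in queue:
        if build_passes(head, change):
            head = head + [change]      # the change becomes the new head
            committed.append(change)
        else:
            rejected.append(change)     # master is left untouched
    return head, committed, rejected


# Scenario from the slides: C1 passes, C2 fails; C3 passing is an assumption.
head, committed, rejected = run_serial_queue(
    ["H"], ["C1", "C2", "C3"], lambda head, change: change != "C2"
)
print(head, committed, rejected)
```

Because every build waits for the previous decision, the total latency grows linearly with the queue length, which is the scaling problem the slide points out.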

Keeping master green: Batching changes
[Diagram: Alice, Bob, and Carol enqueue C1, C2, and C3 in front of master head H; C1 and C2 form a batch.]
• C1 and C2 are batched and build steps are run.
✔ Improves the throughput if batches succeed more often than not
✖ Testing batches masks intermediate changes that fail
✖ Batches will fail often as the size of the batch increases
What happens when batches fail?
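A rough way to see why larger batches fail more often: if each change passes with the same probability and changes fail independently (both are simplifying assumptions, not claims from the slides, and conflicts between changes are ignored), a batch passes only when every change in it passes. The 0.95 figure below is a hypothetical input.

```python
def batch_success_probability(per_change_success, batch_size):
    """Probability that a whole batch passes, assuming independent changes."""
    return per_change_success ** batch_size


for size in (1, 10, 20):
    print(size, round(batch_success_probability(0.95, size), 3))
```

Under these assumptions a batch of 10 passes roughly 60% of the time and a batch of 20 roughly 36% of the time.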

Keeping master green: Goals
Guarantee serializability
• Illusion of a single queue when committing changes
• Git only offers serializability of patches
Provide reasonable SLAs
• Overheads should be short enough for developers to trade speed for correctness!
Challenge: how to do this at scale? (1000s of commits/day)

Submit Queue: Overview
Speculation Engine
• Speculates on success/failure of changes
• Builds speculation graph
Conflict Analyzer
• Determines independent changes
• Constructs conflict graph
Planner Engine
• Selects most valuable builds from speculation engine
• Executes builds and commits changes

Speculation Tree
C1, C2, C3: pending changes
B1: build steps for H ⊕ C1
B2: build steps for H ⊕ C2
B1.2: build C2 against (H ⊕ C1)
B1
  B1 fails (C1 rejected) → B2
    B2 fails (C2 rejected) → B3
    B2 succeeds (C2 commits) → B2.3
  B1 succeeds (C1 commits) → B1.2
    B1.2 fails (C2 rejected) → B1.3
    B1.2 succeeds (C2 commits) → B1.2.3
1. Precompute the outcome of committing C2 under different realities
2. Commit or reject C2 based on the outcome of B1 and one of {B2, B1.2}
Challenge: Which builds to run?
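The two steps on this slide (precompute outcomes under different realities, then pick the one that matches) can be sketched as follows. The names `speculate`, `decide`, and `build_passes` are hypothetical, and a build is identified here by the pair (earlier changes assumed committed, change under test), so `(("C1",), "C2")` plays the role of B1.2.

```python
from itertools import combinations


def speculate(changes, build_passes):
    """Run every build in the speculation tree and record its outcome."""
    results = {}
    for k, change in enumerate(changes):
        # One build per subset of earlier changes that might have committed.
        for r in range(k + 1):
            for assumed in combinations(changes[:k], r):
                results[(assumed, change)] = build_passes(assumed, change)
    return results


def decide(changes, results):
    """Walk the tree: each decision selects which precomputed build applies next."""
    committed = ()
    decisions = {}
    for change in changes:
        passed = results[(committed, change)]
        decisions[change] = "commit" if passed else "reject"
        if passed:
            committed += (change,)
    return decisions


# Hypothetical outcomes: every build of C2 fails, everything else passes.
results = speculate(["C1", "C2", "C3"], lambda assumed, change: change != "C2")
print(len(results))                          # number of builds in the tree
print(decide(["C1", "C2", "C3"], results))
```

For three changes the tree holds seven builds, but `decide` reads only three of them (here B1, B1.2, and B1.3), which is what motivates choosing builds selectively.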

Approach #1: Speculate Them All
Speculate on all possible outcomes equally
● Selects builds in a breadth-first order
● Need to run 2^n builds in parallel to commit n changes
Does not scale for 1000s of changes/day
Leads to substantial waste of resources
[Diagram: speculation tree for C1, C2, C3 with builds B1; B2, B12; B3, B23, B13, B123.]

Speculate Them All: Resource Wastage
[Diagram: speculation tree for C1, C2, C3 with builds B1; B2, B12; B3, B23, B13, B123.]

Speculate Them All: Observation
If we select and execute builds whose outcomes are most likely to be needed, then we require only n (out of 2^n) builds.
Challenge: Which n builds are likely to be needed?

Probabilistic Speculation
[Diagram: speculation tree with builds B1; B2, B1.2; B3, B2.3, B1.3, B1.2.3, annotated with probabilities.]
● Each build for a change C is labeled with the probability that its result is used to decide whether to commit or reject C.
● The root B1 is always needed, as it is used to determine whether C1 can be committed.
● One input is the probability that change C1 succeeds individually.
● Another input is the probability that C2 conflicts with C1 (B1.2: build C2 against H ⊕ C1).

Probabilistic Speculation: Summary
Choose most valuable builds by determining
• Probability of success of a change
• Probability of a conflict between changes
[Diagram: speculation tree with builds B1; B2, B12; B3, B23, B13, B123.]
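To make the summary concrete, here is a minimal sketch of ranking builds by the probability that their result is needed. It is not the paper's exact model: `build_success_prob` assumes, for illustration only, that a build passes when the change succeeds on its own and independently avoids a conflict with each change assumed committed. All function names and the probability values are hypothetical.

```python
from math import prod


def build_success_prob(assumed, change, p_succ, p_conf):
    # Simplifying assumption: independent success and pairwise conflicts.
    return p_succ[change] * prod(1 - p_conf.get((a, change), 0.0) for a in assumed)


def needed_probabilities(changes, p_succ, p_conf):
    """Map each build (assumed, change) to the probability its result is used."""
    realities = {(): 1.0}   # committed-so-far tuple -> probability of that reality
    needed = {}
    for change in changes:
        next_realities = {}
        for committed, prob in realities.items():
            needed[(committed, change)] = prob
            p = build_success_prob(committed, change, p_succ, p_conf)
            next_realities[committed + (change,)] = prob * p    # change commits
            next_realities[committed] = prob * (1 - p)          # change is rejected
        realities = next_realities
    return needed


def most_valuable_builds(changes, p_succ, p_conf, budget):
    needed = needed_probabilities(changes, p_succ, p_conf)
    return sorted(needed, key=needed.get, reverse=True)[:budget]


p_succ = {"C1": 0.9, "C2": 0.9, "C3": 0.9}   # hypothetical values
print(most_valuable_builds(["C1", "C2", "C3"], p_succ, {}, budget=3))
```

With these inputs the root build gets probability 1, the build of C2 on top of C1 gets 0.9, and the three selected builds are the ones that assume every earlier change commits.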

Evaluating the success and conflict probabilities
● Logistic regression to train prediction models
  ○ Feature set includes 100+ hand-picked features
  ○ Prediction accuracy of 97%

Features for Training ML Models
Change
● # affected targets
● # git commits
● # files changed
● status of pre-submit checks
Revision
● revision is a container for changes
● # changes submitted
● revert and test plans
● # submit attempts made
Developer
● developer name
● employment proficiencies
Speculation
● dynamic features to re-adjust weights based on initial predictions
● # speculations succeeded
● # speculations failed
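As a reminder of how a logistic regression model turns features like these into a probability, here is a minimal scoring sketch. The feature names are loosely taken from the slide, but the weights and bias are made up for illustration; they are not the trained values from the talk.

```python
import math


def predict_success(features, weights, bias=0.0):
    """Logistic model: sigmoid of a weighted sum of feature values."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights: more files and targets lower the odds, passing
# pre-submit checks raises them.
weights = {"files_changed": -0.05, "affected_targets": -0.01, "presubmit_passed": 2.0}

small_change = {"files_changed": 2, "affected_targets": 5, "presubmit_passed": 1}
large_change = {"files_changed": 60, "affected_targets": 300, "presubmit_passed": 0}
print(predict_success(small_change, weights), predict_success(large_change, weights))
```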

Conflict Analyzer
● So far, we assumed all changes potentially conflict with each other
  ○ Cannot commit in parallel
● What if changes can be proved to be independent?
  ○ Commit changes in parallel
  ○ Trim speculation space
● We use Conflict Analyzer to find independent changes

Conflict Analyzer: Commit Changes in Parallel
[Diagram: conflict graph for changes C1, C2, C3, where C1 and C2 are independent and conflict with C3.]
[Diagram: speculation graph with two roots, B1 for C1 and B2 for C2. C3 is decided by B3 (B1 fails, B2 fails), B1.3 (B1 succeeds, B2 fails), B2.3 (B1 fails, B2 succeeds), or B1.2.3 (B1 succeeds, B2 succeeds).]
Insight: Changes C1 and C2 can be committed in parallel.

Conflict Analyzer: Trim Speculation Space
[Diagram: conflict graph for C1, C2, C3, where C1 conflicts with independent changes C2 and C3.]
[Diagram: speculation graph rooted at B1. If B1 fails, C2 and C3 are decided by B2 and B3; if B1 succeeds, by B1.2 and B1.3.]
Insight: Because C3 does not speculate on C2, the number of possible builds for C3 reduces to 2.
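The effect of the conflict graph on the speculation space can be sketched by letting each change speculate only on the earlier changes it conflicts with. This is a simplification that looks at direct conflicts only; `speculation_builds` and the pair-set representation of the conflict graph are assumptions made for the example.

```python
from itertools import combinations


def speculation_builds(changes, conflicts):
    """List the candidate builds per change.

    conflicts: set of frozenset({a, b}) pairs from the conflict graph.
    A build is (earlier conflicting changes assumed committed, change).
    """
    builds = {}
    for k, change in enumerate(changes):
        relevant = [c for c in changes[:k] if frozenset((c, change)) in conflicts]
        builds[change] = [
            (assumed, change)
            for r in range(len(relevant) + 1)
            for assumed in combinations(relevant, r)
        ]
    return builds


changes = ["C1", "C2", "C3"]

# C1 conflicts with C2 and with C3; C2 and C3 are independent.
trimmed = speculation_builds(changes, {frozenset(("C1", "C2")), frozenset(("C1", "C3"))})
print(trimmed["C3"])   # two builds: B3 and B1.3

# C1 and C2 are independent; both conflict with C3.
parallel = speculation_builds(changes, {frozenset(("C1", "C3")), frozenset(("C2", "C3"))})
print(parallel["C1"], parallel["C2"])   # one root build each
```

In the second case C1 and C2 each have a single build that depends on nothing else, so they can be built and committed in parallel, while C3 still needs up to four builds.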

Conflict Analyzer: Detecting conflicts at scale
● Build system to detect if changes are independent
● Code partitioned into smaller entities called targets
● Every change affects a set of targets
[Diagram: example build graph. Target T1 (main.exe) is built from targets T2 (util.o) and T3 (main.o), which are built from the source files util.c, util.h, and main.c.]

Detecting Conflicts: Intuition Two changes are independent if they affect a disjoint set of targets.
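The intuition can be sketched against a small build graph. The edges below are an assumption modeled on the example build graph (the slides do not spell out which object file includes `util.h`), and the `docs` target is a hypothetical addition so that an independent pair exists.

```python
def affected_targets(deps, changed_files):
    """Return every target that transitively depends on a changed file."""
    def touches_change(node):
        if node in changed_files:
            return True
        return any(touches_change(dep) for dep in deps.get(node, ()))
    return {target for target in deps if touches_change(target)}


def independent(deps, files_a, files_b):
    """Intersection approach: disjoint affected target sets."""
    return affected_targets(deps, files_a).isdisjoint(affected_targets(deps, files_b))


DEPS = {
    "main.exe": ["util.o", "main.o"],
    "util.o": ["util.c", "util.h"],   # assumed edges
    "main.o": ["main.c", "util.h"],   # assumed edges
    "docs": ["README.md"],            # hypothetical extra target
}

print(affected_targets(DEPS, {"main.c"}))
print(independent(DEPS, {"main.c"}, {"util.c"}))      # both affect main.exe
print(independent(DEPS, {"main.c"}, {"README.md"}))
```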

Detecting Conflicts: Example
[Diagram: the original build graph for H with targets X, Y, and Z, next to the build graph for H ⊕ C1 (after applying C1) and the build graph for H ⊕ C2 (after applying C2).]

Detecting Conflicts: Puzzle
[Diagram: the original build graph for H with targets X, Y, and Z, next to the build graph for H ⊕ C1 (after applying C1) and the build graph for H ⊕ C2 (after applying C2).]
● C1 and C2 are conflicting
● But, the intersection of affected targets is empty!

Detecting Conflicts: Composition
Each build graph is shown with the number attached to each target, followed by the affected (target, number) pairs:
  H:            X = 1, Y = 2, Z = 3
  H ⊕ C1:       X = 4, Y = 5, Z = 3   affected: {(x, 4), (y, 5)}
  H ⊕ C2:       X = 1, Y = 2, Z = 6   affected: {(z, 6)}
  H ⊕ C1 ⊕ C2:  X = 4, Y = 5, Z = 7   affected: {(x, 4), (y, 5), (z, 7)}
{(x, 4), (y, 5)} ∪ {(z, 6)} ≠ {(x, 4), (y, 5), (z, 7)}
Thus, C1 and C2 are conflicting!
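The composition check can be reproduced with the numbers from this slide. The sketch treats each build graph as a mapping from target to its number and compares the union of the individually affected pairs with the pairs affected when both changes are applied; the function names are assumptions.

```python
def affected(base, after):
    """(target, number) pairs that differ from the base build graph."""
    return {(target, value) for target, value in after.items() if base.get(target) != value}


def conflicting(base, with_c1, with_c2, with_both):
    """Union approach: the changes conflict if their effects do not compose."""
    return affected(base, with_c1) | affected(base, with_c2) != affected(base, with_both)


H = {"x": 1, "y": 2, "z": 3}
H_C1 = {"x": 4, "y": 5, "z": 3}
H_C2 = {"x": 1, "y": 2, "z": 6}
H_C1_C2 = {"x": 4, "y": 5, "z": 7}

targets_c1 = {t for t, _ in affected(H, H_C1)}
targets_c2 = {t for t, _ in affected(H, H_C2)}
print(targets_c1 & targets_c2)               # empty: the intersection approach sees no conflict
print(conflicting(H, H_C1, H_C2, H_C1_C2))   # True: z differs once both changes are applied
```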

Detecting Conflicts: Summary
• Intersection Approach
  ✖ Does not detect all kinds of conflicts
• Union Approach
  ✖ Determining conflicts for n changes requires n^2 build graphs!
• Hybrid Approach
  ✔ Only 7.9% of changes cause a change to the build graph
• Union Graph Approach (details in paper)
