LARGE SCALE OPEN SOURCE DEVELOPMENT MODELS
A COMPARATIVE ANALYSIS
By Joe Gordon
ABOUT ME
* OpenStack Developer at HP
* Hacking on OpenStack for 4 years
* Contact information: jogo on freenode, github.com/jogo
WHY
* Saw OpenStack grow from around 60 developers to 2,000 developers
* Unusual development model
* But how do other projects solve the same problems?
* Linux kernel: 2 years to reach 100 contributors, starting in 1991
* Linux 2.0 had 190 contributors in its credits in 1996
* OpenStack: 1 year to reach 100 contributors, starting in 2010
* Docker: over 300 contributors in its first year, 2013

Time to reach 200 contributors per month:
* Linux: 1991 - June 2004 (13 years)
* Debian: 1993 - March 2007 (14 years)
* OpenStack: 2010 - October 2012 (2 years)
Linux, Debian, Docker, OpenStack (clockwise from top left) source:
* Open source instead of standards bodies
* Balancing corporate interests
Linux Foundation Gold and Platinum Members
constrained to produce designs which are copies of the communication structures of these organizations
* Linux Kernel
* Apache Software Foundation
* Debian
* OpenStack
* Docker
* Time-based release model (2-3 months)
* Rolling development model, continually integrating major changes
* Separate stable team
* Release single artifact
* Rarely consumed directly by end users
* Per month: 1,000 contributors, 5,000 to 7,000 patches
* Lieutenants / subsystem maintainers: 100-150 maintainers
* Chain of trust
* No elections for technical positions
* Decentralized review process
* Each maintainer has their own git tree
* Only about 1% of patches are directly merged by BDFL
Usually one layer of subsystem maintainers, but sometimes up to three
* Communication: mailing lists
* Git
* CI: no automated pre-commit, yes post-commit
* Code review: decentralized, more mailing lists
Process can be quick for minor fixes or take years for controversial changes
Prefer in the open, but not required
* Find correct maintainer
* Submit patches via email
next trees.
* Chain of trust
* About the individual
* Value frankness over politeness
* Corporate friendly
* No single company controls
* Not much automated pre-commit testing
* Failing testing is very bad for author
* ASF is more of a governance umbrella and culture
* Each project does its own thing
* 150+ separate releases
* Separate projects
* 4,431 committers
* 150+ top-level projects
* 740 contributors in past 12 months?
* In-project scaling is up to each project
* Apache Spark had 570 contributors in past 12 months
* OpenOffice had 31
* Flat(ish) trust model
* 'Review then commit' vs. 'commit then review'
In order to reduce friction and allow for diversity to emerge, rather than forcing a monoculture from the top ... each project is delegated authority over development
charter and its own governing rules.
Makes technical decisions
When the group felt that the person had "earned" the merit to be part of the development community, they granted direct access to the code repository, thus increasing the group and increasing the ability of the group to develop the program, and to maintain and develop it more effectively.
* Communication: mailing list
* SVN
* Optional CI
* Central review system: lazy consensus, Review Board
* If it didn't happen on a mailing list, it didn't happen.
* Different projects have different review flows
* Review then commit, or commit then review
* Review Board
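Lazy consensus, as used in the review flows above, can be sketched roughly as "a proposal passes if nobody objects within a waiting period". The snippet below is only an illustration: the function name, the 72-hour window, and the list-of-objections representation are assumptions, not something the slides or the ASF tooling define.

```python
from datetime import datetime, timedelta

# Hypothetical waiting period; the slides do not state a specific window.
DEFAULT_WINDOW = timedelta(hours=72)


def lazy_consensus_reached(proposed_at, objections, now, window=DEFAULT_WINDOW):
    """Return True when the window has elapsed and nobody has objected.

    proposed_at: datetime the proposal was sent to the mailing list
    objections:  list of objections raised so far (any representation)
    now:         current datetime
    """
    window_elapsed = now - proposed_at >= window
    return window_elapsed and not objections


proposed = datetime(2015, 1, 1, 12, 0)
print(lazy_consensus_reached(proposed, [], datetime(2015, 1, 5, 12, 0)))
```

Silence counts as assent here, which is why the check only looks for objections rather than counting approvals.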
* Lazy consensus
* Focus is on the team
* All decisions are team based
* Focus is on contributors, not companies
* No monoculture
* Within the ASF we worry about any community which centers around a few individuals who are working virtually uncontested.
* When it's ready, not time based
* Notoriously slow
* Every 2 years
* Lots and lots of artifacts
* Unstable, Testing, Stable
* Package maintainers
* 3,200 Debian Developers
* Can have individual maintainers or groups (via a mailing list)
* No review, trust/burden maintainers more
Roles
* Maintainer: the person making the Debian package of the program.
* Sponsor: a person who helps maintainers to upload packages to the
* Debian Developer (DD): a member of the Debian project with full upload rights to the official Debian package archive.
* Debian Maintainer (DM): a person with limited upload rights to the
* Communication: mailing list, web services, lots of IRC
* Poor automated testing
* Quality control is ultimately up to individual maintainers
* Half of the CI available isn't official
* No peer review system, except for new packages (FTP Masters)
* Rotating leadership (elections)
* Do-ocracy: an individual Developer may make any technical or nontechnical decision with regard to their own work
* Open development
* Independent, not 'profit-driven': no imposed decisions by who has money, infrastructure, people
* No benevolent dictator, no oligarchy
* It is all about the individual (although individuals can form groups)
* Territorial
* Time based, every 6 months
* Continuous delivery
* Set of separate but related projects, usually with one-way dependencies
* Lots of artifacts
* Sometimes consumed directly by consumers (without distro)
* No rolling development; development on master freezes before a release
* Break down repositories and build teams around each repository
* 31 teams
* 150+ repositories
* 5,000 commits per month from 500 contributors
* 282 core developers
* Flat trust model
* Strong centralized review process (two core reviews)
* Automated testing to reduce reviewer burden
* Having trouble scaling the team responsible for a single repository
* Can't get past 15 or so members on a core team
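The "two core reviews plus automated testing" rule above amounts to a simple merge gate. The sketch below is a toy model under stated assumptions: the `Change` class and `can_merge` function are hypothetical names for illustration and are not Gerrit's or OpenStack's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """A proposed change under review (hypothetical model, not Gerrit's API)."""
    core_approvals: set = field(default_factory=set)  # core reviewers who approved
    ci_passed: bool = False                           # result of automated testing


def can_merge(change, required_core_reviews=2):
    """Allow a merge only if tests passed and enough distinct core reviewers approved."""
    enough_reviews = len(change.core_approvals) >= required_core_reviews
    return change.ci_passed and enough_reviews


change = Change(core_approvals={"alice", "bob"}, ci_passed=True)
print(can_merge(change))
```

Using a set for approvals means the same core reviewer approving twice still counts once, which matches the intent of requiring two separate reviewers.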
Flat as possible
* Communication: mailing lists, IRC, code reviews
* Git
* Code review: Gerrit
* Lots and lots of automated testing
* In-person design summits twice a year
* Group over individual
* Egalitarian
* Elections
* Welcoming to new contributors
* Corporate friendly
* Not controlled by single company
* Lazy consensus
* Decentralized design
* Uniform tooling/process across projects
OPENSTACK'S 4 OPENS
* Open Source, not open core
* Open Design
* Open Development
* Open Community
* Lazy consensus
* Technical governance is a meritocracy
* Put everything in the public
* Cross-project issues
* Team size
* Single vision
'GitHub' development model
* Every 2 months
* Separate release branch
* master isn't frozen
* 37 maintainers in Docker
* 10-15 repos in total
* Maintainers / subsystem maintainers
Have to submit a pull request when going on vacation!
* No direct pushes
* Centralized review in GitHub
1) They share responsibility in the project's success.
2) They have made a long-term, recurring time investment to improve the project.
3) They spend that time doing whatever needs to be done, not necessarily what is the most interesting or fun."

This "cellular division" is the primary mechanism for scaling maintenance of the project as it grows.
Ideally, the BDFL role is like the Queen of England: awesome crown, but not an actual operational role day-to-day. The real job of a BDFL is to NEVER GO AWAY. ... the BDFL will always be there, preserving the philosophy and principles of the project, and keeping ultimate authority over its fate. This gives us great flexibility in experimenting with various governance models, knowing that we can always press the "reset" button without fear of fragmentation or deadlock. See the US congress for a counterexample.

BDFL daily routine:
* Is the project governance stuck in a deadlock or irreversibly fragmented?
  * If yes: refactor the project governance
* Are there issues or conflicts escalated by core?
  * If yes: resolve them
* Go back to polishing that crown.
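The BDFL daily routine quoted above is effectively a small decision procedure, and can be written out as one. This is a playful sketch only: the function name, its parameters, and the returned action strings are illustrative assumptions, not part of Docker's governance documents.

```python
def bdfl_daily_routine(governance_stuck, escalated_issues):
    """Return the list of actions the routine prescribes for one day.

    governance_stuck: True if governance is deadlocked or irreversibly fragmented
    escalated_issues: issues or conflicts escalated by core maintainers
    """
    actions = []
    if governance_stuck:
        actions.append("refactor the project governance")
    for issue in escalated_issues:
        actions.append(f"resolve: {issue}")
    # The routine always ends the same way, whatever happened above.
    actions.append("go back to polishing that crown")
    return actions


for action in bdfl_daily_routine(False, ["disagreement over plugin API"]):
    print(action)
```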
* Communication: IRC, Google Groups, pull requests (all decisions are a pull request)
* Git: GitHub
* CI: Jenkins, gate pull requests
5 States of a review
Check DCO etc. Partially automated
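One part of a DCO check that lends itself to automation is confirming that a commit message carries a `Signed-off-by:` trailer, the line that `git commit -s` appends. The sketch below assumes a simple "Name <email>" trailer format; the function name and regular expression are illustrative and are not the check Docker actually runs.

```python
import re

# Matches a line such as: Signed-off-by: Jane Doe <jane@example.com>
# The exact pattern is an assumption made for this sketch.
SIGNOFF = re.compile(r"^Signed-off-by: .+ <[^<>@\s]+@[^<>\s]+>$", re.MULTILINE)


def has_dco_signoff(commit_message):
    """Return True if the commit message contains a Signed-off-by trailer."""
    return SIGNOFF.search(commit_message) is not None


message = "Fix typo in docs\n\nSigned-off-by: Jane Doe <jane@example.com>"
print(has_dco_signoff(message))
```

A check like this only confirms that the trailer is present; whether the sign-off matches the commit author would need a separate comparison.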
Commit message bodies are optional!
* Embraces the BDFL
* Open source
* Open design
* Docker the company dominates
* Most of the maintainers are Docker employees
* Automated testing
* Be nice, and encourage diversity and participation
There is no one size fits all solution
* Open Source
* Open Core
* Open Design
* Open Development
* Open Community
* Bug tracking
* Review process
* Testing
* Overall workflow
* DCO vs. CLA
* Barrier to entry for new contributors
* Communication
* Team scaling model: chain of trust model, flat trust model, maintainers
* Release cadence: stabilization periods, rolling development
* CI/CD
* Number of artifacts
* Decision-making process: consensus model
* Project culture: individual vs group
* BDFL
* Vision
* Managing competing interests
* Corporate ownership
* Team vs individual
* First come first serve?
* How do you fire someone?
Slides can be found at jogo.github.io Powered by reveal.js