Summary: A bigger team does not automatically mean more efficient R&D. We used to assume that the larger the team, the higher its R&D efficiency and the more it could get done. In practice, once a team grows past a certain size, overall R&D efficiency drops, sometimes very quickly. Why do our releases get slower and slower as the team gets bigger and bigger?

Planning & Editor | Ya Chun

The nature of team development

We once worked with a company that started with only 8 employees. At that time it shipped one or two versions a month that customers could actually use. It built hospital information systems (HIS), and the team felt it was doing a good job. Later the team grew rapidly to about 80 people, yet no version was released for half a year. The implementation team lost face with customers, who asked why requirements raised six months earlier still had not been delivered.

Here is the paradox: we expect a bigger team to be more efficient and to get more done, but we find that as the team grows, overall R&D efficiency drops, sometimes very quickly.

From the team's perspective, collaboration gets slower and more expensive as headcount grows. The more people there are, the more problems, conflicts, and waiting there are. Conflicts here means conflicts during code integration and release; waiting means people waiting for one another during integration and development.

Here are two specific scenarios.

Suppose two programmers, A and B, are working on the same mainline. At first, each of A's commits to the mainline succeeds. Then B commits a version that breaks the build. Now A cannot submit: any new submission is held up until B fixes the problem. In this case, A's submission conflicts with B's work.

In the second case, several branches are merged into the same branch. One feature branch merges into the trunk first; another arrives a little later and cannot be merged directly because its baseline now differs from the trunk. The code conflicts have to be resolved before it can be merged.
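
To make the "different baselines" problem concrete, here is a minimal Python sketch of a three-way merge check. It is a toy model under stated assumptions, not how Git works internally: each branch is represented as a dict from file name to content, and `find_conflicts` and the sample data are hypothetical names chosen for illustration.

```python
def find_conflicts(base, ours, theirs):
    """Return the sorted file names that both sides changed differently since base."""
    conflicts = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        ours_changed = o != b
        theirs_changed = t != b
        # A conflict needs both sides to diverge from the baseline AND from each other.
        if ours_changed and theirs_changed and o != t:
            conflicts.append(name)
    return conflicts


# Common baseline that both feature branches started from (hypothetical data).
base = {"billing.py": "v1", "report.py": "v1"}
# The trunk after the first feature branch was merged.
trunk = {"billing.py": "v2-first", "report.py": "v1"}
# The later feature branch, still built on the old baseline.
late_branch = {"billing.py": "v2-second", "report.py": "v2-second"}

print(find_conflicts(base, trunk, late_branch))  # ['billing.py']
```

Only `billing.py` conflicts, because both sides changed it since the shared baseline; `report.py` was changed on one side only and merges cleanly.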

In the picture above, suppose there are three people: developers A and B and tester C. Each dot represents a task. A works through his own tasks one after another. After finishing the third, he needs to align with B, so he sends B a ping. But B has his own rhythm and is focused on his own task, so he does not respond to A's request right away. B then finds that one of his tasks is ready for testing and tells C. C finds a problem and sends a pong back immediately, but by then B is busy with another task and does not respond. Seeing no response, C sends the pong again. Only at this moment does B see the messages from both A and C. He deals with A's request first and sends A a pong in reply.

We found that collaboration between programmers, and between testers and developers, is in fact asynchronous and delayed throughout development. People do not respond to a request and start collaborating immediately. They usually act at their own pace, which causes delays. So as the product becomes more complex, more collaboration is needed, the team becomes larger and more complex, and the cost of collaboration rises very quickly.
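
One common way to see why collaboration cost rises so fast is to count the possible person-to-person communication channels in a team, n(n-1)/2 for n people. This pairwise model is an assumption added here for illustration, not a measurement from the case above; the team sizes 8 and 80 come from the earlier example, and `pairwise_channels` is a hypothetical helper name.

```python
def pairwise_channels(team_size):
    """Number of distinct person-to-person channels in a team of the given size."""
    return team_size * (team_size - 1) // 2


for size in (8, 80):
    print(size, "people ->", pairwise_channels(size), "channels")
# 8 people -> 28 channels
# 80 people -> 3160 channels
```

Under this assumption, a tenfold increase in headcount multiplies the number of potential channels by more than a hundred.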

In such an asynchronous, delayed collaboration process, programmers need a matching R&D mode for their daily development work, one that keeps information continuously synchronized during collaboration and lets people respond quickly.

The software delivery process is essentially a collaborative process of developers around the code base. Production code, configuration, environment, and release process can all be described in code and stored in the code base.

Thus, the purpose of an R&D mode is to constrain how we behave when we work around the code base. It is essentially a set of behavioral constraints centered on the code base.

The R&D mode, narrowly understood as the branching pattern, contains a series of behavioral constraints: the types of branches and how they are identified, the life cycle of each branch, how commits flow between branches and what restricts that flow, and the correspondence between branches and code. We will discuss each of them.

An R&D mode is a set of constraints on R&D behavior whose goal is to avoid conflicts and reduce waiting. The biggest problems that more people bring to collaboration are more conflicts and more waiting, so a good R&D mode should avoid conflicts and reduce waiting as much as possible.

First, let’s look at the correspondence between R&D mode and R&D behavior.

There is a mapping between development actions and code base actions. When we start developing a new feature, we create a new feature branch. When we submit work for integration, we commit and push. Once the work passes integration validation, we merge the branch.

Similarly, moving work into the release-pending state is a merge, and completing a release means creating a tag. The actions in the code base record our development actions, so the two can be mapped one to one.
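
The mapping described above can be sketched as a small lookup table. This is only an illustration: the action names are paraphrased from the text, the Git commands are typical choices rather than ones prescribed by the article, and `<name>` and `<version>` are placeholders, not real branch or tag names.

```python
# Hypothetical lookup table: development action -> typical Git commands.
ACTION_TO_GIT = {
    "start feature development": ["git checkout -b feature/<name>"],
    "submit for integration": ["git commit", "git push"],
    "pass integration validation": ["git merge feature/<name>"],
    "move to release-pending": ["git merge"],
    "complete the release": ["git tag <version>"],
}


def commands_for(action):
    """Return the Git commands recorded for a development action, or [] if unknown."""
    return ACTION_TO_GIT.get(action, [])


for action, commands in ACTION_TO_GIT.items():
    print(action, "->", ", ".join(commands))
```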

The only way to avoid conflicts is to isolate people's work from one another. In the code base, this isolation is usually achieved with branches.

Waiting is caused by information being out of sync, so to reduce waiting we synchronize information as often as possible. In the code base, that means keeping baselines in sync, for example through frequent commits. So branches isolate work and avoid conflicts, while frequent commits and merges synchronize information and reduce waiting.

Q: If only one person is developing the software, what does the branching model look like? Can one person have conflicts?

When one person develops alone, there are no conflicts and no need for many branches; a single branch is enough. There is no information to wait for, so that person can go straight through to the end. But with 10 or 100 people, work has to be isolated, conflicts appear, and people wait. As more and more people collaborate, the branching pattern keeps changing.

Analysis of four common branching patterns

Trunk-based development

When the team is small (say 1 or 2 people), the most common development model is trunk-based development (TBD), also called trunk development.

Trunk development means that a single trunk branch runs all the way through. Code must be continuously integrated into the trunk, so there is little conflict during development and no need for work isolation. All developers commit and integrate frequently on the trunk. In this branching mode, the only fork happens at release time, when a release branch is created to isolate the released version.

In this mode there is no branch isolation, and information is kept in sync through continuous, frequent commits. When the team is fairly small and its overall engineering capability is fairly strong, this is the R&D mode we recommend.

However, as more and more people participate in the development, the conflict rate of trunk development will greatly increase, and the requirement of engineering ability will become higher and higher.

So trunk development is not a panacea. The more people work on the trunk, the greater the chance that commits conflict, and the greater the risk in resolving those conflicts. With two people, a conflict can only be with the one other person; with 10 people, it creates a lot of problems.

In addition, keeping information in sync in trunk development requires frequent, continuous commits, and each commit should be very small. Some features may be only half done when they have to be committed, so they must be isolated with feature switches. For example, an unfinished feature has its switch turned off in advance and is then committed. But a feature switch is essentially a branch.

A feature switch simply creates a branch inside the code: a code path that can only be reached when the switch is on. If there are too many feature switches, the code becomes brittle and hard to maintain.
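
A minimal Python sketch of a feature switch might look like the following. The switch name `new_invoice_export` and the functions are hypothetical, invented for illustration; real projects often load switches from configuration rather than a module-level dict.

```python
# Hypothetical switch table: the unfinished feature is committed with its switch off.
FEATURE_SWITCHES = {"new_invoice_export": False}


def is_enabled(name, switches=None):
    """Look up a switch; unknown switches count as off."""
    table = FEATURE_SWITCHES if switches is None else switches
    return table.get(name, False)


def export_invoice(invoice_id, switches=None):
    if is_enabled("new_invoice_export", switches):
        # Unfinished path: already on the trunk, but unreachable while the switch is off.
        return "new-format:" + invoice_id
    # Existing behavior that users still get.
    return "legacy-format:" + invoice_id


print(export_invoice("INV-1"))                                # legacy-format:INV-1
print(export_invoice("INV-1", {"new_invoice_export": True}))  # new-format:INV-1
```

The `if` statement is the "branch in code" the text refers to: every additional switch adds another pair of paths that has to be kept working.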

When many people work on the trunk at the same time, the potential for code conflicts is high, and feature development carries enough risk that people's work needs to be isolated.

Git-Flow

The basic principle of Git-flow is to provide a branch for whatever you need, with a clearly defined branch for every purpose. There is a develop branch for integration, feature branches for features, and release branches for releases. Each type of branch has a defined purpose.

For example, feature branch is used for work isolation when many features are developed in parallel to avoid conflicts between them. The Release branch is used to isolate releases so that there are no conflicts between releases.

We found that this pattern isolates work well, but it needs frequent integration into develop to synchronize information, as well as cherry-pick or rebase operations between branches.

At this point we find that there are too many branches. A commit passes through several branches on its way from feature development to final release, and the rules for branch flow and merging become very cumbersome.
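
To illustrate how many branches a single commit passes through, here is a small Python sketch of the main route in Git-flow. It is a simplification under stated assumptions: `NEXT_HOP` and `path_to_master` are hypothetical names, and the back-merges of release and hotfix branches into develop are left out.

```python
# Simplified forward route of a change in Git-flow (back-merges into develop omitted).
NEXT_HOP = {
    "feature": "develop",
    "develop": "release",
    "release": "master",
    "hotfix": "master",
}


def path_to_master(start):
    """List the branches a change passes through before it reaches master."""
    path = [start]
    while path[-1] != "master":
        path.append(NEXT_HOP[path[-1]])
    return path


print(path_to_master("feature"))  # ['feature', 'develop', 'release', 'master']
print(path_to_master("hotfix"))   # ['hotfix', 'master']
```

A feature commit crosses three merge points before release, and each one is a place where merge rules have to be followed.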

So Git-flow is not a panacea either, because too many branches make branch management complex. If a feature branch lives for a very long time, merging it also takes a very long time. The develop and master branches exist side by side, which makes the develop branch seem somewhat redundant. The distinction between feature branches and hotfix branches does not seem particularly meaningful either.

So while git-flow adds a lot of branches to keep things as isolated as possible, it’s cumbersome to synchronize information, and it’s extremely difficult to manage those branches.

GitHub-Flow

GitHub introduced a branching pattern called GitHub-flow, which is significantly simpler than Git-flow. There is no develop branch, no hotfix branch, and no release branch. You create a feature branch when you need to develop something, and merge it into master when it is done.

In this process, its isolation occurs only during development, and its information synchronization is achieved by continuously integrating with the Master and frequently pulling code from the Master. Its publishing process is based on the trunk Master branch, so there is no isolation during publishing.

Another problem is that the master branch has to be continuously integrated, because it is both the place to integrate and the place to release from. Once integration breaks, all work is blocked: nothing can be merged and nothing can be released.
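
The blocking effect can be sketched with a toy model in Python. The class `MasterBranch` and its methods are hypothetical, written only to illustrate the point; it assumes a single flag that records whether the latest build on master passed.

```python
class MasterBranch:
    """Toy model: one branch is both the integration point and the release source."""

    def __init__(self):
        self.build_passing = True
        self.merged = []

    def merge(self, branch, build_passes):
        if not self.build_passing:
            raise RuntimeError("master is broken: fix the build before merging " + branch)
        self.merged.append(branch)
        self.build_passing = build_passes  # result of the build after this merge

    def release(self):
        if not self.build_passing:
            raise RuntimeError("master is broken: nothing can be released")
        return list(self.merged)

    def fix_build(self):
        self.build_passing = True


master = MasterBranch()
master.merge("feature/a", build_passes=True)
master.merge("feature/b", build_passes=False)  # this merge breaks the build
try:
    master.release()
except RuntimeError as error:
    print(error)  # master is broken: nothing can be released
```

Until `fix_build` is called, both `merge` and `release` raise, which is the situation the paragraph above describes.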

So GitHub-flow is simple and still provides some isolation, but if your infrastructure or engineering capability is weak, it limits how often you can integrate and release.

GitLab-Flow

The difference between GitLab-flow and GitHub-flow is that GitLab-flow adds a pre-production branch and a production branch to the release process. Branches are assigned to the different environments used during development, integration, and release.

Once integration is complete on the master branch, the next step is to move the change to the pre-production branch, which means the version corresponding to that commit has met the pre-release condition. After it is verified in pre-release, it is synchronized to the production branch, which means it has met the release condition. It is therefore a step-by-step promotion process: from the integration environment to the pre-release environment, and then to the production environment.
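
The step-by-step promotion can be sketched as follows. This is a minimal Python model under stated assumptions, not GitLab's implementation: the branch names follow the text, while `PromotionPipeline`, its methods, and the commit id `c1` are hypothetical.

```python
PROMOTION_ORDER = ["master", "pre-production", "production"]


class PromotionPipeline:
    """Toy model: a commit may only move one step at a time, and only after verification."""

    def __init__(self):
        self.commits = {branch: [] for branch in PROMOTION_ORDER}

    def integrate(self, commit):
        self.commits["master"].append(commit)

    def promote(self, commit, target, verified):
        index = PROMOTION_ORDER.index(target)
        if index == 0:
            raise ValueError("master is filled by integration, not by promotion")
        source = PROMOTION_ORDER[index - 1]
        if commit not in self.commits[source]:
            raise ValueError(commit + " has not reached " + source + " yet")
        if not verified:
            raise ValueError(commit + " is not verified on " + source)
        self.commits[target].append(commit)


pipeline = PromotionPipeline()
pipeline.integrate("c1")
pipeline.promote("c1", "pre-production", verified=True)
pipeline.promote("c1", "production", verified=True)
print(pipeline.commits["production"])  # ['c1']
```

Trying to promote a commit straight from master to production raises an error in this model, which mirrors the rule that each environment is reached only through the previous one.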

We’ve taken a quick look at some of the common branching patterns, and let’s compare them against each other.

The pros and cons of common branching patterns

TBD has few branches, is simple to implement, and is easy to understand. But it demands a high level of maturity and discipline in team collaboration. If someone does not follow the discipline, the trunk becomes a nightmare, and continuous integration and release become hard. Once the trunk breaks, everyone is blocked. Simplicity is the strength of the trunk approach, and this fragility is its weakness.

With Git-flow, features can be developed in parallel, the rules are complete, and each branch has a clear responsibility. It basically works no matter how large the team is. However, there are too many branches, the rules are too complex, branch life cycles are long, and merge conflicts are frequent. The develop and master branches in particular are long-lived.

GitHub-flow can support almost everything Git-flow supports, but it has one problem: integration happens only on the master branch, so it demands strict integration discipline. Moreover, integration and release share the same branch, so if that branch breaks, both integration and release are interrupted.

GitLab-flow also supports parallel development, but its development branches still suffer from long life cycles and the risk of merge conflicts. In addition, the release branches are coupled: production and pre-production are linked through promotion, so they can also interrupt and block each other. Having many development branches plus the production and pre-production branches also increases the complexity of branch management.

Therefore, we find that no branching pattern is absolutely good and no branching pattern is absolutely bad.

There is a simple principle for branching: control the number of branches, and integrate frequently in small batches. Branches isolate work, but too many of them add management cost, which is why their number has to be controlled. Frequent integration in small batches speeds up information synchronization.

So a simple rule of thumb is to keep the number of branches as low as possible and to integrate frequently in small batches, balancing two goals: maximizing productivity and minimizing risk.

Maximize productivity: Everyone works in a common area. There are no branches other than one long, uninterrupted development trunk. There are no other rules, and the code submission process is fairly simple. However, any submission may break the integration of the whole project and thereby interrupt the project schedule.

Minimize risk: Everyone works on their own branch. Everyone’s work is independent of each other, and no one can interrupt anyone else’s work, thus reducing the risk of development interruptions. However, this adds an additional process burden, and collaboration becomes so difficult that everyone has to carefully merge their code, even in a very small part of the overall system.

So how do you design or choose a branching pattern that works for you? In the next article, we will continue to share how different teams choose the right development model for them.


This article is the original content of Aliyun and shall not be reproduced without permission.