Building Enterprise-Level DevOps Workflows with Open Source Software (Part II): Version Control

Preface

This article is the second in a series, following “Building an Enterprise-level DevOps Workflow with Open Source software, Part 1: Overview,” which introduced the basic concepts of DevOps and some of its components. In this article, we introduce the version control system (VCS). Besides covering the basic concepts of version control, we also show how to set up a version control system using the open source tool GitLab.

Version control system

A version control system manages code changes during software development. It makes code traceable, reviewable, and manageable, with maintainability as the ultimate goal. The previous article illustrated some of the problems that occur without a version management tool, so we won’t repeat them here. In short, the lack of a version management tool reduces maintainability, which in turn leads to all kinds of unpredictable bugs.

A version control system covers three main areas:

Check-in/check-out control
Branch and merge
History

Check-in/check-out control

Check-in and check-out can be thought of as synchronization between local code and the VCS database (figure below).

Check-out synchronizes a copy of the code in the VCS database to the local machine. If the incoming code conflicts with local changes, the two are compared and handled accordingly (we will discuss merging later).

Check-in is the reverse of check-out: it synchronizes local code to the remote version control system. It is effectively an update to the code database, so the VCS validates the version of the code being submitted and refuses invalid updates, such as a change that is not based on the latest version in the database.

The check-in and check-out mechanism makes each update to the code database atomic, so two people submitting code at the same time cannot silently overwrite each other. Code changes are usually made locally. After each change, the latest code is committed and pushed to the code database so that the remote code stays up to date.
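
The rejection of an outdated check-in can be tried out locally with Git. The following is a minimal sketch, not a definitive setup: it assumes Git is installed, uses a bare repository in a temporary directory as a stand-in for the code database, and the developer names alice and bob and all file names are hypothetical.

```shell
# Sketch: a check-in based on an outdated version is refused until synced.
set -eu
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the central code database.
git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master

# Helper (hypothetical): check out a working copy for one developer.
new_clone() {
  git clone -q central.git "$1" 2>/dev/null
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "$1"
  git -C "$1" config user.email "$1@example.com"
}

new_clone alice
echo "v1" > alice/app.txt
git -C alice add app.txt
git -C alice commit -q -m "Initial version"
git -C alice push -q origin master

new_clone bob   # bob checks out the same version alice just pushed

# alice checks in first, so the database moves ahead of bob's copy.
echo "change by alice" > alice/a.txt
git -C alice add a.txt
git -C alice commit -q -m "Alice: add a.txt"
git -C alice push -q origin master

# bob commits on the outdated base; his check-in is refused.
echo "change by bob" > bob/b.txt
git -C bob add b.txt
git -C bob commit -q -m "Bob: add b.txt"
if git -C bob push -q origin master 2>/dev/null; then
  first_push=accepted
else
  first_push=rejected
fi

# After syncing with the database, the same check-in goes through.
git -C bob pull -q --rebase origin master
if git -C bob push -q origin master 2>/dev/null; then
  second_push=accepted
else
  second_push=rejected
fi
echo "first push: $first_push, second push: $second_push"
```

Here `git pull --rebase` is only one way to sync. A plain `git pull`, which merges instead, would serve the same purpose.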

Branch and merge

A branch lets code that starts from the same “ancestor” evolve in different directions without interfering with each other. Code iterated on different branches may grow into completely different functions and structures, even though it comes from the same “ancestor” node.

A branch starts as a copy of a node and can grow into a new version of its own. Branching in a version control system allows multiple features to be developed in parallel without mutual interference, which is one way to avoid code conflicts.

For example, a development team needs to develop function A and function B, both of which require changes to file M. Developing both at the same time in one place is hard to coordinate, because two developers need to work on file M simultaneously. It’s like asking two monkeys to eat the same banana, so why not provide two bananas and let each monkey eat one?

Therefore, we create two branches, A and B, each developed by a different developer. Neither interferes with the other, so day-to-day work does not cause code conflicts and development proceeds smoothly.

Although this solves the problem of modifying a file at the same time, the two finished functions still need to be combined, which gives rise to the concept of merging.

As shown in the figure above, when function A and function B are finished, they are at version A2 and version B2 respectively, and the two need to be integrated to produce a new version M2. This integration process is called a merge.

Of course, merging can produce conflicts, where both branches changed the same piece of code. A typical version control system (such as Git or SVN) tries to merge code automatically, for example when the additions and deletions clearly do not overlap. When automatic merging is not possible, developers have to merge the code manually, which is called resolving conflicts.
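
The difference between an automatic merge and a conflict can be reproduced in a throwaway repository. This is a sketch under assumptions: Git is installed, and the branch names (feature-a, feature-b, feature-c) and file contents are hypothetical. Two branches that edit different lines of M.txt merge automatically, while a third branch that edits the same line as feature-a has to be resolved by hand.

```shell
# Sketch: automatic merge versus a conflict that must be resolved manually.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'one\ntwo\nthree\nfour\nfive\n' > M.txt
git add M.txt
git commit -q -m "Add file M"

# Function A edits the first line of M.txt.
git checkout -q -b feature-a master
printf 'ONE (A)\ntwo\nthree\nfour\nfive\n' > M.txt
git commit -q -am "Function A: edit first line"

# Function B edits the last line of M.txt.
git checkout -q -b feature-b master
printf 'one\ntwo\nthree\nfour\nFIVE (B)\n' > M.txt
git commit -q -am "Function B: edit last line"

# Function C edits the first line too, differently from function A.
git checkout -q -b feature-c master
printf 'ONE (C)\ntwo\nthree\nfour\nfive\n' > M.txt
git commit -q -am "Function C: edit first line differently"

git checkout -q master
git merge -q feature-a                          # fast-forward
git merge -q -m "Merge feature-b" feature-b     # merged automatically
auto_merged=$(cat M.txt)

# feature-c touches the same line as feature-a, so Git stops and asks.
if git merge -q -m "Merge feature-c" feature-c >/dev/null 2>&1; then
  conflict=no
else
  conflict=yes
fi

# Resolve by hand: write the intended content, then conclude the merge.
printf 'ONE (A+C)\ntwo\nthree\nfour\nFIVE (B)\n' > M.txt
git add M.txt
git commit -q -m "Merge feature-c, resolving the conflict on line 1"
echo "conflict detected: $conflict"
```

In real work the resolution step means editing the conflict markers Git leaves in the file. Overwriting the file with `printf` only keeps the sketch short.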

There are many strategies for branch management. A development team generally chooses one according to the needs of the project and the situation of the team, as discussed later.

History

This one is pretty straightforward. Every change to the code is reflected in the history: what was changed, where, when, who committed it, and why (the commit comment).

If something goes wrong with your code, or you need to look up how a feature evolved, you can go back through the history to find out why a bug happened, understand the design of older code, and so on. This also makes it easier for whoever takes over the code later to get up to speed.
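
With Git, these questions map onto a few read-only commands. The sketch below assumes Git is installed. The authors Alice and Bob and the file app.conf are made up for illustration.

```shell
# Sketch: asking the history who changed what, when, and why.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Alice"
git config user.email "alice@example.com"

echo "timeout=30" > app.conf
git add app.conf
git commit -q -m "Add default timeout"

# A second change, committed by a different (hypothetical) developer.
echo "timeout=60" > app.conf
git -c user.name="Bob" -c user.email="bob@example.com" \
  commit -q -am "Raise timeout for slow networks"

# Who, when, and why for every change to the file, newest first.
history=$(git log --date=short --format='%an | %ad | %s' -- app.conf)
echo "$history"

# What and where: the actual line changes made by each commit.
git --no-pager log -p --format='commit %h: %s' -- app.conf

# Who last touched the line that is in the file now.
last_author=$(git blame --porcelain app.conf | sed -n 's/^author //p')
echo "last change by: $last_author"
```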

Branch Management Strategy

Earlier we focused on branching and merging. Here we look at the branch management strategies built on them. These matter for daily development because the choice of branch strategy has a profound impact on how a project is developed.

A branch management strategy can be thought of as a development pattern: how team members work together through branch and merge operations, how different functions are integrated, how development and test environments are isolated, how changes reach the production environment once they go live, and so on.

Here are some common branch management strategies.

Trunk Based Development

Trunk Based Development, or TBD for short, is a common branch management strategy and one often adopted by large Internet companies like Google and Facebook.

Trunk development requires all code to be committed to the Master branch (the trunk), so developers always see current code. A branch is created from the current trunk node only when a release is needed.

Trunk development requires developers to synchronize the latest code every day before starting work, resolve any conflicts with newly committed code locally, and then commit the result to the trunk. The advantage is that, because each developer’s local code stays close to the trunk, merges involve no major changes, are relatively easy, and don’t take much time.

The disadvantages of this approach are also obvious. If many people work on the project, a steady stream of changes lands on the trunk, so each release contains a large number of commits. Bugs then become difficult to trace and difficult to fix. It easily becomes a case of “one bad apple spoils the whole barrel.”
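
The daily routine and the release cut can be sketched with plain Git commands. This is an illustration under assumptions, not a prescribed workflow: Git is installed, a local bare repository (trunk.git) stands in for the shared server, and the branch name release-1.0 and the file features.txt are hypothetical.

```shell
# Sketch: trunk-based development with a release branch cut from the trunk.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare trunk.git
git -C trunk.git symbolic-ref HEAD refs/heads/master

git clone -q trunk.git dev 2>/dev/null
cd dev
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "feature 1" > features.txt
git add features.txt
git commit -q -m "Add feature 1"
git push -q origin master

# Daily routine: sync with the trunk first, then commit straight to it.
git pull -q --rebase origin master
echo "feature 2" >> features.txt
git commit -q -am "Add feature 2"
git push -q origin master

# Release time: branch off the current trunk node.
git checkout -q -b release-1.0 master
git push -q origin release-1.0

# The trunk keeps moving, and the release branch stays where it was cut.
git checkout -q master
echo "feature 3" >> features.txt
git commit -q -am "Add feature 3"
git push -q origin master

trunk_commits=$(git rev-list --count origin/master)
release_commits=$(git rev-list --count origin/release-1.0)
echo "trunk: $trunk_commits commits, release-1.0: $release_commits commits"
```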

The Git Flow branch management strategy we’ll introduce next addresses this shortcoming.

Git Flow

Git Flow is a feature branch strategy. Feature branching is the opposite of trunk development: each function is developed separately on its own branch and then merged into the trunk, so that functions do not interfere with each other.

Git Flow is a feature branching strategy that Vincent Driessen described in “A Successful Git Branching Model” (nvie.com/posts/a-suc…), as shown in the figure below.

Git Flow requires Master, Develop, Release, and Hotfix branches.

Each time you need to develop a new feature, create a Feature branch from the Develop branch and merge it back into Develop when the feature is done. When the Develop branch reaches a certain point, merge it into the Release branch.

The Release branch is a staging branch that acts as a buffer for UAT testing before code reaches the Master branch in production. When the Release branch is ready, it is merged into the Master trunk branch, which is equivalent to a new release in production.

When a bug in the production version needs to be fixed, we create a Hotfix branch directly from the Master branch, make the fix on the Hotfix branch, and merge it back into the trunk. At the same time, the fixes on the Hotfix branch are merged into Develop so that the fix is preserved there, which keeps Develop and Master basically in sync.

Git Flow is a classic model, but it is not always followed to the letter in practice. For example, we often skip the Release branch and use only Master and Develop, which is more flexible for small to medium-sized projects. The Release branch is sometimes called the Test branch. There are also hybrids, such as combining trunk development with parts of Git Flow: the team does trunk development on the Develop branch and merges it into the Release or Master branch each time a release needs to be deployed to production.
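
One round of this flow can be walked through in a scratch repository. The sketch below assumes Git is installed and uses hypothetical branch and file names (feature-login, hotfix-crash, app.txt). It is one possible reading of the steps described above, not the only way to run Git Flow.

```shell
# Sketch: one feature, one release, and one hotfix in Git Flow.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial release"
git branch develop

# Feature: branch from develop, merge back when the feature is done.
git checkout -q -b feature-login develop
echo "login" > login.txt
git add login.txt
git commit -q -m "Add login feature"
git checkout -q develop
git merge -q --no-ff -m "Merge feature-login into develop" feature-login

# Release: stage what is on develop, then publish it to master.
git checkout -q -b release develop
echo "v2" > app.txt
git commit -q -am "Prepare release v2"
git checkout -q master
git merge -q --no-ff -m "Release v2" release

# Hotfix: branch from master, fix there, merge to master and develop.
git checkout -q -b hotfix-crash master
echo "crash fix" > fix.txt
git add fix.txt
git commit -q -m "Fix crash in production"
git checkout -q master
git merge -q --no-ff -m "Merge hotfix-crash into master" hotfix-crash
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix-crash into develop" hotfix-crash

# Show the resulting branch graph.
git --no-pager log --oneline --graph --all
```

The `--no-ff` option forces a merge commit even when a fast-forward is possible, so each feature, release, and hotfix stays visible as its own node in the graph.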

Git or SVN?

Git and SVN (Subversion) are two popular version control tools. The biggest difference between the two is that SVN is centralized while Git is distributed (as shown below).

SVN requires that there be only one central code repository. Before submitting code, every developer must make sure their local code is fully synchronized with the central repository. Git is much more flexible: it does not require all developers’ local code to be identical, only that the local branch be consistent with the remote branch it is pushed to at push time.

In addition, SVN branch merging is complex and not robust: SVN cannot distinguish between manual merging and automatic merging, so it does not create a merge record node. Git, on the other hand, creates a merge record node after a merge, which adds traceability. Creating branches is also very inexpensive in Git.
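
Both points about Git can be checked in a scratch repository. This sketch assumes Git is installed, and the branch name topic is hypothetical. It shows that creating a branch adds no new commits, and that the merge record node is a commit with two parents.

```shell
# Sketch: a branch is only a new name, and a merge commit has two parents.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base"

# Creating a branch copies nothing: the number of commits is unchanged.
before=$(git rev-list --all --count)
git branch topic
after=$(git rev-list --all --count)
echo "commits before: $before, after creating the branch: $after"

git checkout -q topic
echo "topic" > topic.txt
git add topic.txt
git commit -q -m "Work on topic"
git checkout -q master
git merge -q --no-ff -m "Merge topic" topic

# rev-list prints the commit followed by its parents: three hashes in total.
hashes=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
parent_count=$((hashes - 1))
echo "parents of the merge commit: $parent_count"
```

Note that `--no-ff` is used deliberately here. Without it, Git would fast-forward master in this simple case and no separate merge commit would be recorded.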

Because branch merging is so complex, SVN is generally not suitable for a feature branching strategy and is better suited to the trunk development pattern (Google App Engine is managed with SVN).

Git fits both the trunk development pattern and the feature branching strategy. As a result, Git is generally the more flexible version control tool, and Git is now the more mainstream choice.

GitLab, an open source tool

Introduction to GitLab

GitLab is an open source version control platform that uses Git for code management and builds a web service on top of it. It has a polished web UI that is convenient to use.

GitLab is an open source project written in Ruby under the permissive MIT license, which allows modification and commercial use.

GitLab supports Git code repositories, permission management, merge requests, Issues, Wiki, CI/CD, and many other powerful features.

GitLab is very similar to GitHub in that it supports code repositories, code merging, code review, and other basic functions. The difference is that GitHub is a cloud product on which most projects are open source (private repositories have limitations), while GitLab is an open source product that can be easily deployed on any server.

The main reason we use GitLab for our DevOps workflow is its powerful visual interface and permission management, which make it the equivalent of a beefed-up Git repository. People are visual creatures and handle complex information more efficiently once it is visualized, and GitLab provides that visualization.

Enterprise development usually involves managing multiple projects, with different developers involved in different projects. It is therefore important to manage these permissions effectively, and GitLab supports this kind of permission management out of the box.

Install GitLab

GitLab is very simple to install, and we recommend installing it with the containerization tool Docker. If you’re not familiar with Docker, look it up online (there is plenty of material) or see the upcoming containers article in this series, which focuses on Docker.

GitLab page on Docker Hub: hub.docker.com/r/gitlab/gi…
Official tutorial for installing GitLab with Docker: doc.gitlab.com/omnibus/doc…

Before installing, make sure Docker is installed on your machine or server and that you can perform basic Docker operations, such as docker ps.

# Start GitLab CE in a container. Options used below:
#   --detach    run in the background
#   --hostname  hostname GitLab refers to itself by; set it to the server's domain name
#   --publish   map the HTTPS, HTTP, and SSH ports to the host
#   --name      container name
#   --restart   restart the container automatically if it stops
#   --volume    persist configuration, logs, and data on the host
# The last argument is the image name.
sudo docker run --detach \
  --hostname gitlab.example.com \
  --publish 443:443 --publish 80:80 --publish 22:22 \
  --name gitlab \
  --restart always \
  --volume /srv/gitlab/config:/etc/gitlab \
  --volume /srv/gitlab/logs:/var/log/gitlab \
  --volume /srv/gitlab/data:/var/opt/gitlab \
  gitlab/gitlab-ce:latest

This is the official Docker startup command. Enter it on the command line to start GitLab.

Wait for GitLab to finish starting (the first start can take a few minutes), then open http://localhost in your browser to see the GitLab login page.

Using GitLab

Instead of covering every GitLab feature, we briefly cover the parts that matter for the DevOps workflow, mainly Git version control: cloning repositories, creating branches, and merging branches. For the remaining features, readers can consult the official documentation or install GitLab and try them out.

Clone a repository

On the project page, copy the SSH or HTTP address, such as ssh://localhost/user/project1, and run the following command locally.

git clone ssh://localhost/user/project1

A directory for the project is then created in the current directory and the files are copied from the remote. A remote named origin is also recorded, which you can view as follows.

git remote -v

If you use merge requests for version control, you will also work with Git remotes.

Create a branch

We can create branches in the interface, but we can also create them locally.

Create the develop branch by typing the following in your local project.

git branch develop
git checkout develop

# or

git checkout -b develop

Then change some code and commit it via git commit. Next, push the code to the remote server.

git push origin develop

The remote now has a develop branch, and the commits on the local develop branch have been uploaded to the server.
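
To confirm that the push reached the server, the remote can be queried directly. The sketch below assumes Git is installed and replaces the GitLab server with a local bare repository (server.git), so it runs without any network access. The file README.txt is hypothetical. The `-u` option, which is not part of the command above, additionally records origin/develop as the upstream of the local branch.

```shell
# Sketch: push a new branch and verify that the remote really has it.
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare server.git      # local stand-in for the GitLab project
git -C server.git symbolic-ref HEAD refs/heads/master

git clone -q server.git project1 2>/dev/null
cd project1
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
git push -q origin master

# Create develop locally, commit, and push it with upstream tracking.
git checkout -q -b develop
echo "work in progress" >> README.txt
git commit -q -am "Start work on develop"
git push -q -u origin develop

# Ask the remote which commit its develop branch points to.
remote_head=$(git ls-remote --heads origin develop | cut -f1)
local_head=$(git rev-parse develop)
upstream=$(git rev-parse --abbrev-ref 'develop@{upstream}')
echo "remote: $remote_head"
echo "local:  $local_head (tracking $upstream)"
```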

Merging branches

When we want to bring the updates on develop into the trunk master branch, we need to merge.

Locally, you can do this.

git checkout master
git merge develop
git push origin master

This completes the trunk merge.

However, sometimes, especially on larger projects, we don’t want changes made directly on the master branch. We want the trunk to be updated only by merging code. For that we need to protect the branch. You can protect the trunk branch under Settings -> Repository -> Protected Branch in your project, so that direct pushes to master are not allowed.

With this protection in place, branches have to be merged through a merge request. On the project home page in GitLab, click Merge Requests in the left menu, click New Merge Request, select develop as the Source Branch and master as the Target Branch, and click Compare Branches and Continue to open the merge request submission page. This page shows information about the merge, including which commits it contains, what code was changed, and so on. Click Submit Merge Request to create the merge request. Then, if you have a role with sufficient permissions, you can click Merge on the merge request page to accept the request, and the changes are merged into the master branch.

The review step that GitLab adds to branch merging makes code review very convenient. Typically, the developer with permission to accept a merge request acts as the reviewer: they review the submitted code before the merge and accept or reject the merge request based on the result.

Conclusion

This article introduced the main concepts of a version control system, including check-in and check-out, branching and merging, and history, as well as two branch development strategies (trunk development and Git Flow), and compared Git with SVN. It also introduced GitLab, an open source version control platform, covering what it is, how to install it, and its basic use. This knowledge is an important foundation for the DevOps workflow covered later in the series. The next article covers continuous integration (CI), which depends on the version control system: Jenkins or GitLab CI/CD can automatically build products using the GitLab repository as the source. Without a version control system, the benefits of CI are limited, so the version control system is a very important module in the DevOps workflow.

In future articles, we’ll continue to cover other aspects of DevOps, including continuous integration, containerization, orchestration, networking, and how all of these work together to form enterprise-level DevOps workflows. Stay tuned for more.