Git bisect command explained #1: Basics & a simple case
Introduction
1. Bisect: “To take… “Bisection method” is a binary search method.
The subject of this series, a utility command in Git, takes this word as its name (git bisect).
What bisect means
- There is a saying on the Internet: “An engineer’s life can be summed up in two activities: writing bugs and debugging.”
- It is meant as a joke, but it does convey how much trouble and harm the many forms of software bugs cause in our daily work, whether they are created by others or by ourselves.
- Debugging is indeed a sacred and routine part of our work.
Debugging simply comes down to:
- Identify the bug’s symptoms and how to reproduce it
- Sort out and narrow down the area where the bug may live
- Troubleshoot and verify to find the root cause
- Propose a solution and fix the bug
- Verify the fix, and add automated tests as needed
Sorting out and narrowing down the scope is a particularly important step, because every developer’s time and energy are limited. If you skip it and start debugging by changing things at random to see whether the problem goes away, you will dive into arbitrary code snippets like a headless fly, and will probably burn your precious youth with nothing to show for it.
As an obvious example: “If a bug appears in module A, you will not start by reading the code in module B.”
So, how do you quickly narrow it down to the relevant problem code?
Let us begin with a conviction:
“A given bug must have started in a specific place at a specific time.” (Of course, “place” here means not just code, but also upstream and downstream systems, the environment, and so on.)
We hold on to this belief so that temporary setbacks while debugging do not tempt us into settling for a compromise solution.
A very common class of bugs, known as “regression bugs,” fits this statement of “appearing in a specific place at a specific time” particularly well.
- Two characteristics roughly identify a bug as a regression:
  - It was known to work correctly before
  - At some point it was discovered to be broken
- For such a bug, besides the usual ways of locating where the problem is, we can also consider when it was introduced, which helps narrow down the problematic code quickly.
  - If we can find a specific commit that exhibits the problem we care about while its parent commit is fine,
  - then the changeset of that commit is very likely where the bug was introduced.
The git bisect command introduced in this series is one of the most effective ways to quickly locate the first commit that introduced a bug.
Content plans for this series
The bisect command series is expected to span seven articles. Below is a rough outline (disclaimer: the content may be adjusted as the series is written, so this outline may change):
- #1: Basic introduction & Case 1: linear commits (<== this article)
- #2: Case 2: history containing a merge commit
- #3: Case 3: history containing a rolled-back merge commit
- #4: Extended commands: skip, run
- #5: Algorithm analysis: selecting the median commit
- #6: Algorithm analysis: about skip
Basic introduction to git bisect
- git bisect is a Git command that uses binary search to locate the first commit that introduced a bug.
- More broadly, git bisect can be used to find the first commit that introduced any code change meeting certain requirements.
  - It does not have to be used only in debugging scenarios
  - For example, you can use it to find the first commit that completed development of a particular feature
  - Or, the other way around, you can use it to locate the first commit that fixed a bug
Relevant concepts
Based on the scenario of using Git to locate a commit, we define some basic concepts:
- Code change: a specific change to the code that produces a specific code or non-code characteristic
- old commit: a commit that does not contain this code change
- new commit: a commit that contains this code change
In the context of debugging, these concepts can be understood as follows:
- Code change: the change that directly causes the target bug
- good commit: corresponds to old commit; does not contain the code change that causes the bug
- bad commit: corresponds to new commit; contains the code change that causes the bug
Monotonicity of Commit state
Second, because binary search is the basic idea behind git bisect, we need to make one assumption:
"Along the commit order, the old (good) | new (bad) state of commits must be monotonic."
(Where red represents new (bad), green represents old (good), and gray represents undetermined)
In linear commit history
- When a commit is identified as new (bad), all subsequent commits are new (bad)
- When a commit is identified as old (good), all preceding commits are old (good)
In commit history involving a merge
- When a merge commit is identified as new (bad), all subsequent commits are new (bad), just as in linear history
- When a merge commit is identified as old (good), all of its parent commits are old (good)
- When one parent commit is identified as new (bad), the merge commit and all subsequent commits are new (bad)
- When one parent commit is identified as old (good), only the ancestors of that parent commit are old (good), just as in linear history
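The rules above boil down to one check: no new (bad) commit may have an old (good) child. Here is a minimal Python sketch of that check, assuming a hypothetical toy history in which each commit name maps to its list of parents. The commit names and states are made up for illustration and do not come from a real repository.

```python
# Hypothetical model: each commit maps to the list of its parent commits.
# "m" is a merge commit with two parents; all names are illustrative.
parents = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],
    "f1": ["c1"],
    "m": ["c3", "f1"],
    "c5": ["m"],
}


def find_violations(parents, state):
    """Return (parent, child) pairs where a new (bad) parent has an old (good) child.

    Monotonicity means that once a commit is new (bad), every descendant is
    new (bad) too. Checking direct parent/child edges is enough: a violation
    along a longer path must show up on at least one edge of that path.
    """
    violations = []
    for child, its_parents in parents.items():
        for parent in its_parents:
            if state[parent] == "new" and state[child] == "old":
                violations.append((parent, child))
    return violations


# A monotonic assignment: the change enters at c3 and stays in m and c5.
monotonic = {"c1": "old", "c2": "old", "c3": "new",
             "f1": "old", "m": "new", "c5": "new"}

# A non-monotonic assignment: the change is in c3 but missing again from m.
broken = dict(monotonic, m="old", c5="old")

print(find_violations(parents, monotonic))
print(find_violations(parents, broken))
```

An empty result means the states are consistent with the rules, so binary search over that history can be trusted. Any reported pair marks a spot where the assumption breaks.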
Illegal situation
- Conversely, if the following situation actually occurs, bisect cannot find the commit correctly
  - In this case, you either need to redefine what old (good)/new (bad) means, or strictly control the base range of the binary search
- NOTE: This is actually why bisect provides, in addition to good and bad, a second pair of terms, old and new, and even allows the operator to specify both terms (terms)
A simple case
Next, we’ll share three examples to see how Bisect works:
- Case 1. Linear Commit (without Merge Commit)
- Case 2. History containing a merge commit that complies with the rules above
- Case 3. History containing a rolled-back merge commit (which, intuitively, does not comply with the rules above)
Case 1. Linear Commit
Here’s some basic background for this scenario:
- The description below is best read alongside the illustration
- The REPO has only the master branch, with a total of 18 commits
- The REPO contains only one file, hello.md, which represents the code we care about
- It is known that the fifth commit is good and contains the line IMPORTANT CODE
  - Note: to make this easier to follow, we use the terms good/bad for now
  - Commit #5 hash = fd49…
- It is also known that the last commit (Commit #18) is bad: this critical line of code is missing
  - Commit #18 hash = 73b8…
- The goal of this case is to quickly locate “the first commit in which this line of code was deleted”
A little extra explanation:
- The reason for using a lost line of code as the basis of the bug:
  - git blame is, in itself, a valuable way to track down problematic code
  - In scenarios where the problem can be inferred from key code characteristics, using blame may also be worth considering 😉 (hopefully this sparks more ideas)
  - But in the lost-line-of-code case, in a complex real-world commit history, blame alone is not enough
Before we begin, let’s use the linear commit scenario to understand how git bisect performs its binary search, as shown below:
- After the initial good commit and bad commit are set, the search begins
- At this point, the git bisect algorithm automatically picks a median commit; we only need to check whether that commit is good or bad
  - If it is good, then by the rules above all preceding commits are good as well, so git bisect picks another median commit from the remaining half and the check continues
  - If it is bad, then all subsequent commits are bad as well, and the search continues in the other half
- ...until the bisect program locates the first problematic commit (the first bad commit)
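The search loop described above can be sketched in a few lines of Python. This is only a simplified model of the idea, not git's actual implementation: commits are represented by their sequence numbers, `is_bad` is a hypothetical stand-in for the manual inspection, and the midpoint is a plain integer average, so the commits it inspects may differ slightly from the ones git itself would pick.

```python
def bisect_linear(good, bad, is_bad):
    """Find the first bad commit number in a linear history.

    `good` and `bad` are commit numbers already known to be good / bad,
    with good < bad. `is_bad(n)` is a hypothetical check that stands in
    for manually inspecting commit n.
    Returns (first_bad, tested), where `tested` lists the inspected commits.
    """
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2  # simple midpoint; git's own choice may differ
        tested.append(mid)
        if is_bad(mid):
            bad = mid    # mid is bad, so every later commit is bad too
        else:
            good = mid   # mid is good, so every earlier commit is good too
    return bad, tested


# The scenario of this case: #5 is good, #18 is bad, and (as the
# walkthrough later confirms) the line was deleted in #14.
first_bad, tested = bisect_linear(5, 18, lambda n: n >= 14)
print("first bad commit:", first_bad)
print("commits inspected:", tested)
```

Each round halves the range still under suspicion, which is why a handful of checks is enough even though 13 commits separate the known good and bad ends.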
bisect start
First, we start a bisect session:
# git bisect start [<bad> [<good>...]]
git bisect start 73b8 fd49
The command above is equivalent to:
git bisect start
git checkout 73b8
git bisect bad
git checkout fd49
git bisect good
As you can see, bisect sets the current median commit to Commit #11, chosen by bisection between Commit #5 and Commit #18.
bisect good / bad
Inspecting the code at Commit #11 shows that it is good.
So we tell bisect exactly that: Commit #11 is good (git bisect good).
Next, Commit #14 is missing the line of code, so it is bad (git bisect bad).
We keep going:
inspection shows that Commit #13 is good.
Checking the actual changes in Commit #14 confirms that it was indeed the first commit to introduce the problem.
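To see why three verdicts are enough here, the bookkeeping can be replayed in Python. This is an illustrative sketch, not something git exposes: `narrow` is a hypothetical helper that tracks the newest known good commit and the oldest known bad commit by sequence number, and is fed the verdicts from the walkthrough above (#11 good, #14 bad, #13 good).

```python
def narrow(good, bad, commit, verdict):
    """Update the (good, bad) boundary after inspecting one commit."""
    if not good < commit < bad:
        raise ValueError("commit is outside the range still under suspicion")
    if verdict == "bad":
        return good, commit   # the first bad commit is at or before `commit`
    return commit, bad        # the first bad commit is after `commit`


good, bad = 5, 18  # starting points of this case
for commit, verdict in [(11, "good"), (14, "bad"), (13, "good")]:
    good, bad = narrow(good, bad, commit, verdict)
    print(f"after #{commit} = {verdict}: newest good #{good}, oldest bad #{bad}")

# When the two boundaries are adjacent, the bad one is the first bad commit.
first_bad = bad if bad - good == 1 else None
print("first bad commit:", first_bad)
```

Once the newest good commit and the oldest bad commit are direct neighbours, nothing is left to inspect, which matches the conclusion that Commit #14 introduced the problem.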
bisect log / replay / reset
- Once we have located the problem, we can use git bisect log to print and record the search process
  - With that log, we can also use git bisect replay to reproduce the search process easily
  - This is useful because, if a human error was made along the way, you can export the log, remove the wrong steps, and quickly return to an earlier step
- Finally, git bisect reset ends the search and returns to the commit where it started