“Brother Feng, what is there to say about Git? It’s just git add and git commit,” Xiao Ming said dismissively when he heard that I was going to write a Git tutorial. “…”
Xiao Ming is one of my students. He currently works as an Android development engineer.
A few days later, I saw Xiao Ming again.
“Brother Feng, today I created a new repository on GitHub, but after committing locally, Git rejected me when I tried to connect my commit to the remote. What happened?”
The following is Xiao Ming’s operation record:
git init
git add .
git commit -m "Init commit"
git remote add origin git@github.com:xiaoming/xxx.git
git pull origin master
The above action triggers the following error:
From git@github.com:xiaoming/xxx.git
* branch master -> FETCH_HEAD
* [new branch] master -> origin/master
fatal: refusing to merge unrelated histories
“Xiao Ming, pay attention to the last hint. There are two solutions to this problem.”
- Solution 1: The git pull command actually triggers two operations: a fetch (git fetch) and a merge (git merge). Until the first pull or push completes, the local and remote repositories share no common history, and by default Git refuses to merge unrelated histories to avoid accidental merges. You can, however, add the --allow-unrelated-histories option manually to force the merge. That’s solution one.
git pull origin master --allow-unrelated-histories
- Solution 2: Judging from what you did above, you just initialized a repository locally and made the first commit, and now you want to associate it with the remote repository and push the commit. In this case, you probably don’t need the remote’s default data (usually just a nearly empty README file). So you can add the -f option to force the push and overwrite the remote repository. Be aware that this discards whatever commits the remote branch already contains.
git push -f origin master
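To see solution one without touching a real GitHub repository, here is a minimal sketch. It assumes a bare repository in a temporary directory can stand in for the remote; the names remote.git, seed, and local are made up for the illustration. It runs the two halves of git pull separately, so the refusal and the fix are both visible.

```shell
#!/bin/sh
# Sketch only: a local bare repository plays the role of the GitHub remote.
# remote.git, seed and local are hypothetical names for this illustration.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
cd "$sandbox"

# Remote side: a repository that already holds a README commit.
git init -q --bare remote.git
git clone -q remote.git seed 2>/dev/null
(cd seed && echo "# xxx" > README.md && git add README.md \
  && git commit -q -m "Initial README" && git push -q origin HEAD)

# Local side: an independent first commit, as in Xiao Ming's record.
git init -q local
cd local
echo "code" > app.txt
git add app.txt
git commit -q -m "Init commit"
git remote add origin "$sandbox/remote.git"
branch=$(git symbolic-ref --short HEAD)

# git pull = git fetch + git merge; run the two halves separately.
git fetch -q origin
if git merge --no-edit "origin/$branch" 2>/dev/null; then
  refused=no
else
  refused=yes   # Git reports: refusing to merge unrelated histories
fi
echo "plain merge refused: $refused"

# Solution 1: explicitly allow the two histories to be joined.
git merge -q --no-edit --allow-unrelated-histories "origin/$branch"
ls   # both app.txt and README.md are now present
```

Solution two would instead replace the remote’s README commit with the local history, which is why it only makes sense when the remote content is disposable.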
Xiao Ming nodded thoughtfully. This was the first time he had run into a Git problem. I figured things would go more smoothly for him from then on.
Unexpectedly, a few days later, I received a message from Xiao Ming. This time, he sent a complaint about Git.
“Brother Feng, Git is so annoying. I made a mistake in a commit message and it can’t be changed. You know the Sogou input method is sometimes not smart enough, and when I type too fast I end up with typos… 😓”
“🙂 Hey, kid, don’t jump to conclusions. Git does let you modify commit records. The most comfortable thing about using Git is that it almost always gives you a chance to back out. Few other version control tools can match that!”
“Oh, I see! Tell me how to do it.” Xiao Ming was already getting impatient.
“The git commit command has an option called --amend that solves this problem. So, if it’s the most recent commit, you only need to run the following command.”
git commit --amend -m "This is the new commit log."
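A small sandbox sketch of what --amend does, assuming a throwaway repository in a temporary directory (the file name and messages are made up). It shows that the commit is replaced rather than edited in place: the history still has one commit, but its hash changes.

```shell
#!/bin/sh
# Sketch only: throwaway repository; file name and messages are hypothetical.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo

echo "hello" > file.txt
git add file.txt
git commit -q -m "Init comit"            # typo in the message
before=$(git rev-parse HEAD)

git commit -q --amend -m "Init commit"   # replace the most recent commit
after=$(git rev-parse HEAD)

git log --format=%s                      # only the corrected message remains
echo "hash changed: $before -> $after"
```

Because the hash changes, amending a commit that has already been pushed means the next push has to be forced, so it is best kept for commits that are still local.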
After reading my message, Xiao Ming sent me a smiley. His complaint reminded me of a crude but funny rural saying: “when you cannot relieve yourself, you blame the latrine”, haha.
I thought everything would be fine from then on. Unexpectedly, about a month later, I suddenly received an urgent phone call from Xiao Ming. On the other end of the line, he sounded very agitated.
“Brother Feng, I accidentally ran a reset and lost all the code I wrote. Thousands of lines of code. The version will be released tomorrow night. Is there any way to get it back?”
Hearing this, I reckoned there was about a 50% chance that the code could not be recovered. The kid was careless and had probably never committed it to the repository at all. But if he had committed it, it could still be saved. So I comforted him and said, “Xiao Ming, don’t worry! Open TeamViewer and I’ll take a look remotely.”
After connecting to his machine, I used the history command and saw that Xiao Ming had run git reset --hard XXX to reset a commit. --hard is the only mode of git reset that is unsafe and really destroys data: it overwrites the working directory, and the commits it moves past no longer show up in git log. Git, however, is smart enough to keep another log called the reflog, which records every change to HEAD. Committed data can therefore be restored with the help of the following command:
git reflog
c8278f9 (HEAD -> master) HEAD@{0}: reset: moving to c8278f9914a91e3aca6ab0993b48073ba1e41b2b
3e59423 HEAD@{1}: commit: a
c8278f9 (HEAD -> master) HEAD@{2}: commit (amend): v2 update
2dc167b HEAD@{3}: commit: v2
2e342e9 HEAD@{4}: commit (initial): Init commit
The log shows that HEAD pointed to commit 3e59423 just before the git reset. So we can go back to that commit with another git reset:
git reset --hard 3e59423
Once you’ve done that, you’ll be pleasantly surprised to find that your lost data magically comes back.
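The rescue can be replayed safely in a sandbox. The sketch below assumes a throwaway repository with three commits (v1, v2, v3, names made up) and uses HEAD@{1}, the reflog entry for where HEAD was just before the reset, instead of a copied hash.

```shell
#!/bin/sh
# Sketch only: throwaway repository with three hypothetical commits.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo

for v in v1 v2 v3; do
  echo "$v" > file.txt
  git add file.txt
  git commit -q -m "$v"
done
lost=$(git rev-parse HEAD)        # the v3 commit we are about to "lose"

git reset -q --hard HEAD~2        # the accident: jump back to v1
cat file.txt                      # v1

git reflog -n 3                   # the entry before the reset still lists v3
git reset -q --hard 'HEAD@{1}'    # HEAD@{1}: where HEAD was before the reset
cat file.txt                      # v3
```

This only brings back work that was committed at some point. Edits that were never committed or staged are not in the reflog.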
“Thank you, Brother Feng!! 🌺 🌺 🌺”
“Don’t do that next time. Besides, how can you have so much uncommitted code at once? Be sure to commit frequently.” Xiao Ming’s problem had everything to do with his sloppy everyday habits, so I gave him one last warning.
“OK, I see. By the way, there is one thing I’m still confused about: the difference between git checkout and git reset. When I used SVN, checkout was the command for checking out code. In Git you can use it to switch branches or check out a specific version, but git reset seems to do the same thing. Are they exactly the same?” Xiao Ming replied on QQ.
“This is a deep question and it will take a little time to explain. Listen carefully.”
Understand the Git workspace
Before we get to that, let’s take a quick look at some Git basics. A file tracked by Git is in one of three states:
- Committed: The data has been safely stored in the local database
- Modified: The file has been changed, but the change has not yet been saved to the database
- Staged: The modified file has been marked to go into the next commit snapshot
These three states correspond to three Git work areas: the Git repository, the staging area, and the workspace (working directory).
The Git repository is where Git stores the metadata and object database for your project. This is the data that the git clone command copies.
The working directory is a single version of the project checked out from the repository and placed on disk for you to use and modify.
The staging area corresponds to a file named index inside the Git directory, which records what will go into the next commit. For this reason, the staging area is sometimes called the index.
A basic Git workflow is as follows: 1) modify files in the workspace; 2) add the files to the staging area, that is, to the index file, with git add; 3) use git commit to permanently save the files recorded in the staging area to the Git repository as a snapshot.
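The three states can be watched with git status. This is a minimal sketch in a throwaway repository (file name made up); it uses --porcelain, the script-friendly form of the short status, where the first column describes the staging area and the second column the workspace.

```shell
#!/bin/sh
# Sketch only: throwaway repository; file.txt is a hypothetical file.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo

echo "v1" > file.txt
git add file.txt
git commit -q -m "v1"
committed=$(git status --porcelain)   # empty: everything is committed

echo "v2" > file.txt                  # 1) modify the file in the workspace
modified=$(git status --porcelain)    # " M file.txt": modified, not staged

git add file.txt                      # 2) stage it (recorded in the index)
staged=$(git status --porcelain)      # "M  file.txt": staged for next commit

git commit -q -m "v2"                 # 3) save the staged snapshot
printf '[%s]\n[%s]\n[%s]\n' "$committed" "$modified" "$staged"
```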
Understand the HEAD
To explain this, you also need a basic understanding of what HEAD is. Simply put, HEAD is a pointer to the current branch reference, which in turn always points to the last commit on that branch. To make HEAD easier to grasp, you can think of it as the snapshot of the last commit.
If you are interested, you can use a low-level command to view the snapshot that HEAD currently refers to:
git ls-tree -r HEAD
100644 blob aca4b576b7d4534266cb818ab1191d91887508b9 demo/src/main/java/com/youngfeng/snake/demo/Constant.java
100644 blob b8691ec87867b180e6ffc8dd5a7e85747698630d demo/src/main/java/com/youngfeng/snake/demo/SnakeApplication.java
100644 blob 9a70557b761171ca196196a7c94a26ebbec89bb1 demo/src/main/java/com/youngfeng/snake/demo/activities/FirstActivity.java
100644 blob fab8d2f5cb65129df09185c5bd210d20484154ce demo/src/main/java/com/youngfeng/snake/demo/activities/SecondActivity.java
100644 blob a7509233ecd8fe6c646f8585f756c74842ef0216 demo/src/main/java/com/youngfeng/snake/demo/activities/SplashActivity.java
Here’s a quick explanation of what each field means: 100644 is the file mode, which corresponds to a normal (non-executable) file. blob is the type of Git’s internal storage object; another type is tree, which corresponds to a directory. The long string in the middle is the SHA-1 hash of the file’s contents.
So, simply put, HEAD corresponds to a tree that stores the snapshot of every file on the current branch as Git objects.
Let’s use a table to briefly summarize the above points:
HEAD | Index (staging area) | Workspace |
---|---|---|
The last committed snapshot and the parent of the next commit | The snapshot expected for the next commit | The sandbox directory that you are currently working in |
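The chain from HEAD to a snapshot can be followed step by step with low-level commands. The sketch below assumes a throwaway repository that uses Git’s default files-based ref storage, so that .git/HEAD is a readable file; the file name is made up.

```shell
#!/bin/sh
# Sketch only: throwaway repository; assumes the default files-based ref storage.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo
echo "v1" > file.txt
git add file.txt
git commit -q -m "v1"

cat .git/HEAD                    # e.g. "ref: refs/heads/master"
ref=$(git symbolic-ref HEAD)     # the branch reference HEAD points to
git rev-parse "$ref"             # the commit that branch points to
git rev-parse HEAD               # the same commit, reached through HEAD
git cat-file -t HEAD             # commit
git cat-file -t 'HEAD^{tree}'    # tree
git ls-tree -r HEAD              # snapshot stored in HEAD
git ls-files -s                  # snapshot stored in the index
```

Right after a commit, the blob hash printed by the last two commands is the same, because HEAD and the index hold the same snapshot.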
With the basics in place, let’s get to the real question: how git reset and git checkout differ. We start with git reset.
Let’s use a simple example to see what happens with git reset. Create a Git repository and make three commits:
git init repo
cd repo
touch file.txt
git add file.txt
git commit -m "v1"
echo v2 > file.txt
git add file.txt
git commit -m "v2"
echo v3 > file.txt
git add file.txt
git commit -m "v3"
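With that done, the repository now contains three commits, v1, v2, and v3, with HEAD on v3, and looks like this:
Next, run git reset 14ad152 to see what happens (14ad152 is the abbreviated hash of the v2 commit in my repository; yours will differ). The following is the result after the command is executed: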
git log --abbrev-commit --pretty=oneline
### This is output ###
14ad152 (HEAD -> master) v2
bcc49f4 v1
git status -s
### This is output ###
M file.txt
cat file.txt
### This is output ###
v3
You can see that the repository has been rolled back to v2, while the file in the workspace still has the v3 content. To confirm what has changed in the staging area, let’s use another low-level command to compare the staging data with the repository data:
# Inspect the staging area (index)
git ls-files -s
### This is output ###
100644 8c1384d825dbbe41309b7dc18ee7991a9085c46e 0 file.txt
# Inspect the repository snapshot at HEAD
git ls-tree -r HEAD
### This is output ###
100644 blob 8c1384d825dbbe41309b7dc18ee7991a9085c46e file.txt
As you can see, the repository and the staging area now hold exactly the same information, and HEAD points to the v2 commit. Shown as a diagram, the whole process looks like this:
Take a look at the image above to understand what just happened: first, the branch that HEAD points to moves back to v2, undoing the last commit. Now both the repository and the staging area hold the second commit, while the workspace keeps the most recent change. If you think about it a little, you’ll see that this git reset command is simply the reverse of the most recent git add and git commit: it brings the data back to exactly where it was before the last commit was staged. So, if you want to undo the most recent commit, this is how to do it.
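The same experiment can be checked without reading hashes by eye. This sketch assumes a throwaway three-commit repository (v1 has content here, unlike the empty file above) and uses HEAD~1 in place of a copied hash. It compares the blob recorded in HEAD with the blob recorded in the index.

```shell
#!/bin/sh
# Sketch only: throwaway repository with three hypothetical commits.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo
for v in v1 v2 v3; do
  echo "$v" > file.txt
  git add file.txt
  git commit -q -m "$v"
done

git reset -q HEAD~1                        # default mode is --mixed
git log --format=%s                        # v2, v1
git status --porcelain                     # " M file.txt": change is unstaged
head_blob=$(git rev-parse HEAD:file.txt)   # blob recorded in the v2 commit
index_blob=$(git rev-parse :file.txt)      # blob recorded in the index
cat file.txt                               # v3: workspace untouched
```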
Testing the --soft option
Run git help reset and you will see the following usage:
git reset [-q] [<tree-ish>] [--] <paths>...
git reset (--patch | -p) [<tree-ish>] [--] [<paths>...]
git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] [<commit>]
As you can see, the reset command also accepts five different mode options: --soft, --mixed, --hard, --merge, and --keep. Here we will focus on the first three. We have in fact already tried --mixed: it has the same effect as git reset with no option. In other words, --mixed is the default behavior of git reset. Next, on a fresh copy of the three-commit repository, run git reset --soft 14ad152 to see what happens. Once the command is executed, use the same basic commands as before to inspect the result:
git log --abbrev-commit --pretty=oneline
### This is output ###
14ad152 (HEAD -> master) v2
bcc49f4 v1
git status -s
### This is output ###
M file.txt
cat file.txt
### This is output ###
v3
Strange? Why is the result exactly the same as the previous run without any option? Is there a design error in Git? You may well have that doubt when you see this result, but that is not the case! It only looks the same because I pasted the output as plain text, which loses the font color and the column alignment. In a terminal, the M after --soft is green and sits in the first status column (the change is staged, so a plain git commit would record it), whereas the M after the default reset is red and sits in the second column (the change is not staged). To confirm this, compare the repository snapshot with the staging area:
git ls-tree -r HEAD
### This is output ###
100644 blob 8c1384d825dbbe41309b7dc18ee7991a9085c46e file.txt
git ls-files -s
### This is output ###
100644 29ef827e8a45b1039d908884aae4490157bcb2b4 0 file.txt
As you can see, the two commands output different SHA-1 values, which confirms our conjecture: the staging area still holds the v3 version of the file.
Here we can draw a conclusion: --soft differs from the default behavior (--mixed) in that it only moves HEAD and leaves the staging area untouched, so the index still contains the v3 snapshot, ready to be committed again. You can use this command to merge commits. That is, if you committed before the work was really finished and have changed your mind, you can use this command to undo the commit and commit again once the work is done.
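A sandbox sketch of the --soft case, again assuming a throwaway three-commit repository with made-up names. It checks that HEAD and the index really disagree after the reset, then redoes the commit straight from the index.

```shell
#!/bin/sh
# Sketch only: throwaway repository with three hypothetical commits.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo
for v in v1 v2 v3; do
  echo "$v" > file.txt
  git add file.txt
  git commit -q -m "$v"
done

git reset -q --soft HEAD~1                 # move the branch only
soft_status=$(git status --porcelain)      # "M  file.txt": still staged
head_blob=$(git rev-parse HEAD:file.txt)   # v2 content
index_blob=$(git rev-parse :file.txt)      # v3 content, untouched by --soft
echo "$soft_status"

git commit -q -m "v3 (redone)"             # recommit without another git add
git log --format=%s                        # v3 (redone), v2, v1
```

Resetting further back, for example with HEAD~2, and committing once is a common way to squash several commits into one.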
Testing the --hard option
Next, we test the last option, which is also the one that got Xiao Ming into trouble. On a fresh copy of the three-commit repository, run git reset --hard 14ad152 to see what happens:
git log --abbrev-commit --pretty=oneline
### This is output ###
14ad152 (HEAD -> master) v2
bcc49f4 v1
git status -s
### This is output ###
>>> No output <<<
cat file.txt
v2
The empty output of git status -s shows that the workspace, staging area, and repository are now identical. Check the file and you will find its content is back to v2. Normally, when you see this, you will be alarmed to find that your last commit seems to have vanished. Indeed, this is one of the few Git commands that really destroys data: any uncommitted changes in the workspace are gone for good. Try not to use this command unless you know exactly what you are doing!
Once again, we use a diagram to describe completely what this command does:
As you can see, compared with the default behavior, --hard also restores the workspace to v2, so the v3 commit disappears from the branch and, as in Xiao Ming’s case, can only be reached again through the reflog.
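To make the danger concrete, the sketch below assumes a throwaway three-commit repository plus one uncommitted edit (all names made up). After the reset, the uncommitted edit is gone, while the v3 commit is still reachable through ORIG_HEAD, which git reset records before it moves the branch.

```shell
#!/bin/sh
# Sketch only: throwaway repository with three hypothetical commits.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo
for v in v1 v2 v3; do
  echo "$v" > file.txt
  git add file.txt
  git commit -q -m "$v"
done
v3=$(git rev-parse HEAD)

echo "work in progress" > file.txt   # an edit that was never committed
git reset -q --hard HEAD~1
cat file.txt                         # v2: the uncommitted edit is gone
git status --porcelain               # no output: all three areas agree
git rev-parse ORIG_HEAD              # the v3 commit is still reachable
```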
git checkout
Next, on a fresh copy of the three-commit repository, run git checkout 14ad152 and inspect the result with the same commands:
git log --abbrev-commit --pretty=oneline
### This is output ###
14ad152 (HEAD) v2
bcc49f4 v1
git status -s
### This is output ###
>>> No output <<<
cat file.txt
v2
At first glance, the result of git checkout looks exactly the same as that of git reset --hard. Does that mean there is no difference at all? Of course not. Strictly speaking, there are two “essential” differences:
- Relatively speaking, git checkout is safe for the working directory: it will not overwrite files that have been modified in the workspace, whereas git reset --hard restores everything without checking anything.
- Another important difference is that git checkout does not move the branch that HEAD points to; it modifies the HEAD reference itself.
The second difference is a little harder to understand, so let’s use a diagram to illustrate it more visually:
In simple terms, git reset moves the branch that HEAD points to, while git checkout moves HEAD itself.
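Both differences can be observed in a sandbox. The sketch below assumes a throwaway three-commit repository with made-up names. It first shows git checkout refusing to run over a modified file, then compares where the branch ends up after git checkout and after git reset --hard.

```shell
#!/bin/sh
# Sketch only: throwaway repository with three hypothetical commits.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo
for v in v1 v2 v3; do
  echo "$v" > file.txt
  git add file.txt
  git commit -q -m "$v"
done
branch=$(git symbolic-ref --short HEAD)
v3=$(git rev-parse HEAD)

# Difference 1: checkout refuses to overwrite a locally modified file.
echo "wip" > file.txt
if git checkout -q HEAD~1 2>/dev/null; then blocked=no; else blocked=yes; fi
git checkout -q -- file.txt            # throw the experiment away

# Difference 2a: checkout moves HEAD itself; the branch stays on v3.
git checkout -q HEAD~1
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi
branch_after_checkout=$(git rev-parse "$branch")

# Difference 2b: reset keeps HEAD attached and moves the branch to v2.
git checkout -q "$branch"
git reset -q --hard HEAD~1
branch_after_reset=$(git rev-parse "$branch")

echo "checkout blocked by local edit: $blocked, HEAD detached: $detached"
```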
Running the commands on specific files
Both git reset and git checkout can also be given a file path, in which case they act on a single file or a set of files. In this case, the two commands behave differently. First, on the three-commit repository, try git reset 14ad152 -- file.txt, which resets that file to its v2 version. Once the command is executed, use some basic commands as usual to see what happens:
git log --abbrev-commit --pretty=oneline
### This is output ###
4521405 (HEAD -> master) v3
14ad152 v2
bcc49f4 v1
git status -v
### This is output ###
diff --git a/file.txt b/file.txt
index 29ef827..8c1384d 100644
--- a/file.txt
+++ b/file.txt
@@ -1 +1 @@
-v3
+v2
cat file.txt
v3
As you can see, neither the repository nor the workspace has changed. The only thing that changes is the staging area, where the staged change would, if committed, take the file from v3 back to v2!
You can think of it as Git setting the staging-area version of the file to v2 while the workspace keeps v3. Unlike --hard, this command does not overwrite files that have been modified in the workspace.
git checkout with a path works a little differently from git reset with a path. Besides updating the staging area, it overwrites the file in your workspace, including any changes you have made to it, which can cause data loss. It is therefore an unsafe operation.
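The two path forms can be compared side by side. This is a sketch under the same assumptions as before: a throwaway three-commit repository, with the v2 hash looked up through HEAD~1 rather than copied by hand.

```shell
#!/bin/sh
# Sketch only: throwaway repository with three hypothetical commits.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo
for v in v1 v2 v3; do
  echo "$v" > file.txt
  git add file.txt
  git commit -q -m "$v"
done
v2=$(git rev-parse HEAD~1)

# reset with a path: only the index entry becomes v2.
git reset -q "$v2" -- file.txt
after_reset_index=$(git rev-parse :file.txt)
after_reset_work=$(cat file.txt)         # still v3

# checkout with a path: index and working file both become v2.
git reset -q -- file.txt                 # put the index back to HEAD first
echo "local edit" > file.txt
git checkout -q "$v2" -- file.txt
after_checkout_work=$(cat file.txt)      # v2: "local edit" was overwritten

echo "after reset: $after_reset_work, after checkout: $after_checkout_work"
git log --format=%s                      # HEAD never moved: v3, v2, v1
```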
To wrap up the experiments above, two simple tables summarize which areas each command changes and whether the operation is safe for the workspace:
Execution without a path
Command | HEAD | Staging area | Workspace | Workspace safe? |
---|---|---|---|---|
git reset [--mixed] <commit> | Moves the branch | YES | NO | YES |
git reset --soft <commit> | Moves the branch | NO | NO | YES |
git reset --hard <commit> | Moves the branch | YES | YES | NO |
git checkout <commit> | Moves HEAD itself | YES | YES | YES |
Execution with a path
Command | HEAD | Staging area | Workspace | Workspace safe? |
---|---|---|---|---|
git reset -- <paths> | NO | YES | NO | YES |
git checkout -- <paths> | NO | YES | YES | NO |
“Xiao Ming, do you understand?” After I sent the message, there was no response for a long time. “Oh, that kid! He probably fell asleep… 😆”
Two years passed after that conversation, and Xiao Ming did not ask me any more questions about Git. Then yesterday, I suddenly received a message from him.
“Brother Feng, I am an Android team leader now. The Android team has 6 people and we are building a social application, but I still run into problems managing Git. One of them is that the repository now has several branches. Development mainly happens on the develop branch. The trunk is master and is mainly used for releases. But the other branches are a mess. Is there anything that can be done about it?”
“There is a widely accepted and well-designed branching scheme for Git called the Git Flow model. To learn about it, check out this article at nvie.com/posts/a-suc…”
“Good! Another problem that has bothered me for a long time is that people write vague commit messages, which makes it very hard to track down problems. Most of them also commit many files at once, so it is impossible to pin a problem down to a specific commit. I tell people to keep each commit as small as possible, but when they ask me what the rules for committing are, I don’t know where to start…”
“That’s a good question. A common problem with Chinese programmers is that they try to cram as much code as possible into a single commit. Some even brush off whoever holds them accountable with the excuse that committing several times is too much trouble. In short, the principle can be summed up in one sentence: one idea, one commit. And you’re right, commits must be as small as possible and commit messages as accurate as possible!”
After telling Xiao Ming so much about Git, I could not help asking him half-jokingly, “Xiao Ming, do you still think Git is easy?”
Xiao Ming sent a helpless emoji. “Before, I had no idea that there were so many ways to use Git. I can’t help giving a thumbs-up to the inventor of Git. By the way, Brother Feng, who developed Git?”
“The story of Git has been told all over the Internet, so I’ll keep it brief. Git was born almost by accident. Its original mission was to manage the Linux kernel code. In the early days, the source code of the Linux kernel was managed with the BitKeeper version control tool. Later, however, a conflict of interests ended the Linux community’s free use of BitKeeper. Angered by this, Linus, the founder of Linux, decided to develop a distributed version control system of his own. Within a few weeks, a prototype of Git was born and started to be used in the Linux community. Although Linus is the founder of Git, the most important person behind it is Junio C Hamano, who is Japanese. Linus has 258 commits in the Git open source repository, while Junio C Hamano has more than 4,000. In other words, management of the project was handed over to Junio shortly after Linus developed it. If you’re interested in Junio C Hamano, Google him. He now works at Google and, like Linus, keeps a low profile.”
“This story also taught me one thing: don’t challenge a programmer on technology @_@”
With this story told, the tale of Xiao Ming and Git comes to an end. In fact, there are some other common questions that Xiao Ming did not ask. So I’ve prepared an appendix with some common commands to help you solve everyday problems. It is very useful, so be sure to take notes or save this article for later.
Q&A
Question 1: My company’s Git server runs on an intranet server. I also want to push the code to OsChina so that I can pull it at home and work remotely. What should I do? Git is a distributed version control system, so it is designed for exactly this. You can use the git remote add command to add multiple remote repositories.
git remote add company git@xxx
git remote add home git@xxx
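A sketch of the two-remote setup, assuming two local bare repositories (company.git and home.git, hypothetical names) stand in for the intranet server and the external hosting service. Each remote is pushed to separately.

```shell
#!/bin/sh
# Sketch only: two local bare repositories stand in for the real servers.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare company.git
git init -q --bare home.git

git init -q work
cd work
echo "code" > app.txt
git add app.txt
git commit -q -m "Init commit"
branch=$(git symbolic-ref --short HEAD)

git remote add company "$sandbox/company.git"
git remote add home "$sandbox/home.git"
git remote -v                  # lists fetch and push URLs for both remotes

git push -q company "$branch"  # push the same branch to each remote in turn
git push -q home "$branch"
```

At home, git pull home followed by the branch name would then fetch the same commits from the second remote.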
Question 2: Before pulling remote code, Git asks me to commit my local changes first, but I don’t want to commit them yet. What should I do? For this situation, Git provides a temporary area for storing changes you are not ready to commit. The corresponding command is git stash. In general, you can do this:
# Save the changes you don't want to commit yet to the temporary area. Once saved, the workspace matches the repository exactly
git stash
# Restore the stashed changes to the workspace
git stash apply
# After apply, the stashed data is still kept in the temporary area. To delete it, use the following command
git stash drop
# To restore the data and delete it from the temporary area in one step, use pop
git stash pop
# To restore or delete one specific record, pass its index to pop or drop
git stash drop stash@{index}
# To list all records in the temporary area, run the following command
git stash list
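The round trip can be tried in a sandbox. The sketch assumes a throwaway repository with one commit and one unfinished edit (names made up), and checks the workspace before and after.

```shell
#!/bin/sh
# Sketch only: throwaway repository; file name and contents are hypothetical.
set -e
export GIT_AUTHOR_NAME="Xiao Ming" GIT_AUTHOR_EMAIL="xiaoming@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
cd "$(mktemp -d)"
git init -q repo
cd repo
echo "v1" > file.txt
git add file.txt
git commit -q -m "v1"

echo "unfinished" > file.txt        # work that is not ready to be committed
git stash -q                        # park it in the temporary area
clean=$(git status --porcelain)     # empty: workspace matches the repository
during=$(cat file.txt)              # v1
git stash list                      # one record: stash@{0}

# ...this is where a git pull would run on a clean workspace...

git stash pop -q                    # restore the change and drop the record
cat file.txt                        # unfinished
```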
Question 3: As the project leader, I want to quickly find the “culprit” behind a piece of problem code. Is there any way? The best answer to this question is git blame. Run it on a specific file and it shows, for each line of the file, the most recent commit that changed it, so you can clearly see who last modified every line.
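A minimal git blame sketch, assuming a throwaway repository in which two made-up authors (Feng and Xiao Ming) each touch a different line. The -L option limits the output to a line range, and --porcelain prints the author in a form that is easy to extract.

```shell
#!/bin/sh
# Sketch only: throwaway repository; authors and file content are hypothetical.
set -e
cd "$(mktemp -d)"
git init -q repo
cd repo

printf 'line one\nline two\n' > file.txt
git add file.txt
git -c user.name="Feng" -c user.email="feng@example.com" \
  commit -q -m "Add file"

printf 'line one\nline two, edited\n' > file.txt
git add file.txt
git -c user.name="Xiao Ming" -c user.email="xiaoming@example.com" \
  commit -q -m "Edit line two"

git blame file.txt                  # commit, author and date for every line
culprit=$(git blame --porcelain -L 2,2 file.txt | sed -n 's/^author //p')
echo "line 2 was last changed by: $culprit"
```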
Conclusion
Git is an excellent version control system, and I highly recommend using it in your daily development. This article walked through solutions to several common problems from Xiao Ming’s point of view, and you will no doubt run into others. When that happens, try searching Google for a solution, or leave a comment below and I’ll be happy to answer your Git questions.
I am Ouyang Feng, and I am happy to lend a hand and help you rise to the top. If you enjoyed my article, please leave your mark of love below. If you didn’t, please like it first, then leave your mark of love.
See you next time! Bye bye!