Git is a version control system that records changes to the contents of one or more files so that earlier versions can be consulted later. We use Git for code versioning during development, and this article delves into how Git works and what the various commands we use represent. The main content is divided into the following parts:
- Git principle
- The git command
  - Project related
  - Snapshot related
  - Branch related
  - View and compare
  - The patch
  - Other advanced commands
  - Underlying commands
- git hooks
- conclusion
Git principle
To be a decent version control tool, Git does the following
- Single file save
- Multiple file associations
- Multi-version management
- File storage and optimization
To use Git, run git init in your project folder. This generates a .git directory that initializes the repository; see part 4 of this section for what the directory contains.
Below we look at how Git implements the functions above from the perspective of its low-level commands. Many low-level commands are involved, so you may want to read the relevant chapter first. If the hash values you obtain differ from those in the examples, the cause may be a difference in default encoding.
1. Save single files
Use git hash-object to write the content of a file into the Git database; it returns the hash value that serves as the key of the stored object.
echo 'version 1' > test.txt
git hash-object -w test.txt
//83baae61804e65cc73a7201a7252750c76066a30
Use git cat-file to read the object back: -p prints its content and -t prints its type.
git cat-file -p 83baae61804e65cc73a7201a7252750c76066a30
//version 1
git cat-file -t 83baae61804e65cc73a7201a7252750c76066a30
//blob
At this level Git works as a key-value database. A single file is stored as a blob, which is one of Git's object types.
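The key in this key-value store is derived from the content itself. As a rough illustration, and assuming the default SHA-1 object format and a sha1sum utility on the PATH (both are assumptions, not part of the walkthrough above), the blob id can be reproduced by hashing a small header followed by the file content:

```shell
# Sketch: a blob id is the SHA-1 of the header "blob <size>\0" followed by the content.
# 'version 1' plus the trailing newline written by echo is 10 bytes.
manual_hash=$(printf 'blob 10\0version 1\n' | sha1sum | cut -d ' ' -f 1)

# Let Git compute the id for the same bytes (no -w, so nothing is stored).
git_hash=$(echo 'version 1' | git hash-object --stdin)

echo "manual: $manual_hash"
echo "git:    $git_hash"
```

Both lines should print the same hash, which is why identical content is stored only once no matter how many files contain it.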
2. Association of multiple files
A blob object only holds the contents of a file and is missing key information such as the file name, so Git introduces the tree object. A tree object can be thought of as a directory: it contains multiple files (blob objects) and other directories (other tree objects). A tree object is generated from the contents of the staging area at a given moment, so we first use git update-index to put the file into the staging area.
git update-index --add --cacheinfo 100644 83baae61804e65cc73a7201a7252750c76066a30 test.txt
Write the staging area to a tree object using git write-tree
git write-tree
//674d4d31b97233152f3be1825cc9e765fa2b2859
git cat-file -p 674d4
//100644 blob 83baae61804e65cc73a7201a7252750c76066a30 test.txt
git cat-file -t 674d4
//tree
Of course, you can also add multiple files to the staging area, or write another tree object into a tree object. Now let's write multiple blobs into one tree object.
echo 'another file' > another.txt
git hash-object -w another.txt
//17d5d9edf31a80878ad4911017cbd6d1d03322b8
git update-index --add --cacheinfo 100644 17d5d9edf31a80878ad4911017cbd6d1d03322b8 another.txt
// add the blob object created earlier as well
git update-index --add --cacheinfo 100644 83baae61804e65cc73a7201a7252750c76066a30 test.txt
git write-tree
//18ccf7f4dc54058357925cdf2014a9210b0a21a8
3. Multi-version management
Now we can create tree objects, but we are still one step away from the concept of a version: tree objects need to be stored in some structure that makes version management possible. In Git, this data structure is a directed graph in which each node is a commit object. A commit object holds a pointer to its parent commit, a tree object, and the commit information. Use git commit-tree to write a specified tree object into a commit object, -m to add the message, and -p to specify the parent commit object.
// create the first commit object
git commit-tree 674d4d31b97233 -m 'first submit'
//e9cd7b7e8b607ce2881d2512e420c2e301310809
// create the second commit object, with the first one as its parent
git commit-tree 18ccf7f4dc5405 -m 'second submit' -p e9cd7b7e8b607ce2881d2512e420c2e301310809
//f0c44a930d40d5ee4be2698c786f2f5a83ef4b80
// view the commit object
git cat-file -p f0c44a930
Each commit is equivalent to a version. To make versions easier to manage, let's give this one a readable name:
git update-ref refs/heads/test f0c44a930d40d5ee4be2698c786f2f5a83ef4b80
The content of .git/refs/heads/test is now the hash value of the commit object. This is the essence of a Git branch: a pointer to a commit object which, together with all of its parent commits, makes up the contents of the branch. Creating a new pointer to a commit object creates another branch.
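The steps so far can be chained into one script. This is only a sketch: it works in a throwaway directory created with mktemp, and the user name and e-mail are placeholder values that git commit-tree needs in order to record an author.

```shell
# Sketch: build a branch by hand in a throwaway repository using only low-level commands.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"          # placeholder identity for commit-tree
git config user.email "user@example.com"

echo 'version 1' > test.txt
blob=$(git hash-object -w test.txt)                          # 1. store the file as a blob
git update-index --add --cacheinfo 100644 "$blob" test.txt   # 2. stage it
tree=$(git write-tree)                                       # 3. snapshot the index as a tree
commit=$(git commit-tree "$tree" -m 'first commit')          # 4. wrap the tree in a commit
git update-ref refs/heads/test "$commit"                     # 5. name the commit: a branch

# The branch is a reference that resolves to the commit hash.
branch_target=$(git rev-parse refs/heads/test)
echo "commit: $commit"
echo "branch: $branch_target"
```

The two printed hashes are identical, which shows that the branch carries no data of its own beyond the commit it points to.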
We refer to pointers to commit objects as references (refs). Each reference is a file in the .git folder that contains either a hash value or another reference. Besides branch references, there are the following:
- HEAD reference: points to the current branch, by default the master branch. Right after initialization this branch has no commit object; such a branch is called an unborn branch and can be seen with git fsck. HEAD changes as you switch branches.
- Tag reference: Git has four main object types: blob objects, tree objects, and commit objects, plus the tag object, which contains the tagger, a date, a message, and a fixed pointer to a commit object. A tag reference is a fixed reference to a single commit object.
- Remote reference: the remote counterpart of a local branch, updated during push or fetch. It can be accessed in three equivalent ways; for the master branch, for example:
$ git log origin/master
$ git log remotes/origin/master
$ git log refs/remotes/origin/master
4. File storage and optimization
This section discusses how the various objects we operated on above, and indeed the code we manage every day, are stored. Take a look at the contents of the .git directory that was created when you initialized Git:
.git/
├─ hooks/            scripts executed when Git reaches certain phases, such as pre-commit; this directory contains template files for the various hooks
├─ info/
│  └─ exclude        ignore patterns that you do not want to record in .gitignore
├─ objects/          all Git objects
├─ refs/
│  ├─ heads/         local branches
│  ├─ tags/          tags
│  ├─ stash          stash entries
│  └─ remotes/
│     └─ origin/     the remote repository, named origin by default; there can be other remote repositories as well
├─ config            Git configuration of the current project, including the URL of the remote repository and the upstream of each local branch (the default target of git fetch or git push)
├─ description       used by GitWeb
├─ HEAD              stores the current branch
├─ index             stores the contents of the staging area
└─ COMMIT_EDITMSG    used in the Git hooks section
When we work with objects locally, we are manipulating the contents of the .git/objects directory.
$ find .git/objects -type f
.git/objects/17/d5d9edf31a80878ad4911017cbd6d1d03322b8
.git/objects/18/ccf7f4dc54058357925cdf2014a9210b0a21a8
.git/objects/4b/825dc642cb6eb9a060e54bf8d69288fbee4904
.git/objects/59/4dc0e39bc4468ee19c67e65d37b97eb963b68b
.git/objects/67/4d4d31b97233152f3be1825cc9e765fa2b2859
.git/objects/e9/cd7b7e8b607ce2881d2512e420c2e301310809
.git/objects/f0/c44a930d40d5ee4be2698c786f2f5a83ef4b80
All historical versions of the files are stored in the local Git repository, so each clone is equivalent to a full backup of the remote repository. Under Git's storage policy, every version of a file is first saved in full, and under certain conditions the objects are packed into a binary packfile. An accompanying index file makes it possible to locate each object quickly. Packing can be triggered manually with git gc:
$ git gc
Enumerating objects: 6, done.
Counting objects: 100% (6/6), done.
Delta compression using up to 6 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (6/6), done.
Total 6 (delta 0), reused 0 (delta 0)
Computing commit graph generation numbers: 100% (2/2), done.
The result after packing:
$ find .git/objects -type f
.git/objects/4b/825dc642cb6eb9a060e54bf8d69288fbee4904
.git/objects/info/commit-graph
.git/objects/info/packs
.git/objects/pack/pack-1b0fbf5e906f40ad4387962cf3efe5ede74aa2a0.idx
.git/objects/pack/pack-1b0fbf5e906f40ad4387962cf3efe5ede74aa2a0.pack
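To watch the packing happen without listing files by hand, git count-objects -v reports how many objects are loose and how many are packed. The following is a sketch in a throwaway repository (created with mktemp, with a placeholder identity); the counts assume a single commit of a single file.

```shell
# Sketch: count loose and packed objects before and after git gc.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo 'version 1' > test.txt
git add test.txt
git commit -q -m 'first commit'      # creates one blob, one tree, one commit

loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')
git gc -q
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packed_after=$(git count-objects -v | awk '/^in-pack:/ {print $2}')

echo "loose before gc: $loose_before"
echo "loose after gc:  $loose_after, packed: $packed_after"
```

Objects that no reference can reach are treated differently by git gc, which is why the listing above still shows one loose object after packing.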
The git command
Git commands are the API that Git provides for interacting with it. Before explaining specific commands, let's map the concepts of the Git universe onto the concrete files behind them. I call this five spaces, three states, and one main thread (these are unofficial definitions, meant only as a mnemonic for understanding Git systematically, so please keep the distinction in mind where it matters). Almost every operation moves content from one state to another across the spaces, and the main thread ties them all together. This is the essence of Git as I have distilled it, as shown in the figure.
The five spaces are:
- Working directory: also known as the working tree, the project folder that we are currently modifying.
- Index: the staging area between the working directory and the local repository, stored in the .git/index file.
- Local repository: .git/objects/ holds all objects, that is, every historical version, and the references to them live in .git/refs. We'll focus on two types of references in the .git/refs directory:
  - heads: holds the pointers of local branches, such as the master branch. When we run git commit, the branch pointer moves to a commit object newly generated from the index; other operations that affect local branches move it as well.
  - remotes: a local copy of the remote repository's branches, that is, the latest known state of each remote branch. It is synchronized from the remote repository during git fetch, ready for a later merge, and from the local branch during git push. Unlike a local branch it is not editable: you cannot attach HEAD to it and commit on it. It serves as a staging post through which remote branches are synchronized to local branches.
- Remote repository: used for collaboration and online backup. GitHub, for example, hosts our online repository.
- Stash stack: the place where code is stored temporarily; it can hold multiple stash entries. It is located at .git/refs/stash, whose content is the hash value of a commit object.
The three states are:
- Tracked: the file is managed by Git. It is subdivided according to the space it is in:
  - Committed: located in the local and remote repositories; the code has been committed to the repository with git commit
  - Staged: the code in the index, also known as cached
  - Unmodified: in the working directory and identical to the version in the index
  - Modified: exists in both the working directory and the index, but its contents have been modified
  - Stashed: located on the stash stack
- Untracked: in the working directory but not in the index, for example a file that has just been created
- Discarded: code that is not in any of the five spaces; it has been thrown away and cannot be recovered
The main thread is that a project usually has a single main branch, master, and the other branches are based on some commit object of master. To better understand the operational details of the following commands, let's review some of the concepts mentioned earlier: when we run git init, we start on an empty branch, master, that has no commit on it. Only after we make changes and run git commit does it become a real branch.
Two notations are commonly used to reach a parent commit starting from a branch: ~n and ^n. HEAD~n denotes the nth-generation ancestor commit of HEAD, and n defaults to 1. HEAD^n denotes the nth parent commit when a commit has multiple parents, and n defaults to 1.
G   H   I   J
 \ /     \ /
  D   E   F
   \  |  / \
    \ | /   |
     \|/    |
      B     C
       \   /
        \ /
         A

A =      = A^0
B = A^   = A^1     = A~1
C = A^2
D = A^^  = A^1^1   = A~2
E = B^2  = A^^2
F = B^3  = A^^3
G = A^^^ = A^1^1^1 = A~3
H = D^2  = B^^2    = A^^^2  = A~2^2
I = F^   = B^3^    = A^^3^
J = F^2  = B^3^2   = A^^3^2
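The two notations can be tried out on a small merge. The sketch below builds a throwaway repository (temporary directory, placeholder identity, empty commits, and a hypothetical branch named side) and resolves the parents of the merge commit:

```shell
# Sketch: compare ~n and ^n on a merge commit.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local defaults
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m 'root'
git checkout -q -b side
git commit -q --allow-empty -m 'side work'
git checkout -q master
git commit -q --allow-empty -m 'master work'
git merge -q --no-ff -m 'merge side' side   # HEAD now has two parents

first_parent=$(git rev-parse HEAD^1)    # same as HEAD^ and HEAD~1
second_parent=$(git rev-parse HEAD^2)   # the tip of the merged-in branch
grandparent=$(git rev-parse HEAD~2)     # first parent of the first parent

echo "HEAD^1 = $first_parent (master work)"
echo "HEAD^2 = $second_parent (side work)"
echo "HEAD~2 = $grandparent (root)"
```

In short, ~ walks back through generations along first parents, while ^ chooses among the parents of a single commit.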
A version, in version-control terms, is a commit object, and there are several kinds of references to commit objects:
- A branch name always points to the latest commit of the branch, and subsequent operations continue from that commit. Each time a new commit is created, the branch points to the new commit.
- Tags point to fixed commit objects
- HEAD points to the working branch, and you move HEAD by switching branches
Now that we have an intuitive understanding of the concepts in Git, we will describe, as we introduce each of the following commands, how it affects the three main concepts above.
1. Project related
These are project-level commands that are the basis for other commands
init
Initialize a Git repository
config
Git configuration files are generally divided into project level and global level; add the --global parameter for the latter. Use git config -l to list the current configuration.
clone
Clone the remote repository from the given URL. It wraps a series of commands:
- Create a new directory
- Enter the directory
- git init initializes a new repository
- git remote add origin [repo-url] adds the remote repository under the name origin
- git fetch synchronizes the remote repository to the local repository
- git checkout checks out the latest commit of the current branch
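The steps above can be replayed by hand. The following sketch uses a throwaway local repository as the "remote" (the paths come from mktemp, the identity is a placeholder, and the directory name manual-clone is made up for the example); a real git clone also sets up some configuration that is omitted here.

```shell
# Sketch: reproduce git clone step by step against a throwaway local "remote".
src=$(mktemp -d) && git init -q "$src"
git -C "$src" symbolic-ref HEAD refs/heads/master
git -C "$src" config user.name "Example User"
git -C "$src" config user.email "user@example.com"
echo 'hello' > "$src/readme.txt"
git -C "$src" add readme.txt
git -C "$src" commit -q -m 'initial commit'

dest=$(mktemp -d)/manual-clone
mkdir "$dest"                             # 1. create the new directory
cd "$dest"                                # 2. enter it
git init -q                               # 3. initialize an empty repository
git remote add origin "$src"              # 4. register the remote as origin
git fetch -q origin                       # 5. download objects and remote-tracking refs
git checkout -q -b master origin/master   # 6. check out the fetched branch

cloned_content=$(cat readme.txt)
echo "$cloned_content"
```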
2. Snapshot related
These are basic operations that do not involve other branches.
add
Update the index with the contents of the working tree
commit
Create a new commit from the index; HEAD then points to the new commit.
- Use -C <commit> or --reuse-message=<commit> to reuse the message of an existing commit
- Use --amend --no-edit to create a new commit that replaces the last one and reuses its commit message
rm
Delete the specified files from both the working tree and the index, or only from the index. To delete a file only from the working tree, use the ordinary /bin/rm command. --ignore-unmatch means that no error is reported even if no file matches
git rm [-r] [--] <pathspec>... // delete the files from the index and the working tree
git rm --cached [-r] [--] <pathspec>... // delete the files from the index and keep them in the working tree
mv
A shortcut for add plus rm, used to rename or move files
restore
Restores the specified paths from the specified source. If a path is tracked but does not exist in the specified source, it is deleted. For example:
git restore [--] <pathspec> // restore the working tree from the index (the default)
git restore --worktree [--] <pathspec> // the same, with the target stated explicitly
git restore [--source=<tree>] --staged [--] <pathspec> // restore the index from the specified tree (HEAD by default)
git restore --source=<tree> --staged --worktree [--] <pathspec> // restore both the index and the working tree from the specified commit
clean
Remove untracked files from the working tree that are not covered by .gitignore. Add -f to force the deletion or -i to use interactive mode
stash
Clean up the changes in the working tree and the index, and save them on the stash stack
git stash pop [--index] [stash_id] // restore a stash entry to the working tree (with --index, the index is restored as well); use stash_id to specify the entry to restore. Afterwards the corresponding stash entry is cleared
git stash apply [--index] [stash_id] // same as pop, but does not clear the corresponding stash entry
git stash drop [stash_id] // remove one stash entry
git stash clear // clear all stash entries
reset
Change where HEAD points. This involves the working tree, the index, and the changes between the commit HEAD pointed to before and the commit it points to afterwards. There are two kinds of syntax:
The first:
git reset [<tree-ish>] [--] <pathspec>...
The pathspec is mandatory. In this form HEAD is not moved and the working tree is not modified; instead, the index entries of the given paths are reset to the specified commit. With the default, HEAD, the index becomes identical to HEAD for those paths, which means the staged changes go back to being unstaged changes in the working tree
The second:
git reset [--soft | --mixed | --hard | --merge | --keep] [<commit>]
Specify the mode and the commit respectively; HEAD is moved to <commit>:
- --soft: leaves the working tree and the index untouched, so the changes caused by moving HEAD show up as staged changes in the index
- --mixed: the default mode; resets the index, so both the changes that were in the index and the changes caused by moving HEAD are left in the working tree
- --hard: resets the index and the working tree, discarding local changes as well as the changes caused by moving HEAD
- --merge: resets the index and updates the working-tree files that differ between <commit> and HEAD, but keeps changes that have not been added. If a file to be discarded overlaps with a file that has changes not yet added, the reset fails
- --keep: keeps uncommitted changes in the working tree and discards the changes caused by moving HEAD. If a file affected by moving HEAD has local changes, the reset fails.
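The three most common modes can be compared by checking where the undone change ends up. This is a sketch in a throwaway repository (temporary directory, placeholder identity, and a made-up file a.txt):

```shell
# Sketch: observe where the undone change lives after --soft, --mixed and --hard.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'v1' > a.txt && git add a.txt && git commit -q -m 'first'
echo 'v2' > a.txt && git add a.txt && git commit -q -m 'second'

git reset -q --soft HEAD~1
staged_after_soft=$(git diff --cached --name-only)     # the change is staged in the index

git reset -q --mixed HEAD
staged_after_mixed=$(git diff --cached --name-only)    # the index now matches HEAD
unstaged_after_mixed=$(git diff --name-only)           # the change lives only in the working tree

git reset -q --hard HEAD
content_after_hard=$(cat a.txt)                        # the working tree is back at the first commit

echo "soft:  staged=[$staged_after_soft]"
echo "mixed: staged=[$staged_after_mixed] unstaged=[$unstaged_after_mixed]"
echo "hard:  a.txt=$content_after_hard"
```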
3. Branch related
These are operations that involve different branches
branch
Perform operations on branches
git branch -a // display all branches
git branch --set-upstream-to=<remote>/<branch> <branch> // set the upstream of a local branch
switch
Switch branches. If the index or the working tree conflicts with the new branch, either commit the changes involved or add --discard-changes or -f to discard them. Adding -c creates the new branch before switching to it.
checkout
checkout does too many different jobs to go into here; the switch and restore commands are its replacements.
tag
Create, delete, modify, and list tag references
fetch
Download references (branches, tags, and so on) from another repository into the local copies of the remote branches (in the .git/refs/remotes directory), by default from the upstream of the local branch. Adding --prune or -p removes, before downloading, the remote-tracking branches that no longer exist on the remote
push
Upload local commits to a remote repository. Using -f or --force may rewrite the remote history
merge
Merge two or more branches into one (only merges of two branches are discussed here). There are two kinds of syntax. The first controls what happens when a merge cannot be completed in one go: --continue and --abort continue and abort the merge respectively. The second controls how to merge, as described below.
Common merge scenarios include the following:
- If the commits of one branch are a subset of the other's, and no local changes cause conflicts, the merge completes directly without a new commit. This is called a fast-forward merge, and there is no merge process to intervene in.
- Two ordinary branches without conflicts are merged directly, and a new commit records the merged code
- If the two branches conflict, there are two cases:
  - If the conflict is caused by local changes, the merge fails and the local changes need to be dealt with first
  - If the conflict is caused by commits on the two branches, the non-conflicting parts are placed in the index and the conflicting parts are left in the working tree until the conflict is resolved. At this point you can run git merge --abort to cancel the merge. Otherwise, resolve the conflict, run git add, and then either run git commit directly or run git merge --continue, which opens the vi editor so that you can edit the commit message and complete the merge.
You can modify the merge process by adding additional parameters:
- --abort and --continue: cancel an unfinished merge, or complete it once the conflicts are resolved (at that point --continue is equivalent to git commit)
- --ff: the default; fast-forward if possible, otherwise create a merge commit
- --ff-only: fast-forward if possible, otherwise cancel the merge automatically
- --no-ff: always create a merge commit
- --squash: if there is no conflict, the differences between the two branches are added to the index of the current branch and no merge is recorded. Otherwise, the non-conflicting parts are in the index and the conflicting parts wait in the working tree for the conflict to be resolved. Either way there is no merge record.
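The difference between a fast-forward and a --no-ff merge shows up in the number of parents of the resulting HEAD. A sketch in a throwaway repository, with a placeholder identity and a hypothetical branch named feature:

```shell
# Sketch: the same merge done as a fast-forward and with --no-ff.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo 'base' > a.txt && git add a.txt && git commit -q -m 'base'
git checkout -q -b feature
echo 'feature' > b.txt && git add b.txt && git commit -q -m 'feature work'
git checkout -q master

git merge -q --ff-only feature        # master is an ancestor of feature: the pointer just moves
ff_head=$(git rev-parse HEAD)
feature_head=$(git rev-parse feature)

git reset -q --hard HEAD~1            # undo the fast-forward to try again
git merge -q --no-ff -m 'merge feature' feature
no_ff_words=$(git rev-list --parents -n 1 HEAD | wc -w)   # commit hash + its parents

echo "fast-forward: HEAD=$ff_head feature=$feature_head"
echo "no-ff: hashes on the parents line = $no_ff_words (1 commit + 2 parents)"
```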
pull
A shortcut for git fetch followed by git merge FETCH_HEAD; with --rebase, git rebase is run instead of git merge
4. View and compare
These commands only inspect the repository and do not modify it
diff
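Compare the working tree with the index, the index with a commit, two tree objects, two blobs, or even two ordinary files.
git diff [--] [<path>...] // compare the working tree with the index
git diff --cached [<commit>] [--] [<path>...] // compare the index with a commit (HEAD by default)
git diff <commit>...<commit> [--] [<path>...] // compare two commits (the three-dot form diffs from their merge base to the second commit)
status
Show the current state of the repository: the differences between the index and the current commit, the differences between the working tree and the index, and the files in the working tree that are not tracked
log
Show the commit history
git log --oneline // one commit per line
git log --oneline --graph // one commit per line, with the branch structure drawn as a graph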
show
Show different types of objects (Blobs, trees, tags and commits)
reflog
Reference logs record the operations performed on local references. They can be used to look up the history of operations on a branch, for example to recover a branch that was deleted.
bisect
Use binary search to find the commit that introduced an error. For example, if a change introduced a bug, we need to find out which commit made the faulty change. Run git bisect start [bad commit] [good commit], and Git switches to the commit in the middle. If that commit has no bug, mark it with git bisect good; otherwise mark it with git bisect bad. Repeat until Git reports that a commit "is the first bad commit".
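When the check can be scripted, git bisect run performs the marking automatically. The sketch below is an illustration under assumptions: a throwaway repository with six commits, a made-up file app.txt, and a "bug" that is simply the word bug appearing in that file from the fourth commit on.

```shell
# Sketch: let git bisect run find the first commit that contains the word "bug".
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for i in 1 2 3 4 5 6; do
  if [ "$i" -eq 4 ]; then echo 'bug' >> app.txt; else echo "line $i" >> app.txt; fi
  git add app.txt && git commit -q -m "commit $i"
done

good=$(git rev-list --max-parents=0 HEAD)     # the root commit is known to be good
git bisect start HEAD "$good" > /dev/null     # bad commit first, then good commit
# The script must exit 0 for a good commit and 1..127 (except 125) for a bad one.
git bisect run sh -c '! grep -q bug app.txt' > /dev/null
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1

echo "first bad commit: $first_bad"
```

git bisect reset at the end returns HEAD to where it was before the search started.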
blame
Find out who created and modified each line of a file
git blame [--] <file> | grep <search text> // the change record of each matching line of the file
git blame -L <start line>,<end line> [--] <file> // the change records of the given line range of the file
fsck
File system check: verifies the connectivity and validity of the objects in the database. It also shows objects that can no longer be reached from any branch reference, which makes it useful for recovering deleted branches
filter-branch
Used to rewrite the history (the rewritten objects get new hashes), for example to delete large files, infringing files, or passwords
git filter-branch --tree-filter 'rm filename' HEAD // delete a file from every commit
git filter-branch --index-filter 'git rm --cached --ignore-unmatch filename' HEAD // stop tracking a file in every commit
5. The patch
These are operations that reuse part of a set of changes.
rebase
Replay a series of commits on top of another commit, that is, change their base. As with git merge, the syntax falls into two classes. The first class controls what happens when a rebase cannot be completed in one go: besides --abort and --continue there is --skip, which skips the commit that conflicts. The second class controls how to rebase, including
git rebase [ <upstream> [<branch>] ]
By default, git rebase replays the commits that exist locally but not on the remote on top of the latest state of the remote code. If you specify <branch>, Git first switches to that branch and then executes git rebase <upstream>. You can also specify <upstream>, for example another branch, so that the commits in which the two branches differ are replayed on top of the specified branch. Or add the --onto parameter
git rebase --onto <newbase> [ <upstream> [<branch>] ]
With --onto, the commits in the range (upstream, branch], open on the left and closed on the right, are replayed on top of newbase
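A concrete case makes the range easier to see. In this sketch (throwaway repository, placeholder identity, hypothetical branches server and client), client was started from server, and only the commits that client added after server are moved onto master:

```shell
# Sketch: move only the commits in (server, client] onto master.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo m1 > m1.txt && git add . && git commit -q -m 'm1'
git checkout -q -b server
echo s1 > s1.txt && git add . && git commit -q -m 's1'
git checkout -q -b client
echo c1 > c1.txt && git add . && git commit -q -m 'c1'
echo c2 > c2.txt && git add . && git commit -q -m 'c2'
git checkout -q master
echo m2 > m2.txt && git add . && git commit -q -m 'm2'

# newbase=master, upstream=server, branch=client
git rebase -q --onto master server client

client_only=$(git log --format=%s master..client | tr '\n' ' ')
echo "commits on client but not on master: $client_only"
```

After the rebase, client contains m1, m2, c1 and c2, while the server commit s1 is no longer part of its history.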
You can add -i (--interactive) to control the commits being moved during the rebase at a fine-grained level, for example to reorder or remove them. For example, git rebase HEAD~2 -i opens an editor that lists the last two commits, each prefixed with a command. Conflicts can still occur, for example when editing commits or swapping their order. The commands are:
- pick: use the commit as it is; this is the default
- reword: use the commit, but edit its message. After you exit the list, an editor opens for the message
- edit: pause the rebase at the specified commit so that it can be edited. After you exit the list, run git commit --amend to edit the commit you selected, and once you are satisfied with the changes run git rebase --continue to carry on
- squash: use the commit, but meld it into the previous commit. If there is no previous commit, an error is reported
- drop: discard the commit
(In the vi editor: yy copies the whole line, dd cuts the whole line, p pastes, v enters visual mode, y copies the selected block, d cuts the selected block, :wq saves and exits, :q! exits without saving.)
cherry-pick
Apply the changes of existing commits to the current branch as new commits. Before using this command, make sure the working tree has no modifications relative to the HEAD of the branch
git cherry-pick <commit>...
Here <commit>... can be a single commit, multiple commits separated by spaces, or two commits separated by two dots (open at the front and closed at the back). -n updates only the index and the working tree without committing, and -e lets you edit the commit message
revert
Undo the specified commit, that is, generate a new commit with the opposite changes to offset the modifications of the corresponding commit. The syntax is similar to cherry-pick
git revert <commit>...
diff/apply
Generate patch files with diff and apply them with apply, for example
git diff > [diff filename]
git apply [diff filename]
(If applying the patch fails, try changing the file encoding to UTF-8 and the end-of-line sequence to LF.)
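A round trip shows the two commands working together. This sketch runs in a throwaway repository with a placeholder identity; the file notes.txt is made up, and the patch file is kept outside the repository so that it does not show up as an untracked file.

```shell
# Sketch: save an uncommitted change as a patch, undo it, then re-apply it from the patch.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line 1\nline 2\n' > notes.txt
git add notes.txt && git commit -q -m 'add notes'

printf 'line 1\nline 2 edited\n' > notes.txt
patch_file=$(mktemp)                 # kept outside the repository
git diff > "$patch_file"             # working tree vs. index, written as a patch

git checkout -q -- notes.txt         # discard the change
restored=$(cat notes.txt)

git apply --check "$patch_file"      # dry run: exits non-zero if the patch would not apply
git apply "$patch_file"
reapplied=$(cat notes.txt)

echo "after discard: $restored"
echo "after apply:   $reapplied"
```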
format-patch/am
git format-patch generates patches from commits, and git am applies them
6. Other advanced commands
This is a catch-all for the advanced commands that do not fit into the previous categories.
worktree
Lets one repository manage multiple working trees, that is, working trees in several directories that share the .git directory of the main working tree. Each working tree needs to check out a different branch
git worktree add [-f] [--detach] [--checkout] [--lock] [-b <new-branch>] <path> [<commit-ish>] // add a working tree
git worktree remove [-f] <worktree> // remove a working tree
submodule
Add one or more independent sub-repositories to a Git repository, such as common code, for independent updates to projects and common code.
7. Underlying commands
These are Git's low-level commands, listed here for reference only.
hash-object
Computes the hash value of a file and optionally creates a blob object with that hash as its ID
git hash-object -w <file> // write the specified file to the database as a blob object and return its id; -w means write the object to the database
cat-file
Shows information about blob, tree, commit, and tag objects: -t prints the type, -p pretty-prints the contents, -s prints the size, and -e checks whether the specified object exists and is valid
update-index
Adds the specified object to the index, as in
git update-index --add --cacheinfo 100644 594dc0e39bc4468ee19c67e65d37b97eb963b68b test.txt
--cacheinfo means the object is read from the database rather than from a file in the working directory, and 100644 is the file mode, that is, a normal file
write-tree
Write the index out as a tree object
commit-tree
Create a commit object
git commit-tree [(-p <parent>)...] [(-m <message>)...] <tree>
git hooks
Git hooks, divided into client-side hooks and server-side hooks, let Git trigger custom scripts when specific actions occur. Conceptually, hooks resemble the template method pattern in design patterns. The hooks are stored in .git/hooks/. The templates there carry the .sample suffix, and a hook takes effect once you edit the corresponding file and remove the suffix.
Client-side hook
These include commit workflow hooks, e-mail workflow hooks, and others. We will focus only on the commit workflow hooks, which can be used to check code with a lint tool before committing and to check the commit message. For details, refer to "Front-end code specification tool principles and best practices: eslint+prettier+git hooks". To bypass these checks, add the --no-verify or -n option.
- pre-commit: runs before the commit message is entered and is used to inspect the snapshot that is about to be committed
- prepare-commit-msg: runs after the default message is created and before the commit message editor starts. It lets you edit the default message that the committer sees.
- commit-msg: receives one parameter, the path of the temporary file mentioned above that holds the current commit message.
- post-commit: runs after the entire commit process is complete. It takes no parameters, but you can easily get the last commit by running git log -1 HEAD. This hook is usually used for things like notifications.
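As an illustration of the commit workflow hooks, here is a sketch of a commit-msg hook installed into a throwaway repository. The rule it enforces (messages must start with feat:, fix: or docs:) is made up for the example, as are the placeholder identity and file names.

```shell
# Sketch: a commit-msg hook that rejects messages without an expected prefix.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the temporary file that holds the commit message.
if ! grep -qE '^(feat|fix|docs): ' "$1"; then
  echo "commit message must start with feat:, fix: or docs:" >&2
  exit 1          # a non-zero exit status aborts the commit
fi
EOF
chmod +x .git/hooks/commit-msg       # hooks must be executable to take effect

echo 'hello' > a.txt && git add a.txt
rejected=no
git commit -q -m 'bad message' 2> /dev/null || rejected=yes   # blocked by the hook
git commit -q -m 'feat: add a.txt'                            # passes the check
commit_count=$(git rev-list --count HEAD)

echo "first attempt rejected: $rejected, commits created: $commit_count"
```

Running the first commit with --no-verify would skip the hook and let the message through.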
Server-side hook
Server-side hooks run before or after a push reaches the server
- pre-receive: the first script called when handling a push from a client. You can use this hook to prevent non-fast-forward updates to references, or to control access to all the references and files modified by the push.
- update: very similar to the pre-receive script, except that it runs once for each branch that is about to be updated. If a pusher pushes to multiple branches at the same time, pre-receive runs only once, whereas update runs once for each branch pushed.
- post-receive: runs at the end of the whole process and can be used to update other system services or notify users
conclusion
This article reflects my understanding of Git, and some details have inevitably been neglected. If I find any, I will update the article and add some schematic diagrams in time.
reference
- Pro Git
- Git Reference