This article covers: an introduction to Git LFS, an analysis of how it works, download and installation, and basic usage
1. Brief introduction
During development, you may need to version large binary files such as images, audio, video, and design assets
However, even a minor change to one of these files adds a complete new copy to the history, so the repository grows rapidly and can eventually become impractical to use
Cloning also takes longer, because the entire history of the repository has to be transferred to the client
Fortunately, Git Large File Storage (Git LFS), a Git extension developed by GitHub, solves this problem nicely
2. Principle analysis
How does Git LFS solve the problem?
Git LFS does not handle large files by magic; it shifts the burden to a remote server, which keeps the local repository relatively lean
First, it stores large files outside the main repository, which keeps only lightweight pointers to them; this greatly reduces the size of the repository
Second, large files are downloaded during checkout rather than during clone or fetch, so you only download the versions you actually need instead of every historical version
Here is a more detailed breakdown (see the Atlassian blog; Atlassian is one of the main contributors to Git LFS):
- When you add a file to the repository (git add), Git LFS replaces its content with a pointer and stores the real content in the local LFS cache, located in the repository's .git/lfs/objects directory
- When you push a new commit to the remote (git push), the Git LFS files referenced by the new commit are transferred directly from the local LFS cache to the remote LFS store bound to the Git repository
- When you check out a commit that contains Git LFS pointers (git checkout), the pointer files are replaced with the files from the local LFS cache, or with files downloaded from the remote LFS store
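To make the pointer idea concrete, the sketch below builds by hand the text that a pointer contains according to the Git LFS pointer specification: a version line, the SHA-256 of the real content, and its size in bytes. The helper name make_pointer is hypothetical, and the sketch assumes the GNU sha256sum tool is available; in real use Git LFS generates pointers for you during git add.

```shell
# make_pointer FILE: print the pointer text that would stand in for FILE.
# Hypothetical helper for illustration only; assumes GNU sha256sum.
# Usage: make_pointer path/to/large-file
make_pointer() {
  file="$1"
  # object id: SHA-256 of the file content
  oid=$(sha256sum "$file" | cut -d' ' -f1)
  # size in bytes (tr strips the padding some wc implementations add)
  size=$(wc -c < "$file" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\n'
  printf 'oid sha256:%s\n' "$oid"
  printf 'size %s\n' "$size"
}
```

Whatever the size of the original file, the pointer stays at three short lines, which is why the main repository remains small.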
3. Download and install
The following steps are for Windows users; other systems may differ slightly, so refer to the installation instructions for your platform
- Download the installation package (v2.13.2 is used here; the latest release is available on the official website)
- Run the installation package to install Git LFS
- Run git lfs install in a terminal to initialize Git LFS
After that, you can use Git LFS to manage large files
4. Basic use
Common commands:
- git lfs track <pattern>: adds the specified file pattern to Git LFS management
- git lfs untrack <pattern>: removes the specified file pattern from Git LFS management
- git lfs track: displays the list of patterns currently managed by Git LFS
- git lfs status: displays the status of the objects managed by Git LFS
- git lfs clone: clones a remote repository
(1) Create a new Git LFS repository
After you initialize a Git repository with git init, you also need to run git lfs install
This installs a pre-push hook in the repository, which transfers the LFS files to the server whenever git push is run
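The two steps above can be sketched as a small shell function. The name init_lfs_repo and its directory argument are hypothetical; the --local option, which writes the LFS settings to the repository's own config rather than the user-wide one, is an optional choice made for this sketch and is not required by the steps above.

```shell
# init_lfs_repo DIR: create a repository in DIR and enable Git LFS for it.
# Hypothetical helper; requires the git-lfs extension to be installed.
init_lfs_repo() {
  dir="$1"
  git init --quiet "$dir" || return 1
  # --local keeps the LFS filter settings in this repository's config
  git -C "$dir" lfs install --local || return 1
  # show the hook that uploads LFS objects during git push
  ls "$dir/.git/hooks/pre-push"
}
```

For example, init_lfs_repo my-project creates the repository and prints the path of the installed hook.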
(2) Clone an existing Git LFS repository
You can clone the repository with the ordinary git clone command; Git checks out the default branch and downloads the LFS files it references
With older versions of Git LFS, those files were downloaded one at a time, so the explicit git lfs clone command was recommended for better performance
This is because git lfs clone downloads the LFS files in batches after checkout, which takes full advantage of parallel downloads
Note that git lfs clone is now deprecated, because git clone has since been updated to offer comparable speed; with a current installation, plain git clone is sufficient
(3) Tracking files
You can use git lfs track <pattern> to add the specified file pattern to Git LFS management
After you run this command, a .gitattributes file appears in the directory; it records which patterns are managed
Git LFS creates and updates the .gitattributes file automatically, but you need to commit the changes to the repository yourself
If you no longer need to manage a file pattern, use git lfs untrack <pattern> to remove it from Git LFS management
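A minimal sketch of the tracking workflow follows. The function name track_pattern and the *.psd pattern in the usage note are illustrative choices; the function is meant to be run inside a repository where Git LFS is already installed.

```shell
# track_pattern PATTERN: track PATTERN with Git LFS and commit .gitattributes.
# Hypothetical helper; run it from inside an existing repository.
track_pattern() {
  pattern="$1"
  # quote the pattern so the shell does not expand it before Git LFS sees it
  git lfs track "$pattern" || return 1
  # git lfs track only edits .gitattributes; committing it is a manual step
  git add .gitattributes || return 1
  git commit --quiet -m "Track $pattern with Git LFS"
}
```

After track_pattern "*.psd", the .gitattributes file contains the line *.psd filter=lfs diff=lfs merge=lfs -text. Running git lfs untrack "*.psd" removes that line again, and that change also has to be committed manually.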
(4) Check the status
There are two ways to view the status of the content managed by Git LFS
git lfs track (with no arguments) displays the list of patterns managed by Git LFS
git lfs status shows the status of the objects managed by Git LFS
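The two checks can be combined into one quick overview. The function name lfs_overview and the section labels it prints are made up for this sketch; the exact output of the two Git LFS commands depends on the installed version, so none is shown here.

```shell
# lfs_overview: print the tracked patterns and the LFS status of the
# current repository. Hypothetical helper; labels are arbitrary.
lfs_overview() {
  echo "== tracked patterns =="
  git lfs track       # no arguments: list the patterns from .gitattributes
  echo "== LFS status =="
  git lfs status      # LFS objects that are staged, unstaged, or waiting to be pushed
}
```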
5. Additional notes
GitHub blocks uploads of files larger than 100 MB. What if you have already committed such files locally?
An effective solution is to use the git filter-repo tool to strip the oversized files from every commit in the history
Start by installing git filter-repo via pip
pip install git-filter-repo
You can then use this tool to remove files larger than 100 MB from all historical commits
git filter-repo --strip-blobs-bigger-than 100M
For more specific uses of git filter-repo, see its documentation
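Because git filter-repo rewrites history, it can help to preview which files exceed the limit before running it. The sketch below is one way to do that with plain Git plumbing commands; the function name list_large_blobs and its byte-threshold argument are illustrative and are not part of git filter-repo.

```shell
# list_large_blobs BYTES: print "size path" for every blob in the history
# whose size is greater than BYTES. Hypothetical helper using plain Git.
list_large_blobs() {
  threshold="$1"
  # every object reachable from any ref, with its path where it has one
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v max="$threshold" '
      $1 == "blob" && $2 + 0 > max + 0 {
        size = $2
        sub(/^[^ ]+ [^ ]+ /, "")   # drop type and size, keep the full path
        print size, $0
      }'
}
```

For example, list_large_blobs 104857600 lists every blob larger than 100 * 1024 * 1024 bytes. An empty result suggests there is nothing of that size in the history to strip.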