This article introduces pnpm, one of the best package managers in the industry. It already has 9.8K stars on GitHub and is now relatively mature and stable. It grew out of npm/Yarn, but it fixes latent bugs inside npm/Yarn, greatly improves performance, and broadens the usage scenarios. Here's a mind map for this article:
1. What is pnpm?
Here's what the official pnpm documentation says:
Fast, disk space efficient package manager
So pnpm is, at its core, a package manager, which is no different from npm/Yarn, but it has two major advantages:
- Packages install very quickly;
- Disk space is used very efficiently.
Installation is also very simple, a single command:
npm i -g pnpm
2. Feature overview
1. Fast installation
How fast is package installation with pnpm? Take installing React as an example:
In the benchmark chart, pnpm (shown in yellow) is two to three times faster than npm/Yarn in most scenarios.
If you are familiar with Yarn, you may object: doesn't Yarn have a Plug'n'Play (PnP) installation mode? It removes node_modules entirely and writes the dependency contents to disk, which saves Node's file I/O overhead and also speeds up installation. (See this article for details.)
Next, taking a sample repository as an example, let's look at the benchmark data, mainly comparing pnpm with Yarn PnP:
Overall, pnpm installs packages noticeably faster than Yarn PnP.
2. Efficient use of disk space
pnpm internally uses a content-addressable file system to store all files on disk. The advantages of this file system are:
- The same package is never installed twice. With npm/Yarn, if 100 projects depend on lodash, then lodash is likely to be installed 100 times and written to 100 places on disk. With pnpm it is installed only once and written to a single place on disk; every later use is a hard link to that copy (if hard links are unfamiliar, see this article).
- Even across different versions of a package, pnpm reuses as much of the previous version as possible. For example, if lodash has 100 files and a new version adds one file, pnpm does not write 101 files to disk. It keeps hard links to the original 100 files and writes only the one new file.
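The two points above can be illustrated with a minimal Python sketch of a content-addressable store. This is not pnpm's actual implementation: the function names, the SHA-256 keying, and the flat store layout are assumptions made for illustration only. It installs a 100-file package into one project and a 101-file "new version" into another, then checks how many files the store really holds.

```python
import hashlib
import os
import tempfile
from typing import Dict, Tuple


def store_file(store_dir: str, content: bytes) -> str:
    """Write content into the store once, keyed by its SHA-256 digest."""
    digest = hashlib.sha256(content).hexdigest()
    path = os.path.join(store_dir, digest)
    if not os.path.exists(path):
        with open(path, "wb") as handle:
            handle.write(content)
    return path


def install(store_dir: str, package_dir: str, files: Dict[str, bytes]) -> None:
    """Hard-link every file of a package from the store into a project."""
    os.makedirs(package_dir, exist_ok=True)
    for name, content in files.items():
        os.link(store_file(store_dir, content), os.path.join(package_dir, name))


def demo() -> Tuple[int, bool]:
    """Return (number of files in the store, whether both projects share an inode)."""
    with tempfile.TemporaryDirectory() as root:
        store = os.path.join(root, "store")
        os.makedirs(store)
        # Hypothetical package contents: 100 files, then a version with one more.
        old_version = {f"file{i}.js": f"// module {i}".encode() for i in range(100)}
        new_version = {**old_version, "extra.js": b"// added in the new version"}
        install(store, os.path.join(root, "project-a", "lodash"), old_version)
        install(store, os.path.join(root, "project-b", "lodash"), new_version)
        in_a = os.stat(os.path.join(root, "project-a", "lodash", "file0.js"))
        in_b = os.stat(os.path.join(root, "project-b", "lodash", "file0.js"))
        return len(os.listdir(store)), in_a.st_ino == in_b.st_ino


if __name__ == "__main__":
    print(demo())
```

Both projects see a complete copy of the package, yet the store holds 101 files rather than 201, and the shared files point at the same inode on disk.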
3. Monorepo support
As front-end engineering grows more complex, more and more projects are adopting a monorepo. Previously we would use multiple Git repositories to manage multiple projects; the point of a monorepo is to manage multiple sub-projects in a single Git repository. All sub-projects live in the packages directory at the repository root, so each sub-project is a package. If you haven't come across the monorepo concept before, take a closer look at this article and at Lerna, the open source monorepo management tool. For an example of the project directory structure, see the Babel repository.
Another big difference between pnpm and npm/Yarn is its built-in monorepo support, which shows up in the behavior of each subcommand. For example, running pnpm add A -r in the root directory adds the dependency A to every package. The --filter option is also supported for selecting packages.
4. High security
With npm/Yarn, because node_modules is flat, if A depends on B and B depends on C, then A can use C directly. The problem is that A never declared C as a dependency, so this is illegal access. pnpm came up with its own dependency management approach, which solves this problem and keeps things safe. How it does so, and how it avoids the risk of illegal dependency access, is discussed in detail later.
3. Dependency management
How npm/Yarn install works
This is covered in two parts: first, how packages reach the project's node_modules after npm/yarn install is executed; second, how dependencies are organized inside node_modules.
After the command is executed, the dependency tree is built first, and then each package at each node goes through the following four steps:
- Resolve the dependency's version range to a specific version number
- Download the tarball for that version to the local offline mirror
- Decompress the dependency from the offline mirror into the local cache
- Copy the dependency from the cache to the node_modules directory of the current project
After that, the package has reached the project's node_modules.
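To make the first of the four steps more concrete, here is a small Python sketch of resolving a version range to a specific version. It is a simplified model, not the resolver used by npm or Yarn: it handles only exact versions and caret ranges, ignores pre-release tags, and the list of published versions is made up for the example.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn '4.17.1' into (4, 17, 1). Pre-release tags are not handled."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_upper_bound(base: Version) -> Version:
    """Exclusive upper bound of ^base: bump the left-most non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def resolve(range_spec: str, published: List[str]) -> str:
    """Pick the highest published version that satisfies an exact or caret range."""
    if range_spec.startswith("^"):
        low = parse(range_spec[1:])
        high = caret_upper_bound(low)
    else:
        low = parse(range_spec)
        high = (low[0], low[1], low[2] + 1)
    candidates = [v for v in published if low <= parse(v) < high]
    if not candidates:
        raise ValueError(f"no published version satisfies {range_spec}")
    return max(candidates, key=parse)


if __name__ == "__main__":
    # Hypothetical list of published versions.
    print(resolve("^4.16.0", ["4.15.0", "4.16.4", "4.17.1", "5.0.0"]))
```

The resolved version number is what the remaining three steps (download, decompress, copy) then operate on.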
So what does the directory structure of these dependencies look like inside node_modules? In other words, what is the project's dependency tree?
In npm 1 and npm 2, it is a nested structure, like this:
node_modules
└─ foo
   ├─ index.js
   ├─ package.json
   └─ node_modules
      └─ bar
         ├─ index.js
         └─ package.json
If bar has dependencies of its own, the nesting continues. Consider the problems with this design:
- If the dependency levels are too deep, file paths become too long, which is a problem especially on Windows.
- Many packages are installed repeatedly, and the total size becomes very large. For example, if baz sits in the same directory as foo and both depend on the same version of lodash, lodash is installed separately in the node_modules of each, i.e. installed twice.
- Module instances cannot be shared. For example, React has some internal variables. React imported from two different packages is not the same module instance, so the internal variables cannot be shared, which leads to unpredictable bugs.
Then, starting with npm 3, and in Yarn as well, these problems were addressed by flattening dependencies. You have surely had this experience: I only installed express, so why are there so many other things in node_modules?
Yes, that is the result of flat dependency management. Instead of the nested structure, the directory structure now looks like this:
node_modules
├─ foo
|  ├─ index.js
|  └─ package.json
└─ bar
   ├─ index.js
   └─ package.json
All dependencies are flattened into the top-level node_modules directory, and the deep nesting is gone. When a new package is installed, Node's require mechanism keeps searching the node_modules directories of parent folders, so if the same version of the package is already found there, it is not installed again. This solves the problem of large numbers of duplicate installs and keeps the dependency levels shallow.
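The upward search that flattening relies on can be sketched in a few lines of Python. This is a model of the lookup order for a bare require() as described in the Node.js module documentation, not Node's own code; the function name and the example paths are chosen for illustration.

```python
from pathlib import PurePosixPath
from typing import List


def node_modules_paths(start_dir: str) -> List[str]:
    """List the node_modules directories searched for a bare require(), nearest first."""
    paths = []
    current = PurePosixPath(start_dir)
    while True:
        # A directory that is itself named node_modules is skipped,
        # so the search never looks in node_modules/node_modules.
        if current.name != "node_modules":
            paths.append(str(current / "node_modules"))
        if current == current.parent:
            break
        current = current.parent
    return paths


if __name__ == "__main__":
    for path in node_modules_paths("/home/me/app/node_modules/foo/lib"):
        print(path)
```

Because every package nested under the project eventually reaches the project's top-level node_modules in this list, a dependency placed there is visible to all of them.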
The earlier problems are solved, but if you think about it, is this flattening approach really watertight? It is not. It still has plenty of problems. Let's go through them:
- The dependency structure is non-deterministic.
- The flattening algorithm itself is very complex and time-consuming.
- Packages that were never declared as dependencies can still be accessed illegally from the project.
The last two are easy to understand, so what does non-determinism mean in the first point? Here is a detailed explanation.
Suppose the project depends on two packages, foo and bar, and the dependencies of these two packages look like this:
After npm/yarn install flattens them, is the result the following structure?
Or this one?
The answer: either is possible. It depends on where foo and bar appear in package.json. If foo is declared first, you get the former structure; otherwise you get the latter.
This is the non-determinism of the dependency structure, and it is why lock files exist. Whether it is package-lock.json (npm 5.x) or yarn.lock, the purpose is to guarantee a deterministic node_modules structure after install.
Even so, npm/Yarn still suffer from the complex flattening algorithm and from illegal package access, which hurt performance and security.
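The order dependence can be reproduced with a deliberately naive hoisting pass in Python. This is a simplified model, not the real npm/Yarn algorithm, and the dependency graph is hypothetical: it assumes foo and bar each need a package called shared, at versions 1.0.1 and 1.0.2 respectively.

```python
from typing import Dict, List

# Hypothetical dependency graph: both packages need `shared`, at different versions.
DEPENDENCIES: Dict[str, Dict[str, str]] = {
    "foo": {"shared": "1.0.1"},
    "bar": {"shared": "1.0.2"},
}


def flatten(declared: List[str]) -> Dict[str, str]:
    """Return {install path: version} for a naive hoisting pass in declaration order."""
    tree: Dict[str, str] = {}
    for name in declared:
        tree[f"node_modules/{name}"] = "1.0.0"  # hypothetical version of the direct dependency
        for dep, version in DEPENDENCIES[name].items():
            top_level = f"node_modules/{dep}"
            if top_level not in tree:
                tree[top_level] = version  # first come, first hoisted
            elif tree[top_level] != version:
                # A conflicting version stays nested under its dependent.
                tree[f"node_modules/{name}/node_modules/{dep}"] = version
    return tree


if __name__ == "__main__":
    for order in (["foo", "bar"], ["bar", "foo"]):
        print(order, flatten(order))
```

With foo first, shared 1.0.1 is hoisted and 1.0.2 stays nested under bar; with bar first, it is the other way around. A lock file records one of these outcomes so that every install reproduces it.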
pnpm dependency management
pnpm's author Zoltan Kochan found that Yarn had no intention of solving these problems, so he started from scratch, wrote a new package manager, and created a new dependency management mechanism. Let's take a look.
Again using express as an example, create a new directory and execute:
pnpm init -y
Then execute:
pnpm install express
Let’s look at node_modules again:
.pnpm
.modules.yaml
express
We can see express directly, but note that it is only a soft link (symlink). If you open it, there is no node_modules directory inside. If this were the real file location, express could not find its own dependencies under Node's package loading mechanism. So where are the real files?
Let's keep looking inside .pnpm:
▾ node_modules
  ▾ .pnpm
    ▸ <package>@<version>
    ▸ ...
    ▾ express@<version>
      ▾ node_modules
        ▸ accepts
        ▸ array-flatten
        ▸ body-parser
        ▸ content-disposition
        ...
        ▸ etag
        ▾ express
          ▸ lib
            history.md
            index.js
            LICENSE
            package.json
            readme.md
Sure enough, it turns up under .pnpm/express@<version>/node_modules/express.
Open any other package and the same pattern holds: each one lives at .pnpm/<package-name>@<version>/node_modules/<package-name>. The dependencies of express sit next to it under .pnpm/express@<version>/node_modules, and they are all soft links:
▾ node_modules
  ▾ .pnpm
    ▸ <package>@<version>
    ▸ ...
    ▾ express@<version>
      ▾ node_modules
        ▸ accepts -> ../accepts@<version>/node_modules/accepts
        ▸ array-flatten -> ../array-flatten@<version>/node_modules/array-flatten
        ...
        ▾ express
          ▸ lib
            history.md
            index.js
            LICENSE
            package.json
            readme.md
Putting a package and its dependencies under the same node_modules keeps the layout fully compatible with native Node while organizing each package together with its dependencies. It is an ingenious design.
Now look back at the root directory: node_modules there is no longer a dizzying array of dependencies, but basically matches the dependencies declared in package.json. pnpm does hoist a few packages to the root node_modules through internal settings, but overall the root node_modules is much cleaner and more orderly than before.
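The layout described above can be rebuilt in a temporary directory with a short Python sketch. This is a toy model, not pnpm's code: every package is given the hypothetical version 1.0.0, the symlinks use absolute targets for simplicity, and the resolve function is a simplified stand-in for Node's lookup that starts from the real (symlink-resolved) location and stops at the project root.

```python
import os
import tempfile
from typing import Dict, List, Optional, Tuple


def build_layout(root: str, direct: List[str], graph: Dict[str, List[str]]) -> None:
    """Create a pnpm-style node_modules: real directories in .pnpm, symlinks elsewhere."""
    store = os.path.join(root, "node_modules", ".pnpm")
    for name in graph:
        os.makedirs(os.path.join(store, f"{name}@1.0.0", "node_modules", name))
    for name, deps in graph.items():
        siblings = os.path.join(store, f"{name}@1.0.0", "node_modules")
        for dep in deps:
            target = os.path.join(store, f"{dep}@1.0.0", "node_modules", dep)
            os.symlink(target, os.path.join(siblings, dep))
    for name in direct:
        target = os.path.join(store, f"{name}@1.0.0", "node_modules", name)
        os.symlink(target, os.path.join(root, "node_modules", name))


def resolve(project_root: str, from_dir: str, name: str) -> Optional[str]:
    """Walk up from the real location of from_dir, checking each node_modules."""
    top = os.path.realpath(project_root)
    current = os.path.realpath(from_dir)
    while True:
        if os.path.basename(current) != "node_modules":
            candidate = os.path.join(current, "node_modules", name)
            if os.path.isdir(candidate):
                return os.path.realpath(candidate)
        if current == top or os.path.dirname(current) == current:
            return None
        current = os.path.dirname(current)


def demo() -> Tuple[str, Optional[str], str]:
    """Resolve express from the project, then accepts from the project and from express."""
    with tempfile.TemporaryDirectory() as root:
        build_layout(root, ["express"], {"express": ["accepts"], "accepts": []})
        top = os.path.realpath(root)
        express_dir = resolve(root, root, "express")
        from_project = resolve(root, root, "accepts")
        from_express = resolve(root, express_dir, "accepts")
        return (
            os.path.relpath(express_dir, top),
            from_project,
            os.path.relpath(from_express, top),
        )


if __name__ == "__main__":
    print(demo())
```

In this model express finds accepts through the sibling symlink inside .pnpm, while the project itself cannot resolve accepts because it never declared it.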
4. More on security
As you may have noticed, pnpm's approach to dependency management also avoids the problem of illegal access to dependencies: if a package is not declared as a dependency in package.json, it cannot be accessed from the project.
You might ask: if A depends on B and B depends on C, then even though A never declares C as a dependency, hoisting places C in A's node_modules. So I use C in A, it runs without any problems, and after I go live it still runs normally. Isn't that safe enough?
Not really.
First, B's version can change at any time. Suppose B used to depend on an older version of C, and a new release of B now depends on C@2.0.1. After npm/yarn install in project A, version 2.0.1 of C is what gets installed. If A is still using the old C API, it may simply throw an error.
Second, if the updated B no longer needs C at all, C will not be installed into node_modules, and the code in A that references C will fail outright.
There is another case, in a monorepo project: A depends on X, B depends on X, and there is a C that does not declare X as a dependency but uses X in its code. Because of dependency hoisting, npm/Yarn places X in the root node_modules, so C runs fine locally: following Node's package loading mechanism, it finds X in the node_modules at the monorepo root. But once C is published on its own and a user installs C alone, X cannot be found, and an error is thrown as soon as the code that references X executes.
These are all latent bugs caused by dependency hoisting. In your own business code that may be tolerable, but in a toolkit used by many developers the damage can be serious.
npm has also tried to solve this problem: the --global-style flag disables hoisting, but that takes us right back to the era of nested dependencies, with all the drawbacks described earlier.
npm/Yarn do not seem able to solve the hoisting problem by themselves, but the community has a dedicated solution for it: dependency-check, github.com/dependency-…
Still, there is no denying that pnpm is more thorough. Its original dependency management approach not only solves the security problem of dependency hoisting, but also greatly improves performance in both time and space.
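The first two failure modes can be modelled with a few lines of Python. This sketch ignores versions entirely and only tracks which package names end up at the top level of a flat node_modules; the registry contents and function names are hypothetical.

```python
from typing import Dict, List, Set


def flat_install(direct: List[str], registry: Dict[str, List[str]]) -> Set[str]:
    """Names reachable at the top level of a flat node_modules (versions ignored)."""
    installed: Set[str] = set()
    pending = list(direct)
    while pending:
        name = pending.pop()
        if name not in installed:
            installed.add(name)
            pending.extend(registry[name])
    return installed


def broken_imports(imports: List[str], direct: List[str],
                   registry: Dict[str, List[str]]) -> List[str]:
    """Imports in A's code that would fail to resolve after a flat install."""
    return sorted(set(imports) - flat_install(direct, registry))


if __name__ == "__main__":
    # A declares only B, but its code imports both B and C.
    before_update = {"B": ["C"], "C": []}
    after_update = {"B": [], "C": []}  # the new release of B drops C
    print(broken_imports(["B", "C"], ["B"], before_update))
    print(broken_imports(["B", "C"], ["B"], after_update))
```

Nothing in A changed between the two runs; only B's own dependency list did, which is exactly why an undeclared dependency is fragile.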
5. Daily use
After all this, you may think pnpm is quite complicated. Is it costly to adopt?
On the contrary, pnpm is simple to use, and if you have experience with npm/Yarn you can migrate to pnpm almost seamlessly. Here are some everyday examples.
pnpm install
Like npm install, it installs all dependencies of the project. For a monorepo project, however, it installs all dependencies of every package in the workspace. You can use the --filter option to select packages, so that dependencies are installed only for the packages that match.
You can also install a single package this way:
// Install axios
pnpm install axios
// Install axios and add it to devDependencies
pnpm install axios -D
// Install axios and add it to dependencies
pnpm install axios -S
Of course, you can also specify a package with --filter.
pnpm update
Updates packages to the latest version within the specified range. In a monorepo project, you can select packages with --filter.
pnpm uninstall
Removes the specified dependencies from node_modules and package.json. In a monorepo project, the same applies as above. For example:
// Remove axios
pnpm uninstall axios --filter package-a
pnpm link
Links a local project into another project. Note that a hard link is used, not a soft link. For example:
pnpm link ../../axios
In addition, the commands we use all the time, npm run/start/test/publish, work the same way when npm is simply replaced with pnpm, so they are not covered here. For more, see the official documentation: pnpm.js.org/en/
As you can see, although pnpm has a lot of sophisticated design internally, it is transparent to users and very user-friendly. In addition, the author is still actively maintaining it, and it was downloaded more than 100,000 times from npm last week, so it has been tested by a large user base and its stability is assured.
Therefore, pnpm is a better solution than npm/Yarn, and we look forward to seeing it adopted more widely.
References:
[1] pnpm official documentation: pnpm.js.org/en/
[2] Benchmark repository: github.com/dependency-…
[3] Zoltan Kochan, "Why should we use pnpm?": www.kochan.io/nodejs/why-…
[4] Zoltan Kochan, "pnpm's strictness helps to avoid silly bugs": www.kochan.io/nodejs/pnpm…
[5] Conarli, "Analysis of how npm install works": cloud.tencent.com/developer/a…
[6] Yarn official documentation: classic.yarnpkg.com/en/docs
[7] "Yarn's Plug'n'Play feature": loveky.github.io/2019/02/11/…
[8] "Guide to Monorepos for Front-end Code": www.toptal.com/front-end/g…