Preface
The main purpose of this article is to help you understand pnpm, a modern package manager. It also summarizes the design flaws of npm and Yarn, explains how pnpm addresses them, and describes what improvements pnpm brings.
npm/Yarn dependency management
This section describes the shortcomings of npm/Yarn dependency management as background for understanding pnpm.
The early days
When dependencies are installed with the early npm v1/v2, the node_modules folder is built recursively: each dependency is installed into its parent's own node_modules, strictly following the structure of the project's package.json and of each dependency's package.json, until a dependency no longer depends on any other module.
For example, if foo depends on bar, bar is installed as a secondary dependency inside foo's node_modules, as follows:
node_modules
└─ foo
   ├─ index.js
   ├─ package.json
   └─ node_modules
      └─ bar
         ├─ index.js
         └─ package.json
If two dependencies in a project share the same secondary dependency, that secondary dependency is installed once for each of them.
node_modules
├─ foo1
│  ├─ index.js
│  ├─ package.json
│  └─ node_modules
│     └─ bar
│        ├─ index.js
│        └─ package.json
└─ foo2
   ├─ index.js
   ├─ package.json
   └─ node_modules
      └─ bar
         ├─ index.js
         └─ package.json
This is a simple example; the problem can be much worse in a real development scenario:
- If the dependency tree is too deep, file paths become too long (which can cause problems on Windows).
- Duplicate packages are installed, making the node_modules folder huge and wasting disk space.
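To make the duplication concrete, here is a minimal sketch of the nested layout. It assumes a hypothetical in-memory dependency graph (`graph`) without cycles and a hypothetical helper (`nestedInstallPaths`); it only models the directory structure and is not npm's actual implementation.

```javascript
// Hypothetical dependency graph: package name -> names of its dependencies.
const graph = {
  foo1: ["bar"],
  foo2: ["bar"],
  bar: [],
};

// List every directory a nested (npm v1/v2 style) install would create:
// each package gets its own node_modules holding its own dependencies.
function nestedInstallPaths(graph, deps, base = "node_modules") {
  const paths = [];
  for (const name of deps) {
    const dir = `${base}/${name}`;
    paths.push(dir);
    paths.push(...nestedInstallPaths(graph, graph[name], `${dir}/node_modules`));
  }
  return paths;
}

const paths = nestedInstallPaths(graph, ["foo1", "foo2"]);
console.log(paths.join("\n"));

// bar is installed once per package that depends on it.
const barCopies = paths.filter((p) => p.endsWith("/bar")).length;
console.log(`copies of bar: ${barCopies}`);
```

Every additional level of dependencies adds another `node_modules/<name>` segment to the path, which is where both the path length and the duplication come from.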
The turning point
Starting with npm v3 and Yarn, dependency management changed considerably compared with npm v1/v2: project dependencies are no longer managed in a nested structure but in a flat one.
Continuing the example above, foo1 and foo2 both depend on bar. After installation, the directory is flattened as follows:
node_modules
├─ bar
│  ├─ index.js
│  └─ package.json
├─ foo1
│  ├─ index.js
│  └─ package.json
└─ foo2
   ├─ index.js
   └─ package.json
The flat directory does address some of the problems described in the previous section, but it also introduces new ones:
- Phantom dependencies
Also called “ghost dependencies”, these are packages that the project's code can use even though they are not declared in package.json. The problem appeared with npm v3: because the early tree structure caused redundant dependencies and overly deep paths, npm v3 adopted a flat structure, which hoists the secondary dependencies of third-party packages to the same level as the packages themselves.
Phantom dependencies can cause unexpected errors, so be careful:
- Incompatible versions (for example, a major update that changes an API).
- Lost dependencies (a package we depend on may stop depending on the phantom dependency our project uses, so it is no longer installed).
// bar is a phantom dependency: it is only a secondary dependency of foo1 and foo2.
import bar from 'bar';
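A flat layout makes this import work because of how Node.js looks up bare package names: it checks node_modules next to the importing file and then in each parent directory. The sketch below is a simplified model of that lookup (the real algorithm has more rules). The project paths, the `installed` set, and the helpers `lookupDirs` and `resolve` are hypothetical.

```javascript
const path = require("node:path").posix;

// node_modules directories searched for a bare import, starting at the
// importing file's directory and walking up to the file system root.
function lookupDirs(fromDir) {
  const dirs = [];
  let current = fromDir;
  for (;;) {
    dirs.push(path.join(current, "node_modules"));
    const parent = path.dirname(current);
    if (parent === current) break;
    current = parent;
  }
  return dirs;
}

// Hypothetical project: only foo1 and foo2 are declared, but the flat
// layout also hoists their shared dependency bar to the top level.
const declared = new Set(["foo1", "foo2"]);
const installed = new Set([
  "/project/node_modules/foo1",
  "/project/node_modules/foo2",
  "/project/node_modules/bar",
]);

// Return the first installed directory that matches, or null.
function resolve(name, fromDir) {
  for (const dir of lookupDirs(fromDir)) {
    const candidate = path.join(dir, name);
    if (installed.has(candidate)) return candidate;
  }
  return null;
}

const resolved = resolve("bar", "/project/src");
const isPhantom = resolved !== null && !declared.has("bar");
console.log(resolved, isPhantom);
```

The lookup only checks whether a directory exists, not whether the package was declared, which is why an undeclared but hoisted package is still importable.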
- npm doppelgangers
Also called “doppelganger dependencies” and common in monorepo projects: the same version of a package ends up installed more than once, because several third-party packages that the project depends on each bring their own copy of a same-named package.
Common problems:
- Bundling includes every duplicate copy, increasing the size of the build output.
- Library instances cannot be shared, so code ends up referencing two separate instances.
- Duplicated TypeScript types can cause type conflicts.
In practice, if packages in the project depend on two different versions of the same package ([email protected] and [email protected]), a dependency conflict occurs. It is resolved by hoisting one version to the top level and placing the conflicting version in the node_modules directory of each package that requires it, giving a structure similar to the following:
node_modules
├─ [email protected]
│  ├─ index.js
│  └─ package.json
├─ foo1
│  ├─ index.js
│  ├─ package.json
│  └─ node_modules
│     └─ [email protected]
│        ├─ index.js
│        └─ package.json
└─ foo2
   ├─ index.js
   ├─ package.json
   └─ node_modules
      └─ [email protected]
         ├─ index.js
         └─ package.json
Here both foo1's and foo2's node_modules contain the same version, [email protected]. This is the “doppelganger dependency” problem described above.
You may also wonder why that [email protected] is not the one that gets hoisted, which would save one copy's worth of space and also avoid the doppelganger problem. The reason is that which version is hoisted depends on the order in which dependencies are processed. Developers usually do not control that order, so the result is hard to predict.
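The effect of ordering can be illustrated with a small sketch. It assumes a deliberately simplified rule (the first version encountered is hoisted, every other version is nested), which is not npm's or Yarn's exact algorithm. The package `foo3`, the versions 1.0.0 and 1.0.1, and the helper `flatten` are placeholders chosen for this example.

```javascript
// Simplified flattening rule (an assumption for illustration): the first
// version of a package that is encountered is hoisted to the top level;
// any other version is nested under the package that requires it.
function flatten(requirements) {
  const topLevel = new Map(); // package name -> hoisted version
  const nested = []; // "dependent/node_modules/name@version"
  for (const { dependent, name, version } of requirements) {
    if (!topLevel.has(name)) {
      topLevel.set(name, version);
    } else if (topLevel.get(name) !== version) {
      nested.push(`${dependent}/node_modules/${name}@${version}`);
    }
  }
  return { topLevel: Object.fromEntries(topLevel), nested };
}

// Placeholder requirements: two packages want bar@1.0.1, one wants bar@1.0.0.
const reqs = [
  { dependent: "foo1", name: "bar", version: "1.0.1" },
  { dependent: "foo2", name: "bar", version: "1.0.1" },
  { dependent: "foo3", name: "bar", version: "1.0.0" },
];

const inOrder = flatten(reqs);
const reversed = flatten([...reqs].reverse());
console.log(inOrder); // bar@1.0.1 hoisted, one nested copy
console.log(reversed); // bar@1.0.0 hoisted, bar@1.0.1 nested twice
```

With the same set of requirements, only the processing order decides whether the result contains one extra copy or two identical nested copies.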
Conclusion
- The flattened node_modules structure allows code to access dependencies that are not declared in package.json.
- Installation is inefficient: many dependencies are installed repeatedly and take up a lot of disk space.
- Packages already installed by one front-end project cannot be shared with other projects and are reinstalled every time.
Show time
What is it?
pnpm is an npm-compatible JavaScript package manager with significant improvements in installation speed and disk space usage. It is very similar to npm/Yarn: it uses the same package.json file to manage dependencies and, like npm/Yarn, uses a lock file to keep dependency versions consistent across machines.
Performance benchmark
The official benchmarks compare npm, pnpm, Yarn, and Yarn PnP across a wide range of usage scenarios:
- clean install: fresh installation with no lock file, no cache, and no installed dependencies.
- with cache, with lockfile, with node_modules: the install command is run again after the first installation.
- with cache, with lockfile: a developer fetches the repo and runs the installation for the first time.
- with cache: same as above, but the package manager has no lock file available.
- with lockfile: the installation runs on a CI server.
- with cache, with node_modules: the lock file is deleted and the install command is run again.
- with node_modules, with lockfile: the package cache is deleted and the install command is run again.
- with node_modules: the package cache and the lock file are deleted and the install command is run again.
- update: dependencies are updated by changing the version in package.json and running the install command again.
According to the officially published benchmark report, which covers these scenarios, pnpm installs considerably faster than npm/Yarn.
The table below shows the size of node_modules in two of my projects after installing dependencies with Yarn and with pnpm, and the difference is remarkable. Besides installing faster, pnpm saves a great deal of disk space. According to the official figures, pnpm is up to twice as fast as npm/Yarn.
Project | yarn | pnpm |
---|---|---|
EV admin console | 2.6 GB | 1.6 GB |
Operations admin console | 1.4 GB | 524 MB |
Dependency installation
When you install with pnpm, it stores dependencies in the ~/.pnpm-store directory. On the same machine, the next time you install dependencies pnpm first checks the store; if a required dependency is already there, pnpm hard-links it into your project instead of installing it again.
You can run the following command to get the store path: pnpm store path
Here is the output of each package manager when a package is installed again:
pnpm also does a better job of keeping its output easy to understand: it shows how many packages were reused and how many had to be downloaded. Yarn, by contrast, lists all related packages, which most of the time we do not care about.
$ pnpm i express
Packages: +52
++++++++++++++++++++++++++++++++++++++++++++++++++++
Progress: resolved 52, reused 52, downloaded 0, added 0, done
dependencies:
+ express 4.17.1

$ yarn add express
yarn add v1.22.11
[1/4] 🔍 Resolving packages...
[2/4] 🚚 Fetching packages...
[3/4] 🔗 Linking dependencies...
[4/4] 🔨 Building fresh packages...
success Saved lockfile.
success Saved 29 new dependencies.
info Direct dependencies
└─ [email protected]
info All dependencies
├─ [email protected]
├─ ...
└─ [email protected]
✨ Done in 1.7s.

$ npm i express
npm WARN [email protected] No description
npm WARN [email protected] No repository field.
+ [email protected]
added 50 packages from 37 contributors and audited 50 packages in 4.309s
found 0 vulnerabilities
How dependency management works
As mentioned above, pnpm stores installed dependencies in the store directory. This directory plays a critical role: it solves the problem that packages already installed by one front-end project cannot be shared with other projects and must be reinstalled each time. Packages that are already in the store can be reused directly across projects, whereas npm/Yarn install them again every time.
The flow chart below gives an overview. Some parts may not be clear yet; the following sections explain them in detail.
Prerequisite knowledge
Before looking at how pnpm reuses and shares dependencies, it is important to understand hard links and soft links.
hard link & soft link
There are two types of links in Linux:
- Hard link
- Soft link, also known as a symbolic link
inode
Each file has an inode, unique within its file system, that holds the file's metadata. When the file is accessed, that metadata is loaded into memory and used for the access.
You can use the stat command to view a file's metadata.
stat README.md
hard link
A hard link can be thought of as an additional name for the same file: it points directly to the inode of the source file, and the system does not allocate a new inode for it.
No matter how many hard links exist, they all point to the same inode, so a change made through the source file or through any link is visible through all of them.
Each new hard link increases the inode's link count. As long as the link count is non-zero the file continues to exist, whether you delete the source file or a hard link; the data remains as long as at least one name remains (similar to reference counting).
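These properties can be checked with a short Node.js script. This is a minimal sketch that works in a throwaway temporary directory; the file names are arbitrary, and it assumes a file system that supports hard links.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Work in a throwaway temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "hardlink-demo-"));
const source = path.join(dir, "source.txt");
const link = path.join(dir, "link.txt");

fs.writeFileSync(source, "v1");
fs.linkSync(source, link); // second name for the same inode

const sameInode = fs.statSync(source).ino === fs.statSync(link).ino;
const linkCount = fs.statSync(source).nlink;

// Writing through one name is visible through the other.
fs.writeFileSync(link, "v2");
const seenViaSource = fs.readFileSync(source, "utf8");

// Removing the original name leaves the data reachable through the link.
fs.unlinkSync(source);
const survives = fs.readFileSync(link, "utf8");
const remainingLinks = fs.statSync(link).nlink;

console.log({ sameInode, linkCount, seenViaSource, survives, remainingLinks });
fs.rmSync(dir, { recursive: true, force: true });
```

The `nlink` field reported by `fs.statSync` is the link count described above.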
soft link
A soft link can be thought of as a one-way pointer: it is an independent file with its own inode that always points to the path of the source file, similar to a shortcut in Windows.
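For comparison, here is the equivalent minimal sketch for a soft link, again in a temporary directory with arbitrary file names. On Windows, creating symbolic links may require extra privileges, so this assumes a Linux or macOS environment.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "symlink-demo-"));
const source = path.join(dir, "source.txt");
const link = path.join(dir, "link.txt");

fs.writeFileSync(source, "hello");
fs.symlinkSync(source, link); // the link stores the path to source.txt

// lstat inspects the link itself; stat follows it to the target.
const isSymlink = fs.lstatSync(link).isSymbolicLink();
const ownInode = fs.lstatSync(link).ino !== fs.statSync(source).ino;
const target = fs.readlinkSync(link);
const content = fs.readFileSync(link, "utf8");

// Once the source is removed, the link points at nothing.
fs.unlinkSync(source);
const dangling = !fs.existsSync(link);

console.log({ isSymlink, ownInode, target, content, dangling });
fs.rmSync(dir, { recursive: true, force: true });
```

Unlike a hard link, the soft link does not keep the data alive: deleting the source leaves a dangling link.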
Hard link mechanism
As discussed above, pnpm stores project dependencies in the global store directory. Imagine that I have a new project and am about to install its dependencies. If a dependency to be downloaded is already in the store, pnpm skips the download and hard-links it into the project's node_modules instead of installing it again. This is why pnpm performs so well: it saves both time and disk space.
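The idea can be sketched in a few lines. This is an illustration only: the helper `installFile`, the SHA-256 key, and the directory names are assumptions made for the example, and pnpm's real store layout and hashing differ.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: place `content` at `target`, going through a store
// keyed by the content hash so identical files are written only once.
// Returns true when the file was already in the store.
function installFile(storeDir, target, content) {
  const hash = crypto.createHash("sha256").update(content).digest("hex");
  const stored = path.join(storeDir, hash);
  const reused = fs.existsSync(stored);
  if (!reused) fs.writeFileSync(stored, content); // stands in for a download
  fs.mkdirSync(path.dirname(target), { recursive: true });
  fs.linkSync(stored, target); // hard link instead of a copy
  return reused;
}

const root = fs.mkdtempSync(path.join(os.tmpdir(), "store-demo-"));
const store = path.join(root, "store");
fs.mkdirSync(store);

const file = "module.exports = 'bar';\n";
const targetA = path.join(root, "project-a", "node_modules", "bar", "index.js");
const targetB = path.join(root, "project-b", "node_modules", "bar", "index.js");
const first = installFile(store, targetA, file);
const second = installFile(store, targetB, file);

// One inode, three names: the store entry plus one link per project.
const links = fs.statSync(targetA).nlink;
console.log({ first, second, links });
fs.rmSync(root, { recursive: true, force: true });
```

The second project gets the file without any new write to the store, which is the time and space saving described above.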
node_modules based on soft links
The node_modules produced by pnpm is quite different from that of npm/Yarn: it is not a flat directory like the one described earlier, but a non-flat one. If that sounds strange, read on.
node_modules
├─ .pnpm
│  ├─ [email protected]
├─
It looks roughly like this. You may wonder what benefit such a design brings. The key point is that pnpm's node_modules directory uses soft links to build the nested dependency structure, and that every file inside .pnpm is a hard link to the content-addressable store, like the following:
.pnpm contains all the dependencies the project needs. They are real files; the only difference is that they are hard links from the store directory (if hard links are unfamiliar, see the hard link section above).
node_modules
└─ .pnpm
   └─ [email protected]
      └─ node_modules
         └─ dayjs -> <store>/dayjs
            ├─ index.js
            └─ package.json
If you open the .pnpm directory you will see that the dependencies in it are laid out flat, but each directory name carries its own version number. As I understand it, this design is meant to solve the “doppelganger dependency” problem.
Suppose the project depends on [email protected] and [email protected], and bar also depends on [email protected]. The reference relationships look like this:
node_modules
├─ foo -> ./.pnpm/[email protected]/node_modules/foo
└─ .pnpm
   ├─ [email protected]
   │  └─ node_modules
   │     ├─ foo -> ../../[email protected]/…
This is easy to understand: the entries outside .pnpm are the dependencies we actually reference in daily development, but each of them is only a soft link into .pnpm, and that soft link ultimately points to the real dependency inside .pnpm.
We can list the targets of these soft links in a terminal. Go to node_modules and run:
ls -al
Why use soft links to reference the actual dependencies? The design addresses the “phantom dependency” problem: only declared dependencies appear as soft links at the top level of node_modules. Project code references these soft links, which point to the real dependencies in .pnpm, so packages that are not declared in package.json cannot be referenced in daily development.
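The following sketch builds a tiny layout of this shape by hand and lets Node.js resolve packages in it. The package names `demo-foo` and `demo-bar` and the version 1.0.0 are hypothetical, absolute link targets are used for brevity, and the script assumes a platform where unprivileged symbolic links are available. It imitates only the directory shape, not pnpm itself.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// realpath avoids surprises when the temp directory is itself a symlink.
const project = fs.realpathSync(
  fs.mkdtempSync(path.join(os.tmpdir(), "layout-demo-"))
);
const virtualStore = path.join(project, "node_modules", ".pnpm");

function writePackage(dir, source) {
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, "index.js"), source);
}

// Hypothetical packages: the project declares demo-foo, which needs demo-bar.
const fooDir = path.join(virtualStore, "demo-foo@1.0.0", "node_modules", "demo-foo");
const barDir = path.join(virtualStore, "demo-bar@1.0.0", "node_modules", "demo-bar");
writePackage(fooDir, "module.exports = 'foo uses ' + require('demo-bar');\n");
writePackage(barDir, "module.exports = 'bar';\n");

// demo-bar is linked next to demo-foo, so demo-foo can require it.
fs.symlinkSync(barDir, path.join(path.dirname(fooDir), "demo-bar"), "dir");
// Only the declared dependency is linked into the top-level node_modules.
fs.symlinkSync(fooDir, path.join(project, "node_modules", "demo-foo"), "dir");

// The declared dependency resolves from the project root and works.
const fooValue = require(require.resolve("demo-foo", { paths: [project] }));

// The undeclared secondary dependency is not reachable from the project.
let barVisible = true;
try {
  require.resolve("demo-bar", { paths: [project] });
} catch (err) {
  barVisible = false;
}

console.log({ fooValue, barVisible });
fs.rmSync(project, { recursive: true, force: true });
```

Because `demo-bar` never appears at the top level of node_modules, project code cannot import it by accident, while `demo-foo` still finds it next to its own real location.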
Workspace
Many modern front-end projects use Lerna to manage monorepos, and pnpm supports monorepos well too. Unlike Lerna, which relies on long and hard-to-remember commands, pnpm uses a dedicated package selector syntax to restrict which packages a command applies to.
A monorepo project must have a configuration file (pnpm-workspace.yaml) in its root directory to define the workspace. The workspace file is similar to those of other package managers. Some commonly used commands for managing a monorepo are introduced below.
packages:
  # all packages
  - 'packages/**'
  - 'components/**'
  - '!**/test/**'
- Select a package precisely with <@scope/package>, select a group of packages with <@scope/*>, or select by relative path.
pnpm dev --filter @byted-ehi/basic-list
pnpm dev --filter apps/*
pnpm dev --filter ./apps/admin-order-manage
- Select a package together with its dependencies. For example, the following runs dev in basic-list and in all of its dependencies.
pnpm dev --filter @byted-ehi/basic-list...
- Select only the dependencies of a package, excluding the package itself. For example, the following runs dev in all dependencies of basic-list but not in basic-list itself.
pnpm dev --filter @byted-ehi/basic-list^...
- Select all packages in the specified directory.
pnpm dev --filter ./apps
That concludes this introduction to how pnpm manages a monorepo. The official website documents commands for other scenarios, which you can look up yourself. As a monorepo manager pnpm is in no way inferior to Lerna and is somewhat more comfortable to use: instead of memorizing a pile of commands, remembering --filter covers most development scenarios.
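The `...` and `^...` suffixes can be read as operations on the workspace dependency graph. The following is a minimal sketch under that reading, using a hypothetical workspace (`app`, `ui`, `utils`, `docs`) and hypothetical helpers (`dependenciesOf`, `select`). It is not pnpm's implementation and ignores scopes, globs, and path selectors.

```javascript
// Hypothetical workspace graph: package -> workspace packages it depends on.
const workspace = {
  app: ["ui", "utils"],
  ui: ["utils"],
  utils: [],
  docs: [],
};

// Collect all direct and transitive dependencies of `name`.
function dependenciesOf(graph, name, seen = new Set()) {
  for (const dep of graph[name] ?? []) {
    if (!seen.has(dep)) {
      seen.add(dep);
      dependenciesOf(graph, dep, seen);
    }
  }
  return seen;
}

// Models three selector shapes:
//   "pkg"      -> the package only
//   "pkg..."   -> the package and its dependencies
//   "pkg^..."  -> only its dependencies
function select(graph, selector) {
  if (selector.endsWith("^...")) {
    return [...dependenciesOf(graph, selector.slice(0, -4))].sort();
  }
  if (selector.endsWith("...")) {
    const name = selector.slice(0, -3);
    return [name, ...dependenciesOf(graph, name)].sort();
  }
  return [selector];
}

console.log(select(workspace, "app..."));
console.log(select(workspace, "app^..."));
console.log(select(workspace, "docs"));
```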
pnpm lock file
pnpm writes its lock file in pnpm-lock.yaml format, which is not very different from the npm/Yarn lock files. pnpm also provides a command that generates a pnpm lock file from the project's existing lock file, so that migrating to pnpm does not change dependency versions that were previously pinned.
Supported lock files:
- package-lock.json
- npm-shrinkwrap.json
- yarn.lock
pnpm import
Basic commands
After a whole article of explanation pnpm may seem complicated, but in practice it is the opposite: the learning cost is very low and you can get used to pnpm quickly.
$ pnpm install express
$ pnpm update express
$ pnpm remove express
$ pnpm list
$ pnpm run <scripts>
$ pnpm publish
Additional notes
The packages are stored in the store, so why does my node_modules still take up disk space?
pnpm creates hard links from the store to the project's node_modules folder, and a hard link shares the same inode as the original file. The two therefore occupy the same space on disk: node_modules appears to take up space, but the data exists only once, not twice.
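The difference between apparent and actual usage can be shown by counting each inode only once. This is a minimal sketch with arbitrary file names in a temporary directory; it compares logical file sizes, not allocated disk blocks.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

const root = fs.mkdtempSync(path.join(os.tmpdir(), "usage-demo-"));
const stored = path.join(root, "store-file.js");
const linked = path.join(root, "project-file.js");
fs.writeFileSync(stored, "x".repeat(1000));
fs.linkSync(stored, linked); // same inode, second name

// Apparent size: add up every directory entry.
// Actual size: count each inode only once.
let apparent = 0;
let actual = 0;
const seen = new Set();
for (const file of [stored, linked]) {
  const st = fs.statSync(file);
  apparent += st.size;
  const key = `${st.dev}:${st.ino}`;
  if (!seen.has(key)) {
    seen.add(key);
    actual += st.size;
  }
}

console.log({ apparent, actual });
fs.rmSync(root, { recursive: true, force: true });
```

A tool that sums sizes per directory entry reports the apparent figure, which is why node_modules can look large even though the data is shared with the store.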
Thoughts
pnpm is a high-performance package manager that solves some latent problems of npm/Yarn. After reading this article, try pnpm in practice to get a feel for its real features and capabilities.
Here you can review the PNPM dependency management schematic mentioned above to understand it:
Finally, I hope this article is useful to you