About yarn
yarn and npm are both JavaScript package managers, and there are others such as cnpm and pnpm. If one of them were enough, why have so many wheels been reinvented?
Why yarn? What makes it different from the other tools?
Tip: in this article, npm refers to npm 2.
Differences from npm
- yarn downloads and installs dependency packages in parallel, while npm works through them one at a time, which opens up the first speed gap.
- yarn reads packages that have already been downloaded from its local cache and only requests the remote registry when the cache has no copy. npm requests everything in full each time, which widens the speed gap again.
- yarn lays all dependencies out on the same level, which cuts down on repeated downloads of the same dependency, speeds up installation, and shrinks node_modules. npm downloads strictly according to the dependency tree and puts each package in its corresponding place, so the same package is downloaded many times and node_modules becomes very large.
Differences from cnpm
- cnpm uses a mirror hosted in China, which is faster for users there (other tools can also change their registry address).
- cnpm gathers all of a project's downloaded packages into its own cache folder and places them in the project's node_modules through symbolic links.
Differences from pnpm
- Like yarn, pnpm keeps a single directory for managing dependencies.
- pnpm retains the original dependency tree structure of npm 2, but every dependency package under node_modules is stored as a symbolic link.
Getting to know yarn by building a simple one
Step 1 – Download
JavaScript package managers use package.json as the entry point for describing dependencies.
{
  "dependencies": {
    "lodash": "4.17.20"
  }
}
With the package.json above, the exact version is already pinned, so we can read it and download the corresponding package directly.
import fetch from 'node-fetch';

// Download the tarball of every dependency pinned in package.json
function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([key, version]) => {
      const url = `https://registry.yarnpkg.com/${key}/-/${key}-${version}.tgz`;
      const response = await fetch(url);
      if (!response.ok) {
        throw new Error(`Couldn't fetch package "${key}"`);
      }
      // response.buffer() is the node-fetch v2 API
      return await response.buffer();
    })
  );
}
Now let’s look at another situation:
{
  "dependencies": {
    "lodash": "4.17.20",
    "customer-package": "../../customer-package"
  }
}
"customer-package": "../../customer-package" is a file path rather than a version, so our code no longer works for it. We need to change the code:
import fetch from 'node-fetch';
import fs from 'fs-extra';

function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([key, version]) => {
      // A file path is read directly from disk
      if (['/', './', '../'].some(prefix => version.startsWith(prefix))) {
        return await fs.readFile(version);
      }
      // Anything else is requested from the remote registry
      // ... old code
    })
  );
}
Step 2 – Flexible matching rules
So far our code can download dependencies given as a fixed version or as a file path. A range such as "react": "^15.6.0" is not supported yet. This expression matches every version from 15.6.0 up to, but not including, 16.0.0. We should install the latest version within that range, so we add a new method:
import fetch from 'node-fetch';
import semver from 'semver';

async function getPinnedReference(name, version) {
  let reference = version;
  // Only resolve when the value is a valid range but not an exact version
  if (semver.validRange(version) && !semver.valid(version)) {
    // Fetch all published versions of the package
    const response = await fetch(`https://registry.yarnpkg.com/${name}`);
    const info = await response.json();
    const versions = Object.keys(info.versions);
    // Pick the highest version that satisfies the range
    const maxSatisfying = semver.maxSatisfying(versions, reference);
    if (maxSatisfying === null)
      throw new Error(
        `Couldn't find a version matching "${version}" for package "${name}"`
      );
    reference = maxSatisfying;
  }
  return { name, reference };
}
function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([name, version]) => {
      // A file path is read directly from disk
      // ... old code
      let realVersion = version;
      // For ranges starting with ~ or ^, resolve the latest matching version
      if (version.startsWith('~') || version.startsWith('^')) {
        const { reference } = await getPinnedReference(name, version);
        realVersion = reference;
      }
      // Anything else is requested from the remote registry
      // ... old code
    })
  );
}
This lets users specify a version range and get the latest package version within it.
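To make the range matching concrete, here is a minimal sketch of how a caret range picks its highest matching version without the semver library. The helper names and the list of published versions are made up for illustration, and prerelease tags are deliberately not handled.

```javascript
// Parse "major.minor.patch" into numbers (prerelease tags are not handled).
function parseVersion(version) {
  return version.split('.').map(Number);
}

// Compare two parsed versions: negative, zero or positive, as Array#sort expects.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Exclusive upper bound of a caret range: bump the left-most non-zero part.
function caretUpperBound([major, minor, patch]) {
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

// Return the highest version in `versions` that satisfies `^x.y.z`, or null.
function maxSatisfyingCaret(versions, range) {
  const lower = parseVersion(range.slice(1));
  const upper = caretUpperBound(lower);
  const matches = versions
    .map(parseVersion)
    .filter(v => compareVersions(v, lower) >= 0 && compareVersions(v, upper) < 0)
    .sort(compareVersions);
  return matches.length > 0 ? matches[matches.length - 1].join('.') : null;
}

// Hypothetical list of published versions.
const published = ['15.5.4', '15.6.0', '15.6.2', '15.7.0', '16.0.0'];
console.log(maxSatisfyingCaret(published, '^15.6.0')); // 15.7.0
```

In real code the semver package covers many more cases (tilde ranges, prereleases, compound ranges), which is why the tutorial code relies on it.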
Step 3 – Dependencies have dependencies too
Things are not as simple as they seem. Our dependencies have dependencies of their own, so we need to recurse through every layer to download all of them.
// Get the dependencies of a dependency package
async function getPackageDependencies(packageJson) {
  const packageBuffer = await fetchPackage(packageJson);
  // Read the package.json inside the downloaded archive
  const manifest = await readPackageJsonFromArchive(packageBuffer);
  const dependencies = manifest.dependencies || {};
  return Object.keys(dependencies).map(name => {
    return { name, version: dependencies[name] };
  });
}
Now we can collect every package in the dependency tree, starting from the package.json of the user's project.
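The recursion can be sketched without any network access. The example below assumes a hypothetical in-memory registry object in place of real downloads, and adds a guard against dependency cycles, which the text above does not discuss.

```javascript
// Hypothetical in-memory registry: "name@version" -> dependencies of that package.
const registry = {
  'app@1.0.0': { a: '1.0.0', b: '1.0.0' },
  'a@1.0.0': { c: '1.0.0' },
  'b@1.0.0': { c: '1.0.0' },
  'c@1.0.0': {},
};

// Build the dependency tree below a package, recursing one layer at a time.
// `ancestors` holds the packages on the current path and guards against cycles.
function getPackageDependencyTree(name, reference, ancestors = new Set()) {
  const key = `${name}@${reference}`;
  const manifest = registry[key];
  if (manifest === undefined) {
    throw new Error(`Unknown package "${key}"`);
  }
  const seen = new Set(ancestors).add(key);
  const dependencies = Object.entries(manifest)
    // Skip a dependency that is already one of our ancestors.
    .filter(([depName, depVersion]) => !seen.has(`${depName}@${depVersion}`))
    .map(([depName, depVersion]) => getPackageDependencyTree(depName, depVersion, seen));
  return { name, reference, dependencies };
}

const tree = getPackageDependencyTree('app', '1.0.0');
console.log(JSON.stringify(tree, null, 2));
```

Note that c appears twice in the resulting tree, once under a and once under b, which is exactly the duplication that Step 5 sets out to remove.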
Step 4 – Transfer files
Downloading the dependencies is not enough. We also need to move all the files to the target directory, the familiar node_modules.
async function linkPackages({ name, reference, dependencies }, cwd) {
  // Get the entire dependency tree
  const dependencyTree = await getPackageDependencyTree({
    name,
    reference,
    dependencies,
  });
  await Promise.all(
    dependencyTree.map(async dependency => {
      await linkPackages(dependency, `${cwd}/node_modules/${dependency.name}`);
    })
  );
}
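To see where files would end up before any optimization, the following sketch only computes the target directories instead of copying anything. The tree shape and the /project path are assumptions chosen for the example.

```javascript
// Walk the tree and list the directory each package would be copied to.
function listLinkTargets({ dependencies = [] }, cwd) {
  const targets = [];
  for (const dependency of dependencies) {
    const dest = `${cwd}/node_modules/${dependency.name}`;
    targets.push(dest);
    // Nested dependencies go into the node_modules of their parent.
    targets.push(...listLinkTargets(dependency, dest));
  }
  return targets;
}

// Hypothetical dependency tree: app -> a -> c, app -> b.
const tree = {
  name: 'app',
  dependencies: [
    { name: 'a', dependencies: [{ name: 'c', dependencies: [] }] },
    { name: 'b', dependencies: [] },
  ],
};

const targets = listLinkTargets(tree, '/project');
console.log(targets.join('\n'));
```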
Step 5 – Optimization
We can now download every dependency in the tree and place it in node_modules, but many dependencies turn out to be duplicated. We can hoist identical dependencies to a single location so that they do not need to be downloaded again.
function optimizePackageTree({ name, reference, dependencies = [] }) {
  dependencies = dependencies.map(dependency => {
    return optimizePackageTree(dependency);
  });
  for (let hardDependency of dependencies) {
    // Iterate over a copy, because entries are removed inside the loop
    for (let subDependency of hardDependency.dependencies.slice()) {
      // Does the parent level already have a dependency with the same name?
      let availableDependency = dependencies.find(dependency => {
        return dependency.name === subDependency.name;
      });
      if (!availableDependency) {
        // Hoist the sub-dependency when the parent level does not have it yet
        dependencies.push(subDependency);
      }
      if (
        !availableDependency ||
        availableDependency.reference === subDependency.reference
      ) {
        // Remove the now-redundant entry from the child
        hardDependency.dependencies.splice(
          hardDependency.dependencies.findIndex(dependency => {
            return dependency.name === subDependency.name;
          }),
          1
        );
      }
    }
  }
  return { name, reference, dependencies };
}
By recursing level by level and hoisting dependencies upward, we flatten the tree and reduce the number of duplicate installations. At this point we have implemented a simple yarn.
Yarn Architecture
The most striking thing about the yarn source code is how thoroughly it applies object-oriented design.
- Config: instance holding yarn's configuration
- cli: the set of all yarn commands
- registries: instances holding npm registry information
  - This covers the lockfile, how the entry file name of a dependency package is resolved, where dependency packages are stored and under which file names, and so on
- lockfile: the yarn.lock object
- integrity checker: checks whether dependency packages were downloaded correctly
- package resolver: resolves the different ways dependencies are referenced in package.json
- package request: instance that requests a dependency package version
- package reference: instance holding information about a dependency package
- package fetcher: instance that downloads dependency packages
- package linker: manages dependency package files
- package hoister: instance that flattens dependency packages
How yarn works
Process overview
Here we use yarn add lodash as an example to look at what yarn does internally. A yarn installation consists of the following five steps:
- checking: checks the configuration (.yarnrc, command-line arguments, package.json) and compatibility (CPU, Node.js version, operating system, and so on)
- resolveStep: resolves dependency package information, including the exact version of every package in the dependency tree
- fetchStep: downloads all dependency packages. A package that already exists in the cache is skipped; otherwise it is downloaded into the cache folder. When this step completes, every dependency package is in the cache
- linkStep: copies a flattened layout of the cached dependency packages into the project's dependency directory
- buildStep: compiles the binary packages that need compiling
The process in detail
Let's continue with yarn add lodash as the example.
Initialization
Finding the yarnrc files
// Read the configuration from the yarn rc files
// process.cwd(): the directory in which the command is run
// process.argv: the yarn command and its arguments
const rc = getRcConfigForCwd(process.cwd(), process.argv.slice(2));

/**
 * Generate every path where an rc file may exist
 * @param {*} name rc source name
 * @param {*} cwd current project path
 */
function getRcPaths(name: string, cwd: string): Array<string> {
  // ... other code
  if (!isWin) {
    // On non-Windows systems, look in /etc/yarn/config
    pushConfigPath(etc, name, 'config');
    // On non-Windows systems, look in /etc/yarnrc
    pushConfigPath(etc, `${name}rc`);
  }
  // A user home directory exists
  if (home) {
    // yarn's default configuration directory
    pushConfigPath(CONFIG_DIRECTORY);
    // <home>/.config/${name}/config
    pushConfigPath(home, '.config', name, 'config');
    // <home>/.config/${name}
    pushConfigPath(home, '.config', name);
    // <home>/.${name}/config
    pushConfigPath(home, `.${name}`, 'config');
    // <home>/.${name}rc
    pushConfigPath(home, `.${name}rc`);
  }
  // Walk up from the current project path and collect .${name}rc at every level
  // Tip: rc files written by the user have the highest priority
  while (true) {
    // Insert <current path>/.${name}rc at the front
    unshiftConfigPath(cwd, `.${name}rc`);
    // Get the parent of the current path
    const upperCwd = path.dirname(cwd);
    if (upperCwd === cwd) {
      // we've reached the root
      break;
    } else {
      cwd = upperCwd;
    }
  }
  // ... read rc code
}
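The upward walk at the end of getRcPaths can be tried in isolation. This is a minimal sketch that assumes POSIX-style paths and uses hypothetical helper names (parentDir, projectRcPaths); like the excerpt, it inserts each level at the front of the list.

```javascript
// Parent directory of a POSIX-style absolute path ("/" is its own parent).
function parentDir(dir) {
  const index = dir.lastIndexOf('/');
  return index <= 0 ? '/' : dir.slice(0, index);
}

// Collect the .<name>rc candidate of cwd and of every parent directory.
// Each level is inserted at the front, mirroring unshiftConfigPath above.
function projectRcPaths(name, cwd) {
  const paths = [];
  let current = cwd;
  while (true) {
    paths.unshift(current === '/' ? `/.${name}rc` : `${current}/.${name}rc`);
    const upper = parentDir(current);
    if (upper === current) break; // reached the root
    current = upper;
  }
  return paths;
}

const rcPaths = projectRcPaths('yarn', '/home/me/project');
console.log(rcPaths.join('\n'));
```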
Parsing the command entered by the user
/** Index of the -- separator */
const doubleDashIndex = process.argv.findIndex(element => element === '--');
/** The first two arguments are the node path and the yarn script path */
const startArgs = process.argv.slice(0, 2);
/**
 * The yarn subcommand and its arguments
 * If -- is present, take the part before it
 * Otherwise, take everything
 */
const args = process.argv.slice(2, doubleDashIndex === -1 ? process.argv.length : doubleDashIndex);
/** Arguments passed through after the yarn subcommand */
const endArgs = doubleDashIndex === -1 ? [] : process.argv.slice(doubleDashIndex);
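The same splitting logic can be wrapped in a function and run against a sample argument list. The function name splitArgv and the sample command are illustrative only.

```javascript
// Split an argv array the same way as the excerpt above.
function splitArgv(argv) {
  const doubleDashIndex = argv.indexOf('--');
  return {
    // node path and script path
    startArgs: argv.slice(0, 2),
    // subcommand and its arguments, up to (not including) "--"
    args: argv.slice(2, doubleDashIndex === -1 ? argv.length : doubleDashIndex),
    // everything from "--" onward, including the "--" itself
    endArgs: doubleDashIndex === -1 ? [] : argv.slice(doubleDashIndex),
  };
}

const parsed = splitArgv(['node', 'yarn.js', 'run', 'test', '--', '--watch']);
console.log(parsed);
```

Note that, as written, endArgs still begins with the -- separator itself.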
Initializing shared instances
During initialization, the config (configuration) and reporter (logging) instances are created.
- During init, config walks up the parent directories one level at a time and checks whether a package.json configures the workspaces field
  - Tip: if the current project belongs to a workspace, the yarn.lock in the workspace root directory takes precedence
this.workspaceRootFolder = await this.findWorkspaceRoot(this.cwd);
// Directory of yarn.lock: the workspace root takes priority over cwd
this.lockfileFolder = this.workspaceRootFolder || this.cwd;

/**
 * Find the workspace root directory
 */
async findWorkspaceRoot(initial: string): Promise<?string> {
  let previous = null;
  let current = path.normalize(initial);
  if (!await fs.exists(current)) {
    // Report an error when the directory does not exist
    throw new MessageError(this.reporter.lang('folderMissing', current));
  }
  // Walk up the parent directories, checking package.json / yarn.json at each level
  // If a level configures workspaces, return the path where that JSON file is located
  do {
    // Read package.json / yarn.json
    const manifest = await this.findManifest(current, true);
    // Extract the workspaces configuration
    const ws = extractWorkspaces(manifest);
    if (ws && ws.packages) {
      const relativePath = path.relative(current, initial);
      if (relativePath === '' || micromatch([relativePath], ws.packages).length > 0) {
        return current;
      } else {
        return null;
      }
    }
    previous = current;
    current = path.dirname(current);
  } while (current !== previous);
  return null;
}
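The lookup can be reproduced on a toy scale. The sketch below assumes a hypothetical in-memory map of directories to manifests, treats workspaces as a plain array, and replaces micromatch with a tiny matcher that only understands exact paths and a trailing "/*".

```javascript
// Hypothetical in-memory file system: directory -> parsed package.json.
const manifests = {
  '/repo': { name: 'root', workspaces: ['packages/*'] },
  '/repo/packages/web': { name: 'web' },
};

// Parent directory of a POSIX-style absolute path ("/" is its own parent).
function parentDir(dir) {
  const index = dir.lastIndexOf('/');
  return index <= 0 ? '/' : dir.slice(0, index);
}

// Minimal matcher: supports exact paths and a trailing "/*" only.
function matchesPattern(relativePath, pattern) {
  if (pattern.endsWith('/*')) {
    const prefix = pattern.slice(0, -1); // keep the trailing slash
    const rest = relativePath.slice(prefix.length);
    return relativePath.startsWith(prefix) && rest.length > 0 && !rest.includes('/');
  }
  return relativePath === pattern;
}

// Walk up from `initial` until a manifest with a workspaces field is found.
function findWorkspaceRoot(initial) {
  let previous = null;
  let current = initial;
  do {
    const manifest = manifests[current];
    const patterns = manifest && manifest.workspaces;
    if (patterns) {
      const relativePath =
        current === initial ? '' : initial.slice(current === '/' ? 1 : current.length + 1);
      const isMember =
        relativePath === '' || patterns.some(pattern => matchesPattern(relativePath, pattern));
      return isMember ? current : null;
    }
    previous = current;
    current = parentDir(current);
  } while (current !== previous);
  return null;
}

console.log(findWorkspaceRoot('/repo/packages/web')); // /repo
console.log(findWorkspaceRoot('/repo/tools/x')); // null
```

The second call returns null because the first workspaces declaration found on the way up does not list that directory, which matches the early return in the excerpt.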
Executing the add command
- Read the yarn.lock file from the yarn.lock location determined in the previous step.
- Run the corresponding scripts according to the lifecycle configured under scripts in package.json.
/**
 * Run the lifecycle scripts configured under scripts in package.json, in order
 */
export async function wrapLifecycle(config: Config, flags: Object, factory: () => Promise<void>): Promise<void> {
  // Run preinstall
  await config.executeLifecycleScript('preinstall');
  // Actually perform the installation
  await factory();
  // Run install
  await config.executeLifecycleScript('install');
  // Run postinstall
  await config.executeLifecycleScript('postinstall');
  if (!config.production) {
    // Non-production environment
    if (!config.disablePrepublish) {
      // Run prepublish
      await config.executeLifecycleScript('prepublish');
    }
    // Run prepare
    await config.executeLifecycleScript('prepare');
  }
}
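As a rough illustration of the ordering, the sketch below lists which stages would run for a given scripts section. It is synchronous and only returns names; the assumption that a stage is skipped when package.json defines no script for it is mine, and the '<install packages>' marker stands in for the factory() call.

```javascript
// Hypothetical scripts section of a package.json.
const scripts = {
  preinstall: 'echo before',
  postinstall: 'echo after',
  prepare: 'echo prepare',
};

// Return the stages that would run, in order, around the install itself.
function lifecycleOrder(scripts, { production = false, disablePrepublish = false } = {}) {
  const stages = ['preinstall', '<install packages>', 'install', 'postinstall'];
  if (!production) {
    if (!disablePrepublish) stages.push('prepublish');
    stages.push('prepare');
  }
  // Assumption: a lifecycle stage only runs when a script is defined for it.
  return stages.filter(stage => stage === '<install packages>' || stage in scripts);
}

console.log(lifecycleOrder(scripts).join(' -> '));
console.log(lifecycleOrder(scripts, { production: true }).join(' -> '));
```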
Obtaining project dependencies
- First, read the name and version range of every dependency listed under dependencies, devDependencies and optionalDependencies in the package.json of the current directory
  - If the current project is a workspace, read from the package.json in the project root directory
    - Because the project is a workspace, the dependencies in the package.json of every subproject in the workspace must be read as well
// Collect all dependencies in the current project directory
pushDeps('dependencies', projectManifestJson, {hint: null, optional: false}, true);
pushDeps('devDependencies', projectManifestJson, {hint: 'dev', optional: false}, !this.config.production);
pushDeps('optionalDependencies', projectManifestJson, {hint: 'optional', optional: true}, true);
// The current project is a workspace
if (this.config.workspaceRootFolder) {
  // Collect the package.json of every subproject in the workspace
  const workspaces = await this.config.resolveWorkspaces(workspacesRoot, workspaceManifestJson);
  for (const workspaceName of Object.keys(workspaces)) {
    // package.json of the subproject
    const workspaceManifest = workspaces[workspaceName].manifest;
    // Register the subproject as a dependency of the root project
    workspaceDependencies[workspaceName] = workspaceManifest.version;
    // Collect the subproject's own dependencies
    if (this.flags.includeWorkspaceDeps) {
      pushDeps('dependencies', workspaceManifest, {hint: null, optional: false}, true);
      pushDeps('devDependencies', workspaceManifest, {hint: 'dev', optional: false}, !this.config.production);
      pushDeps('optionalDependencies', workspaceManifest, {hint: 'optional', optional: true}, true);
    }
  }
}
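A simplified stand-in for pushDeps shows what the collected data might look like. The collectPatterns wrapper, the shape of the returned objects and the sample manifest are assumptions for illustration, not yarn's actual data structures.

```javascript
// Collect "name@range" patterns from a manifest, skipping devDependencies in production.
function collectPatterns(manifest, { production = false } = {}) {
  const patterns = [];
  function pushDeps(depType, { hint, optional }, isUsed) {
    if (!isUsed) return;
    const deps = manifest[depType] || {};
    for (const [name, range] of Object.entries(deps)) {
      patterns.push({ pattern: `${name}@${range}`, hint, optional });
    }
  }
  pushDeps('dependencies', { hint: null, optional: false }, true);
  pushDeps('devDependencies', { hint: 'dev', optional: false }, !production);
  pushDeps('optionalDependencies', { hint: 'optional', optional: true }, true);
  return patterns;
}

// Hypothetical package.json contents.
const manifest = {
  dependencies: { lodash: '^4.17.20' },
  devDependencies: { jest: '^26.0.0' },
};

console.log(collectPatterns(manifest));
console.log(collectPatterns(manifest, { production: true }));
```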
ResolveStep: resolving dependency packages
- Iterate over the first-level dependencies and call the find method of package resolver to get the version information of each dependency package, then call find recursively for every entry in that package's dependencies. While resolving, a Set (fetchingPatterns) records the packages that have been resolved or are being resolved.
- When resolving a package, first use its name and range (version range) to decide whether it has already been resolved, by checking whether it exists in the Set maintained above.
- For a package that has not been resolved, first try to get the exact version from the lockfile. If the lockfile has an entry for the package, the package is marked as resolved once the entry is read. If the lockfile has no entry, a request is sent to the registry for the highest known version that satisfies the range, and the package is marked as resolved once the information arrives.
- A package that has already been resolved is put on a delayed queue, delayedResolveQueue, and left alone for now.
- Once every package in the dependency tree has been visited recursively, iterate over delayedResolveQueue and pick the most suitable version from the package information already resolved.
After this step, the exact version of every package in the dependency tree is known, along with details such as the package's download address.
- Get the latest version number for every first-level dependency of the project (by calling the init method of package resolver)
/**
 * Find the version of a dependency package
 */
async find(initialReq: DependencyRequestPattern): Promise<void> {
  // Read from the cache first
  const req = this.resolveToResolution(initialReq);
  if (!req) {
    return;
  }
  // Request instance for the dependency package
  const request = new PackageRequest(req, this);
  const fetchKey = `${req.registry}:${req.pattern}:${String(req.optional)}`;
  // Has the same dependency package been requested before?
  const initialFetch = !this.fetchingPatterns.has(fetchKey);
  // Flag: whether the yarn.lock entry needs to be refreshed
  let fresh = false;
  if (initialFetch) {
    // Record the pattern on the first request
    this.fetchingPatterns.add(fetchKey);
    // Look up the dependency name + version in the lockfile
    const lockfileEntry = this.lockfile.getLocked(req.pattern);
    if (lockfileEntry) {
      // The lockfile has an entry for this pattern
      // Extract the requested version range
      // e.g. concat-stream@^1.5.0 => {name: 'concat-stream', range: '^1.5.0', hasVersion: true}
      const {range, hasVersion} = normalizePattern(req.pattern);
      if (this.isLockfileEntryOutdated(lockfileEntry.version, range, hasVersion)) {
        // The version in yarn.lock is out of date
        this.reporter.warn(this.reporter.lang('incorrectLockfileEntry', req.pattern));
        // Remove the dependency version that has already been collected
        this.removePattern(req.pattern);
        // Remove the package version from yarn.lock (it is obsolete and invalid)
        this.lockfile.removePattern(req.pattern);
        fresh = true;
      }
    } else {
      fresh = true;
    }
    request.init();
  }
  await request.find({fresh, frozen: this.frozen});
}
- Recursively resolve the dependencies of each requested dependency package
for (const depName in info.dependencies) {
  const depPattern = depName + '@' + info.dependencies[depName];
  deps.push(depPattern);
  promises.push(
    this.resolver.find(/* ... */),
  );
}
for (const depName in info.optionalDependencies) {
  const depPattern = depName + '@' + info.optionalDependencies[depName];
  deps.push(depPattern);
  promises.push(
    this.resolver.find(/* ... */),
  );
}
if (remote.type === 'workspace' && !this.config.production) {
  // workspaces support dev dependencies
  for (const depName in info.devDependencies) {
    const depPattern = depName + '@' + info.devDependencies[depName];
    deps.push(depPattern);
    promises.push(
      this.resolver.find(/* ... */),
    );
  }
}
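The lockfile-first strategy described in this section can be reduced to a few lines. The sketch below follows the prose description rather than yarn's real classes; the lockfile and registry objects, their contents and the createResolver name are all hypothetical.

```javascript
// Hypothetical data: pinned lockfile entries and what a registry would answer.
const lockfile = { 'lodash@^4.17.0': '4.17.15' };
const registryLatest = { 'lodash@^4.17.0': '4.17.20', 'react@^15.6.0': '15.7.0' };

function createResolver() {
  const fetchingPatterns = new Set(); // patterns resolved or being resolved
  const resolved = new Map(); // pattern -> { version, source }
  const delayedResolveQueue = []; // patterns that were requested again later

  function find(pattern) {
    if (fetchingPatterns.has(pattern)) {
      // Already handled: park it and deal with it after the whole tree is visited.
      delayedResolveQueue.push(pattern);
      return;
    }
    fetchingPatterns.add(pattern);
    if (pattern in lockfile) {
      // The lockfile pins an exact version, so no registry request is needed.
      resolved.set(pattern, { version: lockfile[pattern], source: 'lockfile' });
    } else if (pattern in registryLatest) {
      // Otherwise ask the registry for the highest version in range.
      resolved.set(pattern, { version: registryLatest[pattern], source: 'registry' });
    } else {
      throw new Error(`Couldn't resolve "${pattern}"`);
    }
  }

  return { find, resolved, delayedResolveQueue };
}

const resolver = createResolver();
resolver.find('lodash@^4.17.0');
resolver.find('react@^15.6.0');
resolver.find('lodash@^4.17.0');
console.log([...resolver.resolved.entries()], resolver.delayedResolveQueue);
```

Here lodash stays at the version pinned in the lockfile even though the registry knows a newer one, which is the behavior a lockfile exists to guarantee.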
FetchStep: downloading dependency packages
This step mainly downloads the dependencies that are not yet in the cache.
- Dependencies already in the cache do not need to be downloaded again, so the first step filters them out. The filter builds a path from cacheFolder + slug + node_modules + pkg.name and checks whether that path exists. If it does, the package is cached and is filtered out instead of being downloaded again.
- Maintain a queue of fetch tasks and fetch each dependency from the download address resolved in resolveStep.
- When a package is downloaded, its cache directory is first created inside the cache folder, and then the package's reference address is resolved.
- A reference can take several forms, for example an npm registry, a GitHub source, a GitLab source or a file path, so yarn calls the fetcher that matches the reference address to obtain the dependency package.
- The fetched package is streamed through fs.createWriteStream into the cache directory. What gets cached is the .tgz archive, which is then extracted into the current directory.
- After the download has been extracted, update the lockfile.
/**
 * Build the cache path of a dependency
 * cache folder + npm source-package name-version-integrity + node_modules + package name
 */
const dest = config.generateModuleCachePath(ref);

export async function fetchOneRemote(remote: PackageRemote, name: string, version: string, dest: string, config: Config): Promise<FetchedMetadata> {
  if (remote.type === 'link') {
    const mockPkg: Manifest = {_uid: '', name: '', version: '0.0.0'};
    return Promise.resolve({resolved: null, hash: '', dest, package: mockPkg, cached: false});
  }
  const Fetcher = fetchers[remote.type];
  if (!Fetcher) {
    throw new MessageError(config.reporter.lang('unknownFetcherFor', remote.type));
  }
  const fetcher = new Fetcher(dest, remote, config);
  // Check whether a valid cached copy exists at the given path
  if (await config.isValidModuleDest(dest)) {
    return fetchCache(dest, fetcher, config, remote);
  }
  // Delete whatever is at the destination path
  await fs.unlink(dest);
  try {
    return await fetcher.fetch({
      name,
      version,
    });
  } catch (err) {
    try {
      await fs.unlink(dest);
    } catch (err2) {
      // what do?
    }
    throw err;
  }
}
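The cache filter from the first bullet of this section can be sketched as follows. The cachePathFor helper, the cache folder and the integrity strings are placeholders; the path layout follows the comment in the excerpt, and scoped package names are not handled.

```javascript
// Build the cache path described above:
// <cacheFolder>/npm-<name>-<version>-<integrity>/node_modules/<name>
function cachePathFor(cacheFolder, { name, version, integrity }) {
  const slug = `npm-${name}-${version}-${integrity}`;
  return `${cacheFolder}/${slug}/node_modules/${name}`;
}

// Keep only the packages whose cache path does not exist yet.
// `existingPaths` stands in for a real file-system existence check.
function packagesToFetch(packages, cacheFolder, existingPaths) {
  return packages.filter(pkg => !existingPaths.has(cachePathFor(cacheFolder, pkg)));
}

// Hypothetical packages and cache contents.
const packages = [
  { name: 'lodash', version: '4.17.20', integrity: 'abc123' },
  { name: 'react', version: '15.7.0', integrity: 'def456' },
];
const cacheFolder = '/cache/v6';
const existingPaths = new Set([cachePathFor(cacheFolder, packages[0])]);

const toFetch = packagesToFetch(packages, cacheFolder, existingPaths);
console.log(toFetch.map(pkg => pkg.name)); // [ 'react' ]
```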
LinkStep: moving files
After fetchStep, every dependency is in the local cache. The next step is to copy them into the node_modules of our project.
- Before copying, the peerDependencies of each package are resolved. If no matching peerDependencies are found, a warning is shown.
- The dependency tree is then flattened to generate dest, the target directory each package is copied to.
- The flattened dest entries are sorted (with localeCompare, the locale-aware comparison).
- Based on the dest (the target directory to copy to) and src (the cache directory of the corresponding package) of each entry in the flat tree, copy tasks are run that copy each package from src to dest.
yarn's flattening is actually quite simple and blunt: it first sorts the packages by the Unicode order of their names and then flattens the dependency tree level by level.
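The "sort by name, then flatten level by level" idea can be sketched like this. It is a simplified illustration, not yarn's actual hoister: the flatten function, the tree and its versions are assumptions, and a package is nested under its parent only when the root slot is already taken by a different version.

```javascript
// Flatten a dependency tree level by level, sorting each level by package name.
// Returns a Map from destination directory to the version placed there.
function flatten(root) {
  const placed = new Map();
  let level = root.dependencies.map(dep => ({ dep, parentDest: '' }));
  while (level.length > 0) {
    level.sort((a, b) => a.dep.name.localeCompare(b.dep.name));
    const next = [];
    for (const { dep, parentDest } of level) {
      const hoisted = `node_modules/${dep.name}`;
      let dest = hoisted;
      if (placed.has(hoisted) && placed.get(hoisted) !== dep.reference) {
        // The root slot holds another version, so nest under the parent instead.
        dest = `${parentDest}/node_modules/${dep.name}`;
      }
      placed.set(dest, dep.reference);
      for (const child of dep.dependencies || []) {
        next.push({ dep: child, parentDest: dest });
      }
    }
    level = next;
  }
  return placed;
}

// Hypothetical tree: a needs c@1.0.0, b needs c@2.0.0.
const tree = {
  name: 'app',
  dependencies: [
    { name: 'b', reference: '1.0.0', dependencies: [{ name: 'c', reference: '2.0.0' }] },
    { name: 'a', reference: '1.0.0', dependencies: [{ name: 'c', reference: '1.0.0' }] },
  ],
};

const layout = flatten(tree);
for (const [dest, version] of layout) console.log(`${dest} -> ${version}`);
```

Because a sorts before b, the c required by a wins the root slot, and the c required by b stays nested. Which version ends up hoisted therefore depends on names rather than on which version is newer.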
Q&A
1. How do I increase the number of concurrent network requests?
Use the --network-concurrency option to raise the number of concurrent network requests.
2. What if network requests keep timing out?
Set the timeout of network requests with --network-timeout.
3. Why does changing the version number of a dependency in yarn.lock still not take effect?
"@babel/code-frame@^7.0.0-beta.35":
  version "7.0.0-beta.55"
  resolved "https://registry.yarnpkg.com/@babel/code-frame/-/code-frame-7.0.0-beta.55.tgz#71f530e7b010af5eb7a7df7752f78921dd57e9ee"
  integrity sha1-cfUw57AQr163p993UveJId1X6e4=
  dependencies:
    "@babel/highlight" "7.0.0-beta.55"
We picked a fragment of yarn.lock at random. Changing only the version and resolved fields is not enough, because yarn also compares the integrity computed from the content it actually downloads with the integrity field in yarn.lock. If the two do not match, the downloaded package is considered to be the wrong one.
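In the excerpt above, the fragment after # in resolved and the integrity value appear to be the same SHA-1 digest, written once in hex and once in base64. The sketch below checks that relationship and shows what a comparison could look like; the function names are illustrative, and a real check would hash the downloaded tarball rather than reuse the hash from the URL.

```javascript
// Convert a hex SHA-1 digest into the "sha1-<base64>" form of the integrity field.
function sha1HexToIntegrity(hex) {
  return `sha1-${Buffer.from(hex, 'hex').toString('base64')}`;
}

// Compare a lockfile entry with the digest of what was actually downloaded.
function matchesLockfile(entry, downloadedSha1Hex) {
  return entry.integrity === sha1HexToIntegrity(downloadedSha1Hex);
}

// Values taken from the yarn.lock fragment above.
const entry = {
  resolved:
    'https://registry.yarnpkg.com/@babel/code-frame/-/code-frame-7.0.0-beta.55.tgz#71f530e7b010af5eb7a7df7752f78921dd57e9ee',
  integrity: 'sha1-cfUw57AQr163p993UveJId1X6e4=',
};

const hashFromResolved = entry.resolved.split('#')[1];
console.log(sha1HexToIntegrity(hashFromResolved)); // sha1-cfUw57AQr163p993UveJId1X6e4=
console.log(matchesLockfile(entry, hashFromResolved)); // true
```

Editing version and resolved by hand leaves integrity describing the old tarball, so a comparison like this fails for the newly downloaded one.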
4. When different versions of the same dependency appear in a project's dependency tree, how do I know which one I am actually using?
First we need to look at how the dependencies are referenced. The setup:
- package.json depends on [email protected], [email protected] and [email protected]
- [email protected] depends on [email protected]
- [email protected] depends on [email protected]
- [email protected] depends on C@2.0.0
Given these dependencies and the way yarn installs packages, the actual installed structure is as follows:
|- [email protected]
|- [email protected]
|- [email protected]
|----- [email protected]
|- [email protected]
|- [email protected]
- When the developer's code references D directly, what is actually used is [email protected].
- B does not declare a dependency on C, but its code uses objects and methods from C directly (B references D directly, and D is certain to reference C, so C must exist). What B actually gets is not [email protected] but [email protected].
  - This is because webpack resolves a dependency by looking for a matching package in node_modules, so it references [email protected] directly.
We can use yarn list to check whether there is a problem.
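The lookup behind this behavior can be modeled in a few lines: starting from the directory of the requiring package, check its node_modules, then the node_modules of each parent directory, and take the first match. The install layout and versions below are hypothetical and chosen only to show the effect.

```javascript
// Hypothetical install layout: path of each installed package -> version.
const installed = {
  '/app/node_modules/B': '1.0.0',
  '/app/node_modules/C': '1.0.0',
  '/app/node_modules/D': '2.0.0',
  '/app/node_modules/D/node_modules/C': '2.0.0',
};

// Resolve `name` as seen from `fromDir`: look in fromDir/node_modules first,
// then walk up one directory at a time and return the first hit.
function resolveFrom(fromDir, name) {
  let dir = fromDir;
  while (true) {
    const candidate = `${dir}/node_modules/${name}`;
    if (candidate in installed) {
      return { path: candidate, version: installed[candidate] };
    }
    const index = dir.lastIndexOf('/');
    if (index <= 0) return null; // reached the top without a match
    dir = dir.slice(0, index);
  }
}

// D finds its own nested copy of C, while B falls through to the hoisted one.
console.log(resolveFrom('/app/node_modules/D', 'C')); // version 2.0.0
console.log(resolveFrom('/app/node_modules/B', 'C')); // version 1.0.0
```

An undeclared dependency such as the C used by B therefore resolves to whatever happens to be hoisted, which may not be the version B was written against.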