About the yarn
Yarn, like npm, is a JavaScript package manager. There are also cnpm, pnpm, and other package managers, but an engineer only needs one of them.
Why YARN? How is it different from other tools?
Tip: "npm" in this comparison refers to npm version 2.
Differences from npm
- yarn downloads and installs dependency packages in parallel, while npm runs them serially, so a speed gap opens up right away.
- yarn caches downloaded dependency packages locally and reads from the cache first; a remote request is made only when the package is not in the local cache. npm, in contrast, requests everything from the network, which widens the speed gap.
- yarn flattens all dependencies to the same level, which reduces repeated downloads of the same dependency, speeds up installation, and shrinks node_modules. npm, in contrast, downloads strictly according to the dependency tree and places each package at its position in that tree, so the same package is downloaded multiple times and node_modules grows large.
Differences from cnpm
- cnpm has a faster mirror inside China (other tools can also change the registry address).
- cnpm collects all packages downloaded for a project into its own cache folder and places them into the project's node_modules through symlinks.
Differences from pnpm
- Like yarn, pnpm has a single directory that manages dependency packages.
- pnpm keeps the original dependency tree structure of npm2, but every dependency package in node_modules is stored as a symlink.
Understanding Yarn by building a simple Yarn
Step 1 – Download
JavaScript package management tools use package.json as an entry point to specify dependencies for a project.
{
  "dependencies": {
    "lodash": "4.17.20"
  }
}
Given a package.json like this one, we can download each dependency directly from the name and version it lists.
import fetch from 'node-fetch';

async function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  // Download every dependency and resolve to an array of tarball buffers
  return Promise.all(
    entries.map(async ([key, version]) => {
      const url = `https://registry.yarnpkg.com/${key}/-/${key}-${version}.tgz`;
      const response = await fetch(url);
      if (!response.ok) {
        throw new Error(`Couldn't fetch package "${key}"`);
      }
      return await response.buffer();
    })
  );
}
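The URL template above works for plain names such as lodash. As a small sketch, the helper below (tarballUrl is a hypothetical name, not part of yarn) also handles scoped names, assuming the layout seen in the resolved field of the yarn.lock entry quoted in the Q&A section, where the file name uses only the part after the scope.

```javascript
// Hypothetical helper: build the tarball URL for one pinned dependency.
function tarballUrl(name, version, registry = 'https://registry.yarnpkg.com') {
  // Scoped names look like "@scope/pkg"; the tarball file name uses only "pkg".
  const baseName = name.startsWith('@') ? name.split('/')[1] : name;
  return `${registry}/${name}/-/${baseName}-${version}.tgz`;
}

console.log(tarballUrl('lodash', '4.17.20'));
console.log(tarballUrl('@babel/code-frame', '7.0.0-beta.55'));
```

Keeping the URL construction in one place also makes it easy to swap the registry address, for example for a mirror.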
Now let’s look at another case:
{
  "dependencies": {
    "lodash": "4.17.20",
    "customer-package": "../../customer-package"
  }
}
The entry "customer-package": "../../customer-package" is a file path, which our code cannot handle yet, so we need to adapt the code:
import fetch from 'node-fetch';
import fs from 'fs-extra';

async function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([key, version]) => {
      // A file path is read straight from disk
      if (['/', './', '../'].some(prefix => version.startsWith(prefix))) {
        return await fs.readFile(version);
      }
      // Anything else is requested from the remote registry
      // ... old code
    })
  );
}
Step 2 – Flexible matching rules
So far, our code can download dependencies given as a fixed version or as a file path. A range such as "react": "^15.6.0" is not supported yet. This expression matches every version from 15.6.0 up to, but not including, 16.0.0. In principle we should install the latest version within that range, so we add a new method:
import fetch from 'node-fetch';
import semver from 'semver';

async function getPinnedReference(name, version) {
  let reference = version;
  // Only resolve when the version is a valid range but not an exact version
  if (semver.validRange(version) && !semver.valid(version)) {
    // Get all published versions of the package
    const response = await fetch(`https://registry.yarnpkg.com/${name}`);
    const info = await response.json();
    const versions = Object.keys(info.versions);
    // Pick the highest version that satisfies the range
    const maxSatisfying = semver.maxSatisfying(versions, version);
    if (maxSatisfying === null)
      throw new Error(
        `Couldn't find a version matching "${version}" for package "${name}"`
      );
    reference = maxSatisfying;
  }
  return { name, reference };
}
async function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([name, version]) => {
      // A file path is read straight from disk
      // ... old code
      let realVersion = version;
      // If the version starts with ~ or ^, resolve it to the latest matching version
      if (version.startsWith('~') || version.startsWith('^')) {
        const { reference } = await getPinnedReference(name, version);
        realVersion = reference;
      }
      // Anything else is requested from the remote registry
      // ... old code
    })
  );
}
With this, users can specify a version range for a package, and the latest version within that range gets installed.
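To make the range matching less of a black box, here is a minimal sketch of what picking the highest version for a caret range involves. It is not how the semver package is implemented: it assumes plain x.y.z versions with a major version of at least 1, and it ignores prerelease tags and ~ ranges. All function names are made up for this illustration.

```javascript
// Parse "1.2.3" into [1, 2, 3]; prerelease tags are not handled in this sketch.
function parse(version) {
  return version.split('.').map(Number);
}

// Compare two parsed versions: negative, zero, or positive, as Array.prototype.sort expects.
function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// True if `version` satisfies `^base` (for a base with major >= 1):
// same major version, and not lower than the base.
function satisfiesCaret(version, base) {
  const v = parse(version);
  const b = parse(base);
  return v[0] === b[0] && compare(v, b) >= 0;
}

// Highest version in `versions` that satisfies `^base`, or null if none does.
function maxSatisfyingCaret(versions, base) {
  const matches = versions.filter(v => satisfiesCaret(v, base));
  if (matches.length === 0) return null;
  return matches.sort((x, y) => compare(parse(x), parse(y))).pop();
}

const available = ['15.5.4', '15.6.0', '15.6.2', '15.7.0', '16.0.0'];
console.log(maxSatisfyingCaret(available, '15.6.0')); // 15.7.0
```

Note that 16.0.0 is rejected because a caret range never crosses a major version, while 15.7.0 is accepted.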
Step 3 – Dependencies of dependencies
Reality is not that simple: our dependencies have dependencies of their own, so we need to recurse through every layer to download all of them.
// Get the dependencies of a package
async function getPackageDependencies(packageJson) {
  const packageBuffer = await fetchPackage(packageJson);
  // Read the package.json inside the downloaded archive
  const manifest = await readPackageJsonFromArchive(packageBuffer);
  const dependencies = manifest.dependencies || {};
  return Object.keys(dependencies).map(name => {
    return { name, version: dependencies[name] };
  });
}
Now we can get all the dependency packages in the entire dependency tree from the user project’s package.json.
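The recursion itself can be sketched without any network access. The example below replaces the registry with a hypothetical in-memory object (registry) and builds the whole tree; the seen set is an added guard against dependency cycles, which the article's code does not address.

```javascript
// Hypothetical in-memory "registry": package name -> its declared dependencies.
const registry = {
  app: { a: '1.0.0', b: '1.0.0' },
  a: { c: '1.0.0' },
  b: { c: '1.0.0' },
  c: {},
  x: { y: '1.0.0' },
  y: { x: '1.0.0' }, // x and y depend on each other
};

// Build { name, reference, dependencies: [...] } recursively.
function getDependencyTree(name, reference, seen = new Set()) {
  const key = `${name}@${reference}`;
  // Stop when this package is already an ancestor of itself (a cycle).
  if (seen.has(key)) return { name, reference, dependencies: [] };
  const nextSeen = new Set(seen).add(key);
  const deps = registry[name] || {};
  return {
    name,
    reference,
    dependencies: Object.entries(deps).map(([depName, depRef]) =>
      getDependencyTree(depName, depRef, nextSeen)
    ),
  };
}

console.log(JSON.stringify(getDependencyTree('app', '1.0.0'), null, 2));
```

In the resulting tree, c appears twice, once under a and once under b, which is exactly the redundancy that Step 5 deals with.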
Step 4 – Transfer files
Downloading dependencies is not enough. We also need to move the files into the target directory, the familiar node_modules.
async function linkPackages({ name, reference, dependencies }, cwd) {
  // Get the entire dependency tree
  const dependencyTree = await getPackageDependencyTree({
    name,
    reference,
    dependencies,
  });
  await Promise.all(
    dependencyTree.map(async dependency => {
      await linkPackages(dependency, `${cwd}/node_modules/${dependency.name}`);
    })
  );
}
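To see where each package ends up, the following sketch walks an already-built tree and only computes the destination directories, without touching the file system. collectLinkTargets and the sample tree are hypothetical and exist only for this illustration.

```javascript
// Walk the tree and record the node_modules directory each package would be placed in.
function collectLinkTargets({ dependencies = [] }, cwd, targets = []) {
  for (const dependency of dependencies) {
    const dest = `${cwd}/node_modules/${dependency.name}`;
    targets.push({ name: dependency.name, reference: dependency.reference, dest });
    // Nested dependencies go into the node_modules of their parent package.
    collectLinkTargets(dependency, dest, targets);
  }
  return targets;
}

const sampleTree = {
  name: 'app',
  reference: '1.0.0',
  dependencies: [
    {
      name: 'a',
      reference: '1.0.0',
      dependencies: [{ name: 'c', reference: '1.0.0', dependencies: [] }],
    },
  ],
};

for (const target of collectLinkTargets(sampleTree, '/project')) {
  console.log(`${target.name}@${target.reference} -> ${target.dest}`);
}
```

A real linker would then copy or unpack each package into its dest; separating the path computation from the copying keeps the recursion easy to check.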
Step 5 – Optimize
We can now download every package in the dependency tree and place it in node_modules, but packages often share dependencies, so many of the downloaded packages are redundant. We can put identical dependencies in one location so that they do not have to be downloaded again.
function optimizePackageTree({ name, reference, dependencies = [] }) {
  // Optimize the subtrees first
  dependencies = dependencies.map(dependency => {
    return optimizePackageTree(dependency);
  });
  for (let hardDependency of dependencies) {
    // Iterate over a copy, because the list is modified inside the loop
    for (let subDependency of hardDependency.dependencies.slice()) {
      // Does the parent already have a dependency with the same name?
      let availableDependency = dependencies.find(dependency => {
        return dependency.name === subDependency.name;
      });
      if (!availableDependency) {
        // The parent does not have it yet, so hoist it into the parent
        dependencies.push(subDependency);
      }
      if (
        !availableDependency ||
        availableDependency.reference === subDependency.reference
      ) {
        // Remove the now redundant package from the child dependency
        hardDependency.dependencies.splice(
          hardDependency.dependencies.findIndex(dependency => {
            return dependency.name === subDependency.name;
          }),
          1
        );
      }
    }
  }
  return { name, reference, dependencies };
}
By recursively hoisting dependencies up one layer at a time, we reduce repeated installations of the same package.
At this point we have implemented a simple Yarn.
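As a rough way to see what hoisting buys, the sketch below counts the packages in a nested tree and in a hand-written flattened version of the same tree. Both trees and the countPackages helper are hypothetical examples, not output of the code above.

```javascript
// Count every package below the root of a { name, reference, dependencies } tree.
function countPackages({ dependencies = [] }) {
  return dependencies.reduce((total, dep) => total + 1 + countPackages(dep), 0);
}

const leafC = { name: 'c', reference: '1.0.0', dependencies: [] };

// Before hoisting: both a and b carry their own copy of c.
const nestedTree = {
  name: 'app',
  reference: '1.0.0',
  dependencies: [
    { name: 'a', reference: '1.0.0', dependencies: [leafC] },
    { name: 'b', reference: '1.0.0', dependencies: [leafC] },
  ],
};

// After hoisting: c sits next to a and b and is installed once.
const hoistedTree = {
  name: 'app',
  reference: '1.0.0',
  dependencies: [
    { name: 'a', reference: '1.0.0', dependencies: [] },
    { name: 'b', reference: '1.0.0', dependencies: [] },
    leafC,
  ],
};

console.log(countPackages(nestedTree), countPackages(hoistedTree)); // 4 3
```

The saving grows with the number of packages that share a dependency at the same version.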
Yarn Architecture
After reading the source, my strongest impression is that Yarn applies object-oriented design thoroughly.
- config: the yarn configuration instance
- cli: the instances for the set of all yarn commands
- registries: instances holding npm registry information
  - covers the lock file, the manifest file name used to resolve dependency packages, the storage location and file name of dependency packages, and so on
- lockfile: the yarn.lock object
- integrity checker: checks whether dependency packages were downloaded correctly
- package resolver: resolves the different ways packages are referenced in package.json
- package request: instance for requesting a dependency package's version
- package reference: instance describing a dependency package relationship
- package fetcher: instance for downloading dependency packages
- package linker: manages dependency package files
- package hoister: instance for flattening dependencies
Yarn Workflow
Process overview
Taking yarn add lodash as an example, let's look at what yarn does internally. Yarn installs dependency packages in five steps:
- checking: checks configuration (.yarnrc, command-line arguments, package.json information, and so on) and whether the environment (CPU, Node.js version, operating system, and so on) is compatible with what package.json declares
- resolveStep: parses the dependencies in the project's package.json into a dependency tree and resolves the exact version of every package in the tree
- fetchStep: downloads all dependency packages. A package that already exists in the cache is skipped; otherwise it is downloaded into the cache folder
- linkStep: copies a flattened version of the cached dependencies into the project's dependency directory (the packages downloaded in the previous step live in the cache)
- buildStep: some binary packages need to be compiled in this step
Process in detail
Let's continue with the yarn add lodash example.
Initialization
Finding the yarnrc file
// Get the yarnrc file configuration
// process.cwd(): the project directory in which the command is executed
// process.argv: the yarn command and arguments specified by the user
const rc = getRcConfigForCwd(process.cwd(), process.argv.slice(2));

/**
 * Generate all possible paths for the rc file
 * @param {*} name rc source name
 * @param {*} cwd current project path
 */
function getRcPaths(name: string, cwd: string): Array<string> {
  // ... other code
  if (!isWin) {
    // In non-Windows environments, look in /etc/yarn/config
    pushConfigPath(etc, name, 'config');
    // In non-Windows environments, look in /etc/yarnrc
    pushConfigPath(etc, `${name}rc`);
  }
  // A user directory exists
  if (home) {
    // yarn's default config path
    pushConfigPath(CONFIG_DIRECTORY);
    // user directory/.config/${name}/config
    pushConfigPath(home, '.config', name, 'config');
    // user directory/.config/${name}
    pushConfigPath(home, '.config', name);
    // user directory/.${name}/config
    pushConfigPath(home, `.${name}`, 'config');
    // user directory/.${name}rc
    pushConfigPath(home, `.${name}rc`);
  }
  // Walk up from the current project directory, adding .${name}rc at every level
  // Tip: the rc file actively written by the user has the highest priority
  while (true) {
    // current directory/.${name}rc
    unshiftConfigPath(cwd, `.${name}rc`);
    // Get the parent path of the current project
    const upperCwd = path.dirname(cwd);
    if (upperCwd === cwd) {
      // we've reached the root
      break;
    } else {
      cwd = upperCwd;
    }
  }
  // ... read rc code
}
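The upward walk at the end of getRcPaths can be tried in isolation. The sketch below is a simplified stand-in, not yarn's code: it assumes POSIX-style absolute paths, uses a small hypothetical dirname helper instead of the path module, and only reproduces the order in which unshift leaves the candidate paths. How that order translates into precedence depends on the reading code, which is not shown in the excerpt.

```javascript
// POSIX-only stand-in for path.dirname, to keep the sketch dependency-free.
function dirname(p) {
  const idx = p.lastIndexOf('/');
  return idx <= 0 ? '/' : p.slice(0, idx);
}

// Collect `.${name}rc` candidates from cwd up to the file system root.
function projectRcPaths(name, cwd) {
  const paths = [];
  let current = cwd;
  while (true) {
    // unshift mirrors unshiftConfigPath: each new level goes to the front
    paths.unshift(`${current === '/' ? '' : current}/.${name}rc`);
    const upper = dirname(current);
    if (upper === current) break; // reached the root
    current = upper;
  }
  return paths;
}

console.log(projectRcPaths('yarn', '/home/me/app'));
```

For /home/me/app this lists /.yarnrc first and /home/me/app/.yarnrc last.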
Parsing the command entered by the user
/**
 * Index of "--" in the arguments
 */
const doubleDashIndex = process.argv.findIndex(element => element === '--');
/**
 * The first two arguments are the node path and the yarn file path
 */
const startArgs = process.argv.slice(0, 2);
/**
 * yarn subcommand and its arguments
 * If "--" exists, take the part before it
 * If not, take everything
 */
const args = process.argv.slice(2, doubleDashIndex === -1 ? process.argv.length : doubleDashIndex);
/**
 * Arguments passed through to the yarn subcommand
 */
const endArgs = doubleDashIndex === -1 ? [] : process.argv.slice(doubleDashIndex);
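The same slicing can be wrapped in a function and run against a sample argument list. splitArgs and the sample command line are made up for this illustration; the slicing itself follows the excerpt above.

```javascript
// Split an argv-style array at the first "--".
function splitArgs(argv) {
  const doubleDashIndex = argv.findIndex(element => element === '--');
  const startArgs = argv.slice(0, 2);
  const args = argv.slice(2, doubleDashIndex === -1 ? argv.length : doubleDashIndex);
  const endArgs = doubleDashIndex === -1 ? [] : argv.slice(doubleDashIndex);
  return { startArgs, args, endArgs };
}

const sample = ['node', 'yarn.js', 'run', 'test', '--', '--watch'];
console.log(splitArgs(sample));
```

With this input, args is ['run', 'test'] and endArgs is ['--', '--watch']; note that endArgs still contains the "--" separator itself.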
Initializing shared instances
During initialization, the config instance (configuration) and the reporter instance (logging) are each initialized.
- During init, config walks up through the parent directories level by level, checking whether package.json configures the workspaces field.
  - Tip: if the current project is a workspace project, the yarn.lock in the workspace root directory takes precedence.
this.workspaceRootFolder = await this.findWorkspaceRoot(this.cwd);
// The yarn.lock directory is the workspace root when there is one
this.lockfileFolder = this.workspaceRootFolder || this.cwd;

/**
 * Find the workspace root
 */
async findWorkspaceRoot(initial: string): Promise<?string> {
  let previous = null;
  let current = path.normalize(initial);
  if (!await fs.exists(current)) {
    // Throw an error if the path does not exist
    throw new MessageError(this.reporter.lang('folderMissing', current));
  }
  // Walk up through the parent directories, checking whether package.json / yarn.json configures workspaces
  // As soon as one level configures workspaces, return the directory that contains that JSON file
  do {
    // Read package.json / yarn.json
    const manifest = await this.findManifest(current, true);
    // Extract the workspaces configuration
    const ws = extractWorkspaces(manifest);
    if (ws && ws.packages) {
      const relativePath = path.relative(current, initial);
      if (relativePath === '' || micromatch([relativePath], ws.packages).length > 0) {
        return current;
      } else {
        return null;
      }
    }
    previous = current;
    current = path.dirname(current);
  } while (current !== previous);
  return null;
}
Executing the add command
- Using the yarn.lock location obtained in the previous step, read the yarn.lock file.
- Run the scripts defined in package.json in lifecycle order.
/**
 * Runs the scripts configured in package.json in lifecycle order
 */
export async function wrapLifecycle(config: Config, flags: Object, factory: () => Promise<void>): Promise<void> {
  // Run preinstall
  await config.executeLifecycleScript('preinstall');
  // Perform the actual installation
  await factory();
  // Run install
  await config.executeLifecycleScript('install');
  // Run postinstall
  await config.executeLifecycleScript('postinstall');
  if (!config.production) {
    // Non-production environment
    if (!config.disablePrepublish) {
      // Run prepublish
      await config.executeLifecycleScript('prepublish');
    }
    // Run prepare
    await config.executeLifecycleScript('prepare');
  }
}
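The order encoded in wrapLifecycle can be summarized as a small pure function. This is only a restatement of the excerpt for quick experimentation: lifecycleOrder is a hypothetical name, and '<installation>' is a placeholder for the factory() call.

```javascript
// Return the sequence of lifecycle stages for a given configuration.
function lifecycleOrder({ production = false, disablePrepublish = false } = {}) {
  const order = ['preinstall', '<installation>', 'install', 'postinstall'];
  if (!production) {
    // prepublish and prepare only run outside production mode
    if (!disablePrepublish) order.push('prepublish');
    order.push('prepare');
  }
  return order;
}

console.log(lifecycleOrder().join(' -> '));
console.log(lifecycleOrder({ production: true }).join(' -> '));
```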
Getting project dependencies
- First, collect the names and version ranges of all packages listed under dependencies, devDependencies, and optionalDependencies in the package.json of the current directory.
  - If the current project is a workspace project, read them from the package.json in the project root directory.
    - Because it is a workspace project, the dependencies in the package.json of every subproject in the workspace also need to be read.
- This logic corresponds to the fetchRequestFromCwd method.
// Get all dependencies in the current project directory
pushDeps('dependencies', projectManifestJson, {hint: null, optional: false}, true);
pushDeps('devDependencies', projectManifestJson, {hint: 'dev', optional: false}, !this.config.production);
pushDeps('optionalDependencies', projectManifestJson, {hint: 'optional', optional: true}, true);
// The current project is a workspace project
if (this.config.workspaceRootFolder) {
  // Collect the package.json of every subproject in the workspace
  const workspaces = await this.config.resolveWorkspaces(workspacesRoot, workspaceManifestJson);
  for (const workspaceName of Object.keys(workspaces)) {
    // package.json of the subproject
    const workspaceManifest = workspaces[workspaceName].manifest;
    // Add the subproject itself to the dependencies of the root project
    workspaceDependencies[workspaceName] = workspaceManifest.version;
    // Collect the dependencies of the subproject
    if (this.flags.includeWorkspaceDeps) {
      pushDeps('dependencies', workspaceManifest, {hint: null, optional: false}, true);
      pushDeps('devDependencies', workspaceManifest, {hint: 'dev', optional: false}, !this.config.production);
      pushDeps('optionalDependencies', workspaceManifest, {hint: 'optional', optional: true}, true);
    }
  }
}
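A stripped-down sketch of this collection step is shown below. It is not yarn's pushDeps: collectPatterns is a hypothetical function that takes one manifest object and returns "name@range" patterns, skipping devDependencies in production mode in the same way as the !this.config.production argument above.

```javascript
// Turn one manifest into a list of { pattern, hint, optional } records.
function collectPatterns(manifest, { production = false } = {}) {
  const patterns = [];
  const pushDeps = (depType, { hint, optional }, isUsed) => {
    if (!isUsed) return;
    for (const [name, range] of Object.entries(manifest[depType] || {})) {
      patterns.push({ pattern: `${name}@${range}`, hint, optional });
    }
  };
  pushDeps('dependencies', { hint: null, optional: false }, true);
  pushDeps('devDependencies', { hint: 'dev', optional: false }, !production);
  pushDeps('optionalDependencies', { hint: 'optional', optional: true }, true);
  return patterns;
}

const sampleManifest = {
  dependencies: { lodash: '^4.17.20' },
  devDependencies: { jest: '^26.0.0' },
};

console.log(collectPatterns(sampleManifest).map(p => p.pattern));
console.log(collectPatterns(sampleManifest, { production: true }).map(p => p.pattern));
```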
resolveStep: resolving dependency packages
The previous step collected the names and version ranges of all dependencies of the user's project. Now we determine the exact information for each of them, that is, which version should be downloaded.
- First, the find method of package resolver is called, which obtains the package information through package request. Once the information is available, find is called recursively for the packages listed in each package's dependencies and optionalDependencies. While resolving, a Set<string> named fetchingPatterns stores the packages that have been resolved or are being resolved, which reduces repeated requests.
  - When a package is resolved, its "package name + version range" is used to determine whether it has been resolved already (that is, whether the same string exists in fetchingPatterns).
  - For a package that has not been resolved, the first step is to try to get the exact version from the lockfile.
    - If the lockfile has an entry for the package, check whether the locked version is outdated. If it is, remove the package's entry from the lockfile.
    - If the lockfile has no entry for the package, send a request to the npm registry and take the highest known version that satisfies the requested range.
  - A package that has already been resolved is placed in the delay queue delayedResolveQueue and is not processed for now.
- When all packages in the dependency tree have been traversed recursively, delayedResolveQueue is traversed, and for each entry the highest matching version is picked from the package information that has already been resolved.
After that, we have determined the specific versions of all packages in the dependency tree, as well as details such as each package's address.
- Get the latest version number for all first-level project dependencies (by calling the find method of package resolver)
/**
 * Find the version of a dependency package
 */
async find(initialReq: DependencyRequestPattern): Promise<void> {
  // Read from the cache first
  const req = this.resolveToResolution(initialReq);
  if (!req) {
    return;
  }
  // Request instance for this dependency package
  const request = new PackageRequest(req, this);
  const fetchKey = `${req.registry}:${req.pattern}:${String(req.optional)}`;
  // Check whether the same dependency package is already being requested
  const initialFetch = !this.fetchingPatterns.has(fetchKey);
  // Flag: whether yarn.lock needs to be updated
  let fresh = false;
  if (initialFetch) {
    // Remember the pattern on the first request
    this.fetchingPatterns.add(fetchKey);
    // Look up the package name + version in the lockfile
    const lockfileEntry = this.lockfile.getLocked(req.pattern);
    if (lockfileEntry) {
      // The lockfile has an entry
      // Extract the requested version
      // e.g. concat-stream@^1.5.0 => {name: 'concat-stream', range: '^1.5.0', hasVersion: true}
      const {range, hasVersion} = normalizePattern(req.pattern);
      if (this.isLockfileEntryOutdated(lockfileEntry.version, range, hasVersion)) {
        // The version in yarn.lock is outdated
        this.reporter.warn(this.reporter.lang('incorrectLockfileEntry', req.pattern));
        // Delete the version already collected for this pattern
        this.removePattern(req.pattern);
        // Delete the package's version information from yarn.lock (it is outdated and invalid)
        this.lockfile.removePattern(req.pattern);
        fresh = true;
      }
    } else {
      fresh = true;
    }
    request.init();
  }
  await request.find({fresh, frozen: this.frozen});
}
- Recursively look up the dependencies of each requested package
for (const depName in info.dependencies) {
  const depPattern = depName + '@' + info.dependencies[depName];
  deps.push(depPattern);
  promises.push(
    this.resolver.find(......),
  );
}
for (const depName in info.optionalDependencies) {
  const depPattern = depName + '@' + info.optionalDependencies[depName];
  deps.push(depPattern);
  promises.push(
    this.resolver.find(......),
  );
}
if (remote.type === 'workspace' && !this.config.production) {
  // workspaces support dev dependencies
  for (const depName in info.devDependencies) {
    const depPattern = depName + '@' + info.devDependencies[depName];
    deps.push(depPattern);
    promises.push(
      this.resolver.find(......),
    );
  }
}
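The effect of fetchingPatterns and delayedResolveQueue can be illustrated with a synchronous toy resolver. Everything here is a simplification under stated assumptions: fakeRegistry is a hypothetical in-memory lookup keyed by pattern, the key omits the registry and optional parts of yarn's fetchKey, and the lockfile lookup is left out.

```javascript
// Hypothetical data: pattern -> dependencies of the package it resolves to.
const fakeRegistry = {
  'a@^1.0.0': { c: '^1.0.0' },
  'b@^1.0.0': { c: '^1.0.0' },
  'c@^1.0.0': {},
};

const fetchingPatterns = new Set(); // patterns resolved or being resolved
const delayedResolveQueue = [];     // patterns seen again, handled later
const requested = [];               // patterns that triggered a "request"

function find(pattern) {
  if (fetchingPatterns.has(pattern)) {
    // Already seen: do not request again, only remember it for later.
    delayedResolveQueue.push(pattern);
    return;
  }
  fetchingPatterns.add(pattern);
  requested.push(pattern);
  const deps = fakeRegistry[pattern] || {};
  for (const [name, range] of Object.entries(deps)) {
    find(`${name}@${range}`);
  }
}

['a@^1.0.0', 'b@^1.0.0'].forEach(pattern => find(pattern));
console.log(requested);           // a, c, b: c is requested only once
console.log(delayedResolveQueue); // the second occurrence of c
```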
fetchStep: downloading dependency packages
This step mainly downloads the dependency packages that are not in the cache yet.
- First, a Map<string, PackageReference> is created for de-duplication. The array of dependencies is iterated, and for each package its cache directory path dest is built (cache path + npm registry-package name-version-integrity + node_modules + package name). De-duplication is then done by dest.
- With the de-duplicated list of packages, yarn first checks whether the dest cache directory of each package exists.
  - If it exists, the files are read directly from the cache.
  - If it does not exist, the package is downloaded according to the way it is referenced.
    - A package reference can point to different kinds of sources, such as an npm registry, GitHub, GitLab, a file path, and so on, so yarn calls the fetcher that corresponds to the type of the reference address to obtain the package.
/**
 * Build the cache path of a dependency package:
 * cache path + npm registry-package name-version-integrity + node_modules + package name
 */
const dest = config.generateModuleCachePath(ref);

export async function fetchOneRemote(remote: PackageRemote, name: string, version: string, dest: string, config: Config): Promise<FetchedMetadata> {
  if (remote.type === 'link') {
    const mockPkg: Manifest = {_uid: '', name: '', version: '0.0.0'};
    return Promise.resolve({resolved: null, hash: '', dest, package: mockPkg, cached: false});
  }
  const Fetcher = fetchers[remote.type];
  if (!Fetcher) {
    throw new MessageError(config.reporter.lang('unknownFetcherFor', remote.type));
  }
  const fetcher = new Fetcher(dest, remote, config);
  // Check whether the package already exists at the given cache path
  if (await config.isValidModuleDest(dest)) {
    return fetchCache(dest, fetcher, config, remote);
  }
  // Delete whatever is at the corresponding path
  await fs.unlink(dest);
  try {
    return await fetcher.fetch({
      name,
      version,
    });
  } catch (err) {
    try {
      await fs.unlink(dest);
    } catch (err2) {
      // what do?
    }
    throw err;
  }
}
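The de-duplication by dest can be sketched with a plain Map. The path layout below only follows the description in the comment above and is not yarn's exact format; cacheDest, dedupeByDest, the cache folder, and the integrity placeholders are all hypothetical.

```javascript
// Build a cache directory path from the parts named in the description above.
function cacheDest(cacheFolder, { registry, name, version, integrity }) {
  return `${cacheFolder}/${registry}-${name}-${version}-${integrity}/node_modules/${name}`;
}

// Keep only the first reference for each distinct cache directory.
function dedupeByDest(cacheFolder, refs) {
  const byDest = new Map();
  for (const ref of refs) {
    const dest = cacheDest(cacheFolder, ref);
    if (!byDest.has(dest)) byDest.set(dest, ref);
  }
  return byDest;
}

const refs = [
  { registry: 'npm', name: 'lodash', version: '4.17.20', integrity: 'abc123' },
  { registry: 'npm', name: 'lodash', version: '4.17.20', integrity: 'abc123' },
  { registry: 'npm', name: 'lodash', version: '3.10.1', integrity: 'def456' },
];

const unique = dedupeByDest('/cache', refs);
console.log([...unique.keys()]);
```

Two references to the same version collapse into one cache directory, while a different version of the same package keeps its own.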
linkStep: moving files
After fetchStep, all dependencies are in our local cache. The next step is to copy them into node_modules in our project.
- First, peerDependencies are resolved. If no matching peerDependencies are found, a warning is shown.
- The dependency tree is then flattened to generate the target directories to copy to.
- The flattened targets are sorted.
- Based on dest (the destination directory to copy to) and src (the package's cache directory) in flatTree, copy tasks are executed that copy each package from src to dest.
Applying this to an example in which A, B, and C are the first-level dependencies of the project:
- A depends on D
- D has not been hoisted before, so D is hoisted
- The project already depends on C, so the C that A depends on cannot be hoisted, otherwise there would be a conflict
- D depends on B, but the project already depends on B, so that B cannot be hoisted
- B depends on F, and F has not been hoisted before, so F is hoisted
- Next, B depends on E
- E has not been hoisted before, so E is hoisted
Yarn's flattening is simple: dependencies are sorted by the Unicode order of their package names and then flattened layer by layer along the dependency tree.
Q&A
- How do I increase the number of concurrent network requests?
--network-concurrency
- What if network requests keep timing out?
--network-timeout
- Why can't I just change the version number of a dependency package in yarn.lock?
"@babel/code-frame@^7.0.0-beta.35":
  version "7.0.0-beta.55"
  resolved "https://registry.yarnpkg.com/@babel/code-frame/-/code-frame-7.0.0-beta.55.tgz#71f530e7b010af5eb7a7df7752f78921dd57e9ee"
  integrity sha1-cfUw57AQr163p993UveJId1X6e4=
  dependencies:
    "@babel/highlight" "7.0.0-beta.55"
Here is a randomly chosen entry from yarn.lock. Changing only the version and resolved fields is not enough, because yarn also compares the integrity field in yarn.lock with the integrity value computed from the downloaded content. If they differ, the downloaded package is treated as the wrong one.
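The entry above is internally consistent in a way that can be checked by hand: the fragment after # in resolved and the value after sha1- in integrity encode the same 20-byte digest, once as hex and once as base64. The sketch below verifies only this consistency inside one entry, assuming that this is what the two fields contain; it does not download or hash anything, and the function names are made up.

```javascript
// Extract the hex digest that follows "#" in a resolved URL, or null.
function digestFromResolved(resolved) {
  const hashIndex = resolved.indexOf('#');
  return hashIndex === -1 ? null : resolved.slice(hashIndex + 1);
}

// True if integrity is "sha1-<base64>" of the same bytes as the hex fragment.
function integrityMatchesResolved(entry) {
  const hexDigest = digestFromResolved(entry.resolved);
  if (hexDigest === null) return false;
  const dash = entry.integrity.indexOf('-');
  const algorithm = entry.integrity.slice(0, dash);
  const base64Digest = entry.integrity.slice(dash + 1);
  return (
    algorithm === 'sha1' &&
    Buffer.from(hexDigest, 'hex').toString('base64') === base64Digest
  );
}

const lockEntry = {
  version: '7.0.0-beta.55',
  resolved:
    'https://registry.yarnpkg.com/@babel/code-frame/-/code-frame-7.0.0-beta.55.tgz#71f530e7b010af5eb7a7df7752f78921dd57e9ee',
  integrity: 'sha1-cfUw57AQr163p993UveJId1X6e4=',
};

console.log(integrityMatchesResolved(lockEntry)); // true
```

Editing version or the URL by hand leaves the digest pointing at the old tarball, which is why the comparison against the downloaded content fails.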
- How do I know which version is actually used when different versions of a dependency package appear in a project?
First, let's look at how dependency packages are referenced.
Scenario:
- In package.json: [email protected]@[email protected]
Given these dependencies and the way yarn installs packages, the actual installed structure is:
|- [email protected]
|- [email protected]
|- [email protected]
|----- [email protected]
|- [email protected]
|- [email protected]
- When a developer references D directly in code, what is actually used is [email protected]
- B does not declare a dependency on C, but calls methods on C's objects directly (B references D, and D must reference C, so C is bound to exist). In this case, what is actually referenced is [email protected] rather than [email protected].
- Because webpack resolves dependencies by looking up matching packages in node_modules, what is referenced directly is [email protected]
We can run yarn list to check whether there is a problem.