About Yarn

Yarn and npm are both JavaScript package managers, and there are others as well, such as cnpm and pnpm. For an engineer, mastering one of them is enough.

Why Yarn? How is it different from the other tools?

Note: in the comparisons below, npm refers to npm 2.

Differences from npm

  • Yarn downloads and installs dependencies in parallel, while npm executes them one at a time, which opens up a speed gap
  • Yarn caches downloaded packages locally and reads from the cache first, making a remote request only when a package is not in the local cache. npm, in contrast, always requests everything, which widens the speed gap further
  • Yarn flattens all dependencies to the same level, which reduces repeated downloads of the same dependency, speeds up installation, and shrinks node_modules. npm, in contrast, downloads strictly according to the dependency tree and places each package at its position in the tree, so the same package is downloaded multiple times and node_modules grows large

Differences from cnpm

  • cnpm downloads faster from its mirror in China (other tools can also change the registry address)
  • cnpm collects all packages downloaded for a project into its own cache folder and places them in the project's node_modules through symbolic links

Differences from pnpm

  • Like Yarn, pnpm has a unified directory for managing dependencies
  • pnpm keeps the original dependency tree structure of npm 2, but every package in node_modules is stored as a symbolic link

Understanding Yarn by building a simple version of it

Step 1 – Download

JavaScript package management tools use package.json as an entry point to specify dependencies for a project.

{
    "dependencies": {
        "lodash": "4.17.20"
    }
}

With a package.json like this one, each dependency can be downloaded directly from the registry.

import fetch from 'node-fetch';

// Download the tarball of every dependency listed in package.json
async function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([key, version]) => {
      const url = `https://registry.yarnpkg.com/${key}/-/${key}-${version}.tgz`;
      const response = await fetch(url);

      if (!response.ok) {
        throw new Error(`Couldn't fetch package "${key}@${version}"`);
      }

      return Buffer.from(await response.arrayBuffer());
    })
  );
}

Now let’s look at another case:

{
    "dependencies": {
        "lodash": "4.17.20",
        "customer-package": "../../customer-package"
    }
}

The entry "customer-package": "../../customer-package" does not work with our code, because it is a file path rather than a version. So we need to extend the code:

import fetch from 'node-fetch';
import fs from 'fs-extra';

async function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);
  return Promise.all(
    entries.map(async ([key, version]) => {
      // A file path is read from disk directly
      if (['/', './', '../'].some(prefix => version.startsWith(prefix))) {
        return await fs.readFile(version);
      }

      // Anything else is requested from the remote registry
      // ... old code
    })
  );
}

Step 2 – Flexible matching rules

At this point our code can download dependencies that are given as an exact version or as a file path. A range such as "react": "^15.6.0" is not supported yet. This expression matches every version from 15.6.0 up to, but not including, 16.0.0. In principle we should install the latest version within that range, so we add a new function:

import fetch from 'node-fetch';
import semver from 'semver';

async function getPinnedReference(name, version) {
  let reference = version;

  // Only ranges need resolving; an exact version is already pinned
  if (semver.validRange(version) && !semver.valid(version)) {
    // Get all published versions of the package
    const response = await fetch(`https://registry.yarnpkg.com/${name}`);
    const info = await response.json();
    const versions = Object.keys(info.versions);
    // Pick the highest version that satisfies the range
    const maxSatisfying = semver.maxSatisfying(versions, version);

    if (maxSatisfying === null)
      throw new Error(
        `Couldn't find a version matching "${version}" for package "${name}"`
      );

    reference = maxSatisfying;
  }

  return { name, reference };
}

async function fetchPackage(packageJson) {
  const entries = Object.entries(packageJson.dependencies);

  return Promise.all(
    entries.map(async ([name, version]) => {
      // A file path is read from disk directly
      // ... old code

      let realVersion = version;
      // If the version starts with ~ or ^, resolve it to the latest matching version
      if (version.startsWith('~') || version.startsWith('^')) {
        const { reference } = await getPinnedReference(name, version);
        realVersion = reference;
      }

      // Anything else is requested from the remote registry
      // ... old code
    })
  );
}

With this, users can specify a version range for a package and get the latest version within that range.
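
To make the range semantics concrete, here is a minimal sketch of how caret and tilde ranges select a version. It is not the semver library and not Yarn's code: the helper names (parseVersion, compareVersions, upperBound) are made up for illustration, only full x.y.z versions are handled, and prerelease tags are ignored.

```javascript
// Parse "1.2.3" into [1, 2, 3]. Prerelease tags are not handled (assumption).
function parseVersion(version) {
  return version.split('.').map(Number);
}

// Compare two parsed versions the way Array.prototype.sort expects.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Exclusive upper bound of a caret or tilde range.
function upperBound(operator, [major, minor, patch]) {
  if (operator === '~') return [major, minor + 1, 0];
  // Caret: bump the left-most non-zero component.
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

// Return the highest version in `versions` that satisfies `range`, or null.
function maxSatisfying(versions, range) {
  const operator = range[0];
  if (operator !== '^' && operator !== '~') {
    throw new Error(`Unsupported range "${range}"`);
  }
  const lower = parseVersion(range.slice(1));
  const upper = upperBound(operator, lower);
  const candidates = versions
    .map(parseVersion)
    .filter(v => compareVersions(v, lower) >= 0 && compareVersions(v, upper) < 0)
    .sort(compareVersions);
  return candidates.length ? candidates[candidates.length - 1].join('.') : null;
}

// Hypothetical list of published versions.
const published = ['15.5.4', '15.6.0', '15.6.2', '15.7.0', '16.0.0'];
console.log(maxSatisfying(published, '^15.6.0')); // 15.7.0
console.log(maxSatisfying(published, '~15.6.0')); // 15.6.2
```

In real code the semver package should be preferred, since it also covers prereleases, partial versions and compound ranges.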

Step 3 – Dependencies of dependencies

Reality is not that simple: our dependencies have dependencies of their own, so we need to recurse through every level to download all of them.

// Get the dependencies of a package
async function getPackageDependencies(pkg) {
  const packageBuffer = await fetchPackage(pkg);
  // Read the package.json inside the downloaded archive (helper not shown)
  const packageJson = await readPackageJsonFromArchive(packageBuffer);
  const dependencies = packageJson.dependencies || {};
  return Object.keys(dependencies).map(name => {
    return { name, version: dependencies[name] };
  });
}

Now we can get all the dependency packages in the entire dependency tree from the user project’s package.json.
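
The recursion can be sketched without any network access. The example below is an illustration only: the registry object is a hypothetical in-memory stand-in for downloading an archive and reading its package.json, and the ancestors set is an addition that keeps circular dependencies from recursing forever.

```javascript
// Hypothetical in-memory registry: "name@version" -> its dependencies.
const registry = {
  'app@1.0.0': { a: '1.0.0', b: '1.0.0' },
  'a@1.0.0': { c: '1.0.0' },
  'b@1.0.0': { c: '1.0.0' },
  'c@1.0.0': {},
};

// Stand-in for "download the archive and read its package.json".
function getPackageDependencies({ name, reference }) {
  const dependencies = registry[`${name}@${reference}`];
  if (!dependencies) {
    throw new Error(`Unknown package ${name}@${reference}`);
  }
  return Object.entries(dependencies).map(([depName, depReference]) => {
    return { name: depName, reference: depReference };
  });
}

// Expand a package into a tree. `ancestors` holds the packages on the
// current path, so a dependency cycle is cut instead of looping forever.
function getPackageDependencyTree(pkg, ancestors = new Set()) {
  const key = `${pkg.name}@${pkg.reference}`;
  if (ancestors.has(key)) {
    return { ...pkg, dependencies: [] };
  }
  const path = new Set(ancestors).add(key);
  const dependencies = getPackageDependencies(pkg).map(dependency => {
    return getPackageDependencyTree(dependency, path);
  });
  return { ...pkg, dependencies };
}

console.log(
  JSON.stringify(getPackageDependencyTree({ name: 'app', reference: '1.0.0' }), null, 2)
);
```

Note that c appears twice in the resulting tree, once under a and once under b. That duplication is exactly what the optimization step later removes.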

Step 4 – Transfer files

Being able to download dependencies is not enough; we also need to move the files into the target directory, the well-known node_modules.

async function linkPackages({ name, reference, dependencies }, cwd) {
  // Get the entire dependency tree
  const dependencyTree = await getPackageDependencyTree({
    name,
    reference,
    dependencies,
  });

  await Promise.all(
    dependencyTree.map(async dependency => {
      await linkPackages(dependency, `${cwd}/node_modules/${dependency.name}`);
    })
  );
}

Step 5 – Optimize

Although we can now download every package in the dependency tree and place it in node_modules, different packages often share the same dependencies, so much of what we download is redundant. We can place identical dependencies in one location so that they do not have to be downloaded repeatedly.

function optimizePackageTree({ name, reference, dependencies = [] }) {
  dependencies = dependencies.map(dependency => {
    return optimizePackageTree(dependency);
  });

  // Iterate over copies, because both arrays are modified inside the loops
  for (const hardDependency of dependencies.slice()) {
    for (const subDependency of hardDependency.dependencies.slice()) {
      // Does the parent level already have a dependency with the same name?
      const availableDependency = dependencies.find(dependency => {
        return dependency.name === subDependency.name;
      });

      if (!availableDependency) {
        // Not present at the parent level yet: hoist it there
        dependencies.push(subDependency);
      }

      if (
        !availableDependency ||
        availableDependency.reference === subDependency.reference
      ) {
        // Remove the now redundant copy from the child
        hardDependency.dependencies.splice(
          hardDependency.dependencies.findIndex(dependency => {
            return dependency.name === subDependency.name;
          }),
          1
        );
      }
    }
  }

  return { name, reference, dependencies };
}

By recursively hoisting dependencies up one level at a time, we reduce repeated installations of the same package.

At this point we have implemented a simple Yarn.

Yarn Architecture

After reading the source code, my strongest impression is that Yarn applies object-oriented design thoroughly:

  • config: the Yarn configuration instance
  • cli: the set of all Yarn command instances
  • registries: instances holding npm source (registry) information
    • They cover the lock file, resolution of a package's entry file name, where dependencies are stored and under which file names, and so on
  • lockfile: the yarn.lock object
  • integrity checker: checks whether dependencies were downloaded correctly
  • package resolver: resolves the different ways package.json can reference a dependency
  • package request: instance that requests a dependency's version information
  • package reference: instance describing a dependency relationship
  • package fetcher: instance that downloads dependencies
  • package linker: manages dependency files
  • package hoister: instance that flattens dependencies

Yarn Workflow

Overview

Taking yarn add lodash as an example, let's look at what Yarn does internally. Yarn installs dependencies in five steps:

  • checking: checks the configuration (.yarnrc, command-line arguments, package.json information, and so on), compatibility (CPU, Node.js version, operating system, and so on), and whether package.json follows the conventions
  • resolveStep: parses the dependencies in the project's package.json into a dependency tree and resolves the exact version of every package in the tree
  • fetchStep: downloads all dependencies. A package that already exists in the cache is skipped; otherwise it is downloaded into the cache folder
  • linkStep: copies the cached dependencies, flattened, into the project's dependency directory (the packages fetched in the previous step live in the cache)
  • buildStep: some packages contain binaries that need to be compiled, which happens in this step

The process in detail

Let's continue with the yarn add lodash example.

Initialization

Find the .yarnrc file

// Get the yarnrc configuration
// process.cwd(): the project directory in which the command is run
// process.argv: the yarn command and arguments given by the user
const rc = getRcConfigForCwd(process.cwd(), process.argv.slice(2));

/**
 * Generate all possible paths of the rc file
 * @param {*} name name of the rc source
 * @param {*} cwd current project path
 */
function getRcPaths(name: string, cwd: string): Array<string> {
  // ... other code

  if (!isWin) {
    // Non-Windows: /etc/${name}/config
    pushConfigPath(etc, name, 'config');
    // Non-Windows: /etc/${name}rc
    pushConfigPath(etc, `${name}rc`);
  }
  // A user home directory exists
  if (home) {
    // Yarn's default configuration directory
    pushConfigPath(CONFIG_DIRECTORY);
    // <home>/.config/${name}/config
    pushConfigPath(home, '.config', name, 'config');
    // <home>/.config/${name}
    pushConfigPath(home, '.config', name);
    // <home>/.${name}/config
    pushConfigPath(home, `.${name}`, 'config');
    // <home>/.${name}rc
    pushConfigPath(home, `.${name}rc`);
  }
  // Walk up from the project directory, adding .${name}rc at every level
  // Tip: the rc file written by the user has the highest priority
  while (true) {
    // <current directory>/.${name}rc
    unshiftConfigPath(cwd, `.${name}rc`);
    // Get the parent directory
    const upperCwd = path.dirname(cwd);
    if (upperCwd === cwd) {
      // we've reached the root
      break;
    } else {
      cwd = upperCwd;
    }
  }

  // ... read rc code
}
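
To see what the directory walk in the excerpt produces, here is a reduced sketch. The function name getRcCandidates and the example paths are hypothetical, only two of the home locations are included, and POSIX paths are used so the result is the same on every platform. It mirrors only the push and unshift order shown above and makes no claim about how Yarn merges the files afterwards.

```javascript
const path = require('node:path').posix;

// Collect candidate rc file paths: home locations are appended,
// then every directory from cwd up to the root is put in front.
function getRcCandidates(name, cwd, home) {
  const candidates = [];

  if (home) {
    candidates.push(path.join(home, '.config', name, 'config'));
    candidates.push(path.join(home, `.${name}rc`));
  }

  let current = cwd;
  while (true) {
    candidates.unshift(path.join(current, `.${name}rc`));
    const parent = path.dirname(current);
    if (parent === current) {
      break; // reached the filesystem root
    }
    current = parent;
  }

  return candidates;
}

console.log(getRcCandidates('yarn', '/work/app', '/home/me'));
// [
//   '/.yarnrc',
//   '/work/.yarnrc',
//   '/work/app/.yarnrc',
//   '/home/me/.config/yarn/config',
//   '/home/me/.yarnrc'
// ]
```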

Parse instructions entered by the user

/** Index of the "--" separator */
const doubleDashIndex = process.argv.findIndex(element => element === '--');
/** The first two arguments are the node path and the yarn script path */
const startArgs = process.argv.slice(0, 2);
/**
 * Yarn subcommand and its arguments:
 * everything before "--" if it is present, otherwise everything
 */
const args = process.argv.slice(2, doubleDashIndex === -1 ? process.argv.length : doubleDashIndex);
/** Arguments passed through to the yarn subcommand */
const endArgs = doubleDashIndex === -1 ? [] : process.argv.slice(doubleDashIndex);
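
The same slicing can be wrapped in a function and tried on a sample argument list. The wrapper name splitArgs and the sample command are assumptions made for this illustration; the slicing itself follows the excerpt, including the detail that the pass-through part still starts with the "--" element.

```javascript
// Split an argv array into the three parts described above.
function splitArgs(argv) {
  const doubleDashIndex = argv.indexOf('--');
  const startArgs = argv.slice(0, 2);
  const args = argv.slice(2, doubleDashIndex === -1 ? argv.length : doubleDashIndex);
  const endArgs = doubleDashIndex === -1 ? [] : argv.slice(doubleDashIndex);
  return { startArgs, args, endArgs };
}

// Roughly what process.argv looks like for: yarn run test -- --watch
const sample = ['/usr/bin/node', '/usr/bin/yarn', 'run', 'test', '--', '--watch'];
console.log(splitArgs(sample));
// {
//   startArgs: [ '/usr/bin/node', '/usr/bin/yarn' ],
//   args: [ 'run', 'test' ],
//   endArgs: [ '--', '--watch' ]
// }
```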

Initialize the shared instance

During initialization, the config instance and the reporter (logging) instance are created.

  • During init, config walks up the parent directories one level at a time, checking whether a package.json declares the workspaces field
    • Tip: if the current project belongs to a workspace, the yarn.lock in the workspace root directory is the one that is used
this.workspaceRootFolder = await this.findWorkspaceRoot(this.cwd);
// Directory of yarn.lock: the workspace root if there is one, otherwise cwd
this.lockfileFolder = this.workspaceRootFolder || this.cwd;

/**
 * Find the workspace root
 */
async findWorkspaceRoot(initial: string): Promise<?string> {
    let previous = null;
    let current = path.normalize(initial);
    if (!await fs.exists(current)) {
      // The path does not exist: report an error
      throw new MessageError(this.reporter.lang('folderMissing', current));
    }

    // Walk up the parent directories, checking whether package.json / yarn.json declares workspaces
    // The first level that declares workspaces decides the result
    do {
      // Read package.json / yarn.json
      const manifest = await this.findManifest(current, true);

      // Extract the workspaces configuration
      const ws = extractWorkspaces(manifest);
      if (ws && ws.packages) {
        const relativePath = path.relative(current, initial);
        if (relativePath === '' || micromatch([relativePath], ws.packages).length > 0) {
          return current;
        } else {
          return null;
        }
      }

      previous = current;
      current = path.dirname(current);
    } while (current !== previous);

    return null;
}

Execute the add instruction

  • Read the yarn.lock file from the lockfile directory determined in the previous step.
  • Run the lifecycle scripts defined under scripts in package.json, in lifecycle order
/**
 * Run the scripts configured in package.json in lifecycle order
 */
export async function wrapLifecycle(config: Config, flags: Object, factory: () => Promise<void>): Promise<void> {
  // Run preinstall
  await config.executeLifecycleScript('preinstall');
  // Perform the actual installation
  await factory();
  // Run install
  await config.executeLifecycleScript('install');
  // Run postinstall
  await config.executeLifecycleScript('postinstall');
  if (!config.production) {
    // Non-production environment
    if (!config.disablePrepublish) {
      // Run prepublish
      await config.executeLifecycleScript('prepublish');
    }
    // Run prepare
    await config.executeLifecycleScript('prepare');
  }
}
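
The ordering rules in wrapLifecycle can be condensed into a small pure function, which makes it easy to see which stages run for a given configuration. This is a sketch of the ordering only: lifecyclePlan, scriptsToRun and the sample scripts are invented for the illustration, and nothing is actually executed.

```javascript
// Placeholder for the point at which dependencies are installed.
const INSTALL_STEP = '(install dependencies)';

// Ordered list of stages, following the branches in the excerpt above.
function lifecyclePlan({ production = false, disablePrepublish = false } = {}) {
  const plan = ['preinstall', INSTALL_STEP, 'install', 'postinstall'];
  if (!production) {
    if (!disablePrepublish) {
      plan.push('prepublish');
    }
    plan.push('prepare');
  }
  return plan;
}

// Keep only the stages for which the project defines a script.
function scriptsToRun(scripts, options) {
  return lifecyclePlan(options).filter(stage => {
    return stage === INSTALL_STEP || stage in scripts;
  });
}

// Hypothetical "scripts" section of a package.json.
const scripts = { preinstall: 'node check-env.js', prepare: 'npm run build' };
console.log(scriptsToRun(scripts));
// [ 'preinstall', '(install dependencies)', 'prepare' ]
console.log(scriptsToRun(scripts, { production: true }));
// [ 'preinstall', '(install dependencies)' ]
```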

Getting project dependencies

  • First, collect the package name and version of every entry in dependencies, devDependencies and optionalDependencies of the package.json in the current directory
    • If the current project belongs to a workspace, the package.json in the workspace root directory is read instead
      • Because this is a workspace project, the dependencies in the package.json of every subproject in the workspace must be read as well
  • This is implemented in the fetchRequestFromCwd method
// Get all dependencies in the current project directory
pushDeps('dependencies', projectManifestJson, {hint: null, optional: false}, true);
pushDeps('devDependencies', projectManifestJson, {hint: 'dev', optional: false}, !this.config.production);
pushDeps('optionalDependencies', projectManifestJson, {hint: 'optional', optional: true}, true);

// The current project is a workspace project
if (this.config.workspaceRootFolder) {
    // Collect the package.json of every subproject in the workspace
    const workspaces = await this.config.resolveWorkspaces(workspacesRoot, workspaceManifestJson);
    for (const workspaceName of Object.keys(workspaces)) {
          // package.json of the subproject
          const workspaceManifest = workspaces[workspaceName].manifest;
          // Register the subproject as a dependency of the root project
          workspaceDependencies[workspaceName] = workspaceManifest.version;
          // Collect the subproject's own dependencies
          if (this.flags.includeWorkspaceDeps) {
            pushDeps('dependencies', workspaceManifest, {hint: null, optional: false}, true);
            pushDeps('devDependencies', workspaceManifest, {hint: 'dev', optional: false}, !this.config.production);
            pushDeps('optionalDependencies', workspaceManifest, {hint: 'optional', optional: true}, true);
          }
    }
}

resolveStep: resolving dependencies

In the previous step we collected the names and version ranges of all the project's dependencies. Now we determine the exact information for each of them (which version should actually be downloaded).

  • First, the find method of the package resolver obtains each package's information through a package request, and then calls find recursively for the packages listed in that package's dependencies and optionalDependencies. While resolving, a Set<string> named fetchingPatterns records the dependencies that have been resolved or are being resolved, which avoids repeated requests.
    • When a package is resolved, its package name + version range determines whether it has already been handled (that is, whether fetchingPatterns already contains the same string)
    • For a package that has not been resolved yet, Yarn first tries to get the exact version from the lockfile, if the lockfile has an entry for it
      • It checks whether the lockfile entry is outdated. If so, the entry for this package is removed from the lockfile
    • If the lockfile has no entry for the package, a request is sent to the npm registry for the highest known version that satisfies the range
  • A package that has already been resolved is put on a delayed queue, delayedResolveQueue, and is not processed for now
  • Once every package in the dependency tree has been traversed recursively, delayedResolveQueue is traversed, and for each entry the highest suitable version is picked from the versions that have already been resolved

After that, we have determined the specific versions of all the dependent packages in the dependency tree, as well as details such as the address of the package.
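
The two ideas that matter most in this step, preferring the lockfile and skipping patterns that were already seen, can be sketched with in-memory data. Everything here is a simplified assumption for illustration: registryLatest, registryDeps and lockfile are made-up tables, the outdated-entry check and the delayed queue are left out, and the counter only shows how many lookups would have reached the registry.

```javascript
// Hypothetical data: highest matching version per pattern, dependencies per
// exact version, and the versions pinned in yarn.lock.
const registryLatest = { 'a@^1.0.0': '1.4.0', 'b@^2.0.0': '2.1.0', 'c@^1.0.0': '1.2.0' };
const registryDeps = {
  'a@1.1.0': { c: '^1.0.0' },
  'b@2.1.0': { c: '^1.0.0' },
  'c@1.2.0': {},
};
const lockfile = { 'a@^1.0.0': '1.1.0' };

function resolveAll(patterns) {
  const fetchingPatterns = new Set(); // patterns resolved or in progress
  const resolved = {}; // pattern -> exact version
  let registryRequests = 0;

  function find(pattern) {
    if (fetchingPatterns.has(pattern)) {
      return; // same name + range was already handled
    }
    fetchingPatterns.add(pattern);

    let version = lockfile[pattern]; // 1. prefer the lockfile
    if (!version) {
      registryRequests++; // 2. otherwise ask the registry
      version = registryLatest[pattern];
    }
    if (!version) {
      throw new Error(`Couldn't resolve "${pattern}"`);
    }
    resolved[pattern] = version;

    // Recurse into the dependencies of the resolved version.
    const name = pattern.slice(0, pattern.lastIndexOf('@'));
    const dependencies = registryDeps[`${name}@${version}`] || {};
    for (const [depName, range] of Object.entries(dependencies)) {
      find(`${depName}@${range}`);
    }
  }

  patterns.forEach(find);
  return { resolved, registryRequests };
}

console.log(resolveAll(['a@^1.0.0', 'b@^2.0.0']));
// {
//   resolved: { 'a@^1.0.0': '1.1.0', 'c@^1.0.0': '1.2.0', 'b@^2.0.0': '2.1.0' },
//   registryRequests: 2
// }
```

In this run a keeps the version pinned in the lockfile even though a newer one exists, and c is looked up only once although both a and b depend on it.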

  • Get the version of every first-level dependency of the project (by calling the find method of the package resolver)
/**
 * Find the version of a dependency
 */
async find(initialReq: DependencyRequestPattern): Promise<void> {
    // Read from the cache first
    const req = this.resolveToResolution(initialReq);
    if (!req) {
      return;
    }

    // Package request instance
    const request = new PackageRequest(req, this);
    const fetchKey = `${req.registry}:${req.pattern}:${String(req.optional)}`;
    // Is this the first time this dependency is requested?
    const initialFetch = !this.fetchingPatterns.has(fetchKey);
    // Flag: whether yarn.lock needs to be updated
    let fresh = false;

    if (initialFetch) {
      // Record the key on the first request
      this.fetchingPatterns.add(fetchKey);
      // Look up the package name + version in the lockfile
      const lockfileEntry = this.lockfile.getLocked(req.pattern);
      if (lockfileEntry) {
        // The lockfile has an entry
        // Extract the requested range
        // e.g. concat-stream@^1.5.0 => {name: 'concat-stream', range: '^1.5.0', hasVersion: true}
        const {range, hasVersion} = normalizePattern(req.pattern);
        if (this.isLockfileEntryOutdated(lockfileEntry.version, range, hasVersion)) {
          // The version in yarn.lock is outdated
          this.reporter.warn(this.reporter.lang('incorrectLockfileEntry', req.pattern));
          // Remove the pattern that was already collected
          this.removePattern(req.pattern);
          // Remove the entry from yarn.lock (it is outdated and invalid)
          this.lockfile.removePattern(req.pattern);
          fresh = true;
        }
      } else {
        fresh = true;
      }
      request.init();
    }

    await request.find({fresh, frozen: this.frozen});
}
  • Recursively query for information about requested dependencies
for (const depName in info.dependencies) {
      const depPattern = depName + '@' + info.dependencies[depName];
      deps.push(depPattern);
      promises.push(
        this.resolver.find(/* ... */),
      );
}
for (const depName in info.optionalDependencies) {
      const depPattern = depName + '@' + info.optionalDependencies[depName];
      deps.push(depPattern);
      promises.push(
        this.resolver.find(/* ... */),
      );
}
if (remote.type === 'workspace' && !this.config.production) {
      // workspaces support dev dependencies
      for (const depName in info.devDependencies) {
            const depPattern = depName + '@' + info.devDependencies[depName];
            deps.push(depPattern);
            promises.push(
              this.resolver.find(/* ... */),
            );
      }
}

fetchStep: downloading dependencies

It mainly downloads dependent packages that are not in the cache.

  • First, a Map<string, PackageReference> is created for de-duplication. Yarn iterates over the array of dependencies and builds the cache directory path dest for each one: cache path + npm source - package name - version - integrity + node_modules + package name. Entries are de-duplicated by dest.
  • After de-duplication, Yarn checks for each package whether its dest cache directory exists
    • If it exists, the package is read directly from the cache
    • If it does not exist, the package is downloaded according to how it is referenced
  • A package reference can take several forms, for example an npm source, a GitHub source, a GitLab source, or a file path, so Yarn calls the fetcher that corresponds to the reference type to obtain the package
/**
 * Build the cache path of a package:
 * cache path + npm source - package name - version - integrity + node_modules + package name
 */
const dest = config.generateModuleCachePath(ref);

export async function fetchOneRemote(
  remote: PackageRemote,
  name: string,
  version: string,
  dest: string,
  config: Config,
): Promise<FetchedMetadata> {
  if (remote.type === 'link') {
    const mockPkg: Manifest = {_uid: '', name: '', version: '0.0.0'};
    return Promise.resolve({resolved: null, hash: '', dest, package: mockPkg, cached: false});
  }

  const Fetcher = fetchers[remote.type];
  if (!Fetcher) {
    throw new MessageError(config.reporter.lang('unknownFetcherFor', remote.type));
  }

  const fetcher = new Fetcher(dest, remote, config);
  // Check whether a valid cached copy exists at dest
  if (await config.isValidModuleDest(dest)) {
    return fetchCache(dest, fetcher, config, remote);
  }
  // Remove whatever is left at that path
  await fs.unlink(dest);

  try {
    return await fetcher.fetch({
      name,
      version,
    });
  } catch (err) {
    try {
      await fs.unlink(dest);
    } catch (err2) {
      // what do?
    }
    throw err;
  }
}
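
The de-duplication and the cache check can be sketched together. This is an illustration under assumptions, not Yarn's implementation: cacheDest only imitates the path layout described above, the cache is modelled as a Set of existing paths instead of the file system, and the download callback and the sample integrity strings are hypothetical.

```javascript
// Build a cache path following the layout described above (format assumed).
function cacheDest(cacheRoot, pkg) {
  return `${cacheRoot}/npm-${pkg.name}-${pkg.version}-${pkg.integrity}/node_modules/${pkg.name}`;
}

// Fetch every package once; `cachedDests` stands in for the cache folder.
function fetchAll(packages, cacheRoot, cachedDests, download) {
  // De-duplicate by destination path.
  const byDest = new Map();
  for (const pkg of packages) {
    const dest = cacheDest(cacheRoot, pkg);
    if (!byDest.has(dest)) {
      byDest.set(dest, pkg);
    }
  }

  const results = [];
  for (const [dest, pkg] of byDest) {
    if (cachedDests.has(dest)) {
      results.push({ dest, cached: true }); // reuse the cached copy
    } else {
      download(pkg, dest); // fetch into the cache
      cachedDests.add(dest);
      results.push({ dest, cached: false });
    }
  }
  return results;
}

// Hypothetical input: lodash is requested twice and is already cached.
const packages = [
  { name: 'lodash', version: '4.17.20', integrity: 'sha1-aaa' },
  { name: 'react', version: '15.7.0', integrity: 'sha1-bbb' },
  { name: 'lodash', version: '4.17.20', integrity: 'sha1-aaa' },
];
const alreadyCached = new Set([cacheDest('/cache', packages[0])]);
const results = fetchAll(packages, '/cache', alreadyCached, (pkg, dest) => {
  console.log(`downloading ${pkg.name}@${pkg.version} to ${dest}`);
});
console.log(results.map(result => result.cached)); // [ true, false ]
```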

linkStep: copying files

After fetchStep, we now have all the dependencies in our local cache, and the next step is to copy them to node_modules in our project.

  • First, peerDependencies are resolved; if a matching peer dependency is not found, a warning is shown
  • The dependency tree is then flattened to produce the target directories to copy to
  • The flattened targets are sorted
  • For each entry in flatTree, a copy task is created from its dest (the destination directory to copy to) and src (the package's cache directory), and the package is copied from src to dest

As an example, suppose A, B and C are the first-level dependencies of the project:

  • A depends on D
    • D is hoisted, because no D has been hoisted yet
    • The project already depends on C directly, so the C that A depends on cannot be hoisted, otherwise the two would conflict
    • D depends on B, but the project already depends on B directly, so that B cannot be hoisted
    • B depends on F, and no F has been hoisted yet, so F is hoisted
  • Then B depends on E
    • No E has been hoisted yet, so E is hoisted

Yarn's flattening is simple: dependencies are sorted by the Unicode order of their package names, and the dependency tree is then flattened level by level.
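
The walk-through above can be reproduced with a small level-by-level pass. This is a simplified sketch, not Yarn's PackageHoister, which handles many more cases. The function name hoist, the "A>C" key notation (C installed inside A's own node_modules) and the version numbers in the sample tree are assumptions chosen so that the tree behaves like the example.

```javascript
const byName = (x, y) => (x.name < y.name ? -1 : x.name > y.name ? 1 : 0);

// Returns a Map from install location to version. "D" means node_modules/D,
// "A>C" means node_modules/A/node_modules/C.
function hoist(projectDependencies) {
  const placed = new Map();
  const key = chain => chain.join('>');

  // A package installed at `parentChain` can share the top-level copy of
  // `dep` only if no directory on the way up holds a different version.
  function blocked(dep, parentChain) {
    for (let depth = parentChain.length; depth >= 0; depth--) {
      const version = placed.get(key([...parentChain.slice(0, depth), dep.name]));
      if (version !== undefined && version !== dep.version) {
        return true;
      }
    }
    return false;
  }

  let level = projectDependencies.map(dep => ({ dep, parentChain: [] }));
  while (level.length > 0) {
    const next = [];
    // Within a level, process packages in name order.
    for (const { dep, parentChain } of level.sort((x, y) => byName(x.dep, y.dep))) {
      const chain = blocked(dep, parentChain) ? [...parentChain, dep.name] : [dep.name];
      if (placed.has(key(chain))) {
        continue; // the same version is already installed there
      }
      placed.set(key(chain), dep.version);
      for (const child of dep.dependencies || []) {
        next.push({ dep: child, parentChain: chain });
      }
    }
    level = next;
  }
  return placed;
}

// Hypothetical tree: the project depends on A, B and C.
const project = [
  {
    name: 'A',
    version: '1.0.0',
    dependencies: [
      {
        name: 'D',
        version: '1.0.0',
        dependencies: [
          { name: 'B', version: '2.0.0', dependencies: [{ name: 'F', version: '1.0.0' }] },
        ],
      },
      { name: 'C', version: '2.0.0' },
    ],
  },
  { name: 'B', version: '1.0.0', dependencies: [{ name: 'E', version: '1.0.0' }] },
  { name: 'C', version: '1.0.0' },
];

for (const [location, version] of hoist(project)) {
  console.log(`${location} -> ${version}`);
}
```

With this input D, E and F end up at the top level, while the conflicting versions stay nested as A>C and D>B, which matches the hoisting decisions listed above.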

Q&A

  • How do I increase the number of concurrent network requests?

Use the --network-concurrency option.

  • What if network requests keep timing out?

Use the --network-timeout option.

  • Why can't I simply change the version number of a dependency in yarn.lock?
"@babel/code-frame@^7.0.0-beta.35":
  version "7.0.0-beta.55"
  resolved "https://registry.yarnpkg.com/@babel/code-frame/-/code-frame-7.0.0-beta.55.tgz#71f530e7b010af5eb7a7df7752f78921dd57e9ee"
  integrity sha1-cfUw57AQr163p993UveJId1X6e4=
  dependencies:
    "@babel/highlight" "7.0.0-beta.55"

This is an arbitrary entry taken from a yarn.lock file. Changing only the version and resolved fields is not enough, because Yarn also compares the integrity field in yarn.lock with the integrity computed from the downloaded content. If the two differ, the download is treated as the wrong package.
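
A minimal sketch of such a check, using Node's built-in crypto module, is shown below. The function names are made up for the illustration and only the sha1 form seen in the entry above is handled; Yarn's actual checker is more general. The last lines also show that the hex digest after the # in resolved and the base64 value in integrity are the same SHA-1 digest in two encodings.

```javascript
const crypto = require('node:crypto');

// Compute an integrity string of the form "sha1-<base64 digest>".
function computeIntegrity(tarball) {
  return 'sha1-' + crypto.createHash('sha1').update(tarball).digest('base64');
}

// Compare downloaded content against the integrity stored in a lock entry.
function verifyIntegrity(tarball, lockEntry) {
  return computeIntegrity(tarball) === lockEntry.integrity;
}

// Convert the hex digest from the "resolved" URL fragment to the same format.
function hexToIntegrity(hexDigest) {
  return 'sha1-' + Buffer.from(hexDigest, 'hex').toString('base64');
}

console.log(hexToIntegrity('71f530e7b010af5eb7a7df7752f78921dd57e9ee'));
// sha1-cfUw57AQr163p993UveJId1X6e4=

// Stand-in for downloaded content; editing it makes the check fail.
const content = Buffer.from('pretend this is a tarball');
const entry = { integrity: computeIntegrity(content) };
console.log(verifyIntegrity(content, entry)); // true
console.log(verifyIntegrity(Buffer.from('tampered'), entry)); // false
```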

  • When different versions of the same dependency appear in a project, how do I know which one I am actually using?

First we’ll look at how dependency packages are referenced.

Pre-scenario:

Based on the current dependency relationship and yarn installation feature, the actual installation structure is as follows:

|- [email protected]
|- [email protected]
|- [email protected]
|----- [email protected]
|- [email protected]
|- [email protected]
  • When application code references D directly, it actually gets [email protected]
  • B's code does not declare a dependency on C, but uses C's objects and methods directly (B depends on D, and D depends on C, so C must exist). In this case, what B actually references is not [email protected] but [email protected].
    • Because webpack resolves dependencies by following the node_modules lookup rules, it picks up [email protected] directly

We can check for this kind of problem with yarn list.

References

  • The official Yarn website
  • Analyze the yarn dependency installation process from the source code
  • My fork of the Yarn source code, with some Chinese annotations