ES modules bring an official, standardized module system to JavaScript and make development a lot more convenient.

Many JavaScript developers know that ES modules have been controversial, but few people know how ES modules work.

Most applications today are developed in a modular way, so we often write code like the following. But do we really understand why modules exist and how they work?

import a from 'xxx'
import b from 'xxx'

Today I’m going to talk about what problems ES modules solve and how they differ from other module systems such as CommonJS.

What problems do ES modules solve

Think about what you do with variables when you write JavaScript: assigning a value to a variable, modifying a variable, or combining two variables to create a new one.

Because most of our programs manipulate variables, how we organize those variables has a big impact on how we write and maintain our code.

We all know that JavaScript has scopes. When our applications were small, it was easy to manage a small number of variables, because JavaScript provides function scope: a function cannot access variables defined inside other functions. This lets us think only about the local variables inside each individual function.

But function scope also has a disadvantage: it is difficult to share variables between different functions.

So what do we do when we really want to share variables outside of a scope? Because of the way JavaScript looks up variables, the common practice is to put them in an outer scope, for example by sharing them through the global scope.

When using jQuery, for example, we must make sure that jQuery already exists in the global scope before any jQuery plug-in is loaded.

This is why we are often told to put the jQuery script first, otherwise there will be problems. The scripts end up with a dependency order, and that order has to be maintained by hand, which causes several problems.

<script src="jQuery"></script>
<script src="a.js"></script>
<script src="b.js"></script>
  1. All scripts need to be placed in a specific order, and you need to make sure nobody changes it. If someone does, a function will throw an error at runtime when it looks for jQuery and cannot find it.

We have to manage our code very carefully. With dependencies like this, deleting old code or old scripts becomes a game of roulette: you don’t know what will break. The dependencies between different sections of code are invisible. Any function can access anything in the global scope, so we don’t know which functions depend on which scripts.

  2. Since these variables are in the global scope, any script can modify them. If a malicious script modifies a global variable, it is hard to determine which script made the change.
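The problem with shared globals can be reproduced without a browser. The sketch below is only an illustration: `libraryScript`, `otherScript`, and `pluginScript` are hypothetical stand-ins for three script files, and `globalThis.$` stands in for the global that jQuery would define.

```javascript
// Hypothetical stand-ins for <script> files that share the global scope.
function libraryScript() {
  // The "library" publishes itself as a global, the way jQuery defines $.
  globalThis.$ = { find: (selector) => `found ${selector}` };
}

function otherScript() {
  // Nothing stops an unrelated script from overwriting the same global.
  globalThis.$ = 'overwritten';
}

function pluginScript() {
  // The plug-in silently depends on the global; the dependency is invisible.
  return typeof globalThis.$ === 'object' ? globalThis.$.find('#app') : 'broken';
}

libraryScript();
const before = pluginScript(); // works: the global is still the library
otherScript();
const after = pluginScript(); // fails, and nothing records who changed $
console.log(before, after);
```

Nothing in `pluginScript` says where `$` comes from, which is exactly the kind of hidden dependency that modules make explicit.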

How modules solve these problems

Modularity gives us a better way to organize variables and functions. With modularity, you can group variables and functions together.

Unlike function scope, module scope gives us a way to share variables between modules through import and export: a module can explicitly state which of its variables, functions, and classes are available to other modules.

When you want to make something available to other modules, you mark it with export. Once it is exported, other modules can explicitly declare that they depend on that variable, class, or function.

Because of this explicit dependency, if you remove a module, you can know exactly which other modules will be affected.
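Because imports are declared explicitly, the question "what breaks if I remove this module?" becomes a simple lookup. Here is a minimal sketch, assuming a hypothetical table `imports` that records which modules each file imports (the file names are made up for the illustration):

```javascript
// Hypothetical project: each module lists the modules it imports.
const imports = {
  'main.js': ['counter.js', 'display.js'],
  'counter.js': ['utils.js'],
  'display.js': ['utils.js'],
  'utils.js': [],
};

// Return every module that depends on `target`, directly or indirectly.
function dependentsOf(target, graph) {
  const affected = new Set();
  let changed = true;
  while (changed) {
    changed = false;
    for (const [name, deps] of Object.entries(graph)) {
      if (affected.has(name)) continue;
      if (deps.some((dep) => dep === target || affected.has(dep))) {
        affected.add(name);
        changed = true;
      }
    }
  }
  return [...affected].sort();
}

console.log(dependentsOf('utils.js', imports));
```

With globals, no such table exists, so the same question can only be answered by reading every script.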

Since we can now export and import variables in different modules, it is very easy to think that we can split our code into many independent modules. We can then combine these individual modules (just like Lego) to create different applications from the same series of modules.

Modules are this useful, yet it took JavaScript nearly a decade to develop a standardized module system. We now have two module systems in common use: CommonJS (CJS), used in Node, and ESM (ECMAScript Modules), which is part of the JavaScript specification. ES modules are now supported by most browsers and by recent versions of Node.

Next, let’s look at how the ES Module works.

How does the ES module work

When we develop with modules, we build up a graph of dependencies. The connections between the different dependencies come from the import statements we write.

The import statements let the browser know exactly what code needs to be loaded. You give it a file to use as the entry point of the module graph, and from there it follows the import statements to find the rest of the modules.

But the browser can’t use the files themselves. It needs to parse each file into a data structure called a Module Record so that it knows what is going on in the file.

After a file has been parsed into a Module Record, the Module Record needs to be turned into a module instance. An instance combines two things: the code and the state.

The code is a set of instructions. It is like a recipe: it doesn’t do anything by itself, and it needs raw materials to work with.

State provides those raw materials. State is the actual values of all the variables at a given point in time. Of course, the variables are just names for the locations in memory that hold the values.

So a module instance combines the code (the list of instructions) with the state (the values of all the variables).

The main steps

Loading ES modules is divided into three main steps, which are summarized briefly here and then expanded one by one.

  1. Construction: find, download, and parse all files into Module Records.
  2. Instantiation: find places in memory to hold all the exported variables (no values are filled in yet, because the code has not been executed), then make both the exports and the imports point to those places in memory. This process is called linking, and the resulting connections are called live bindings.
  3. Evaluation: actually execute the code and fill the places in memory with the actual values of the variables.

We often say ES modules are asynchronous. You can think of them as asynchronous because the work is split into three phases (construction, instantiation, and evaluation), and those phases can be carried out separately. Once all three are done, we have a fully evaluated module graph.

This means the ES modules specification does introduce a kind of asynchrony that doesn’t exist in CommonJS. More on that later, but in CommonJS a module and its dependencies are loaded, instantiated, and evaluated all at once, without any break between the three.

Although ES modules are described as asynchronous, the three steps themselves don’t have to be asynchronous. They can be done synchronously, depending on what is doing the loading.

The ES module specification explains how to parse files into Module Records and how to instantiate and evaluate a module, but it doesn’t say how to get the files in the first place.

Fetching the files is the job of the loader, which behaves differently on different hosts. For browsers, the loader is specified in the HTML specification.

Let’s go through each step in detail

Construction

During the construction phase, three things happen for each module:

  1. Find: figure out where to download the file containing the module from.
  2. Fetch: get the file (by downloading it from a URL or loading it from the file system).
  3. Parse: parse the file into a Module Record.

Finding and fetching the file

The loader is responsible for finding the files and downloading them. First it needs to find the entry file. In HTML, we use a script tag to tell the loader where to find it.

But how does it find the next set of modules, the ones that the entry file (main.js) directly depends on?

This is where import statements come in. One part of the import statement, called the module specifier, tells the loader where to find the next module.

One thing to note about module specifiers: different hosts (browsers and Node) handle them differently. Each host has its own way of interpreting module specifier strings, using what is called a module resolution algorithm, which varies from platform to platform. Currently, some module specifiers that work in Node don’t work in the browser, although there are tools that can handle this.

Currently, the browser only accepts URLs as module specifiers, and it downloads the module from that URL. But the whole graph can’t be downloaded at the same time, because we don’t know which dependencies a module needs until we have parsed its file, and we can’t parse the file until we have fetched it.

This means that we have to walk through the tree layer by layer, parse a file, find its dependencies, and then find and load those dependencies
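The layer-by-layer walk can be sketched in a few lines. This is only a simulation: the `files` object is a hypothetical in-memory stand-in for the network, and the regular expression is a crude substitute for a real parser that only recognizes simple `import ... from '...'` statements.

```javascript
// Hypothetical in-memory "server": file name -> source text.
const files = {
  'main.js': "import { count } from 'counter.js';\nimport { render } from 'display.js';",
  'counter.js': "import { log } from 'utils.js';\nexport let count = 0;",
  'display.js': "import { log } from 'utils.js';\nexport function render() {}",
  'utils.js': 'export function log() {}',
};

// Crude stand-in for parsing: collect the specifiers of static imports.
function findSpecifiers(source) {
  const pattern = /import\s[^'"]*from\s*['"]([^'"]+)['"]/g;
  return [...source.matchAll(pattern)].map((match) => match[1]);
}

// Walk the graph one layer at a time: a file's dependencies are only
// known after that file has been fetched and parsed.
function buildGraph(entry) {
  const records = new Map(); // file name -> list of specifiers
  let layer = [entry];
  while (layer.length > 0) {
    const next = [];
    for (const name of layer) {
      if (records.has(name)) continue; // already fetched and parsed
      const specifiers = findSpecifiers(files[name]); // "fetch" + "parse"
      records.set(name, specifiers);
      next.push(...specifiers);
    }
    layer = next;
  }
  return records;
}

const graph = buildGraph('main.js');
console.log([...graph.keys()]);
```

Note that `utils.js` is requested by two modules but processed only once, which hints at the caching role the module map plays later.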

Since downloading files takes a long time in the browser, if the main thread waited for each file to download, a lot of other tasks would pile up behind it and the whole page would feel slow. Avoiding this kind of blocking is one reason why the ES module specification splits the loading algorithm into phases. Making construction its own phase allows the browser to fetch the files and build up the module graph before starting the synchronous work of instantiation.

This division of module algorithms into multiple steps is one of the main differences between ES modules and CommonJS modules

CommonJS can do things differently because loading files from the file system is much faster than downloading them over the network. This means that Node can block the main thread while it loads a file. And since the file is already loaded, it makes sense to instantiate and evaluate it right away (CommonJS does not treat these as separate phases). As a result, the loader walks the entire tree, loading, instantiating, and evaluating every dependency before it returns the module instance.

This approach is why CommonJS lets us use variables in module specifiers. All the code in the module (up to the require statement) is executed before the next module is looked up, so by the time module resolution happens, the variable already has a value.

But in ES modules, the whole module graph is built first and only then evaluated. This means you cannot use variables in module specifiers, because the code has not been executed yet and the variables have no values.

But sometimes you really do want to use a variable in a module path. For example, you may want to load different modules depending on what the code is doing or which environment it is running in. To make this possible, ES modules provide dynamic import, import(), which can be used like this: import(`${path}/foo.js`).

Parsing

Now that we have fetched the file, we need to parse it into a Module Record. This helps the browser understand what the different parts of the module are.

Once a Module Record is created, it is placed in the module map. From then on, whenever the module is requested, the loader can return it directly from the map.

There is one thing to note about parsing. All modules are parsed as if they had “use strict” at the top. There are a few other differences as well: for example, the keyword await is reserved in a module’s top-level code, and the value of this at the top level is undefined.

On the browser side, we set the type attribute of the script tag to module (type="module"). This tells the browser that the file should be parsed as a module. And since only modules can be imported, the browser knows that anything loaded through import is a module as well.

On the Node side, however, there are no HTML tags, so there is no type attribute to set. Instead, we can use the .mjs file extension (or, in current Node versions, "type": "module" in package.json) to tell Node that a file is an ES module and should be handled with the ES module parsing rules.

Either way, the loader determines whether to parse the file as a module. If it is a module and contains imports, the process starts over for each import, until all the files have been fetched and parsed into Module Records.

This completes the parsing step. By the time it is done, we have gone from a single entry file to a whole set of Module Records.

The next step is to instantiate these modules and link all of the instances together.

Instantiation

As mentioned above, an instance combines code (a set of instructions) with state. The state lives in memory, so the instantiation step is all about wiring things up to memory.

First, the JS engine creates a module environment record, which manages the variables for the Module Record. It then finds a place in memory for each of the module’s exports, and the module environment record keeps track of which place in memory belongs to which export.

Since the code has not been executed yet, these places in memory don’t hold values yet.

There is one exception: exported function declarations are initialized during this phase (function declarations are hoisted), which makes the evaluation phase in the next step easier.

To instantiate the module graph, the JS engine performs a depth-first post-order traversal. Along the way, it creates a module environment record for each Module Record.

The engine goes down to the bottom of the graph, to the modules that don’t depend on anything else, and wires up their exports to memory. It then comes back up level by level, wiring up the imports of each module once all the exports below it are connected.
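The traversal order can be made concrete with a short sketch. The `deps` table is a hypothetical dependency graph, and `postOrder` only records the order in which modules would be handled; it does not do any real linking.

```javascript
// Hypothetical dependency graph: module -> modules it imports.
const deps = {
  'main.js': ['counter.js', 'display.js'],
  'counter.js': ['utils.js'],
  'display.js': ['utils.js'],
  'utils.js': [],
};

// Depth-first post-order: handle all dependencies before the module itself.
function postOrder(entry, graph) {
  const visited = new Set();
  const order = [];
  function visit(name) {
    if (visited.has(name)) return; // also keeps cycles from looping forever
    visited.add(name);
    for (const dep of graph[name]) visit(dep);
    order.push(name);
  }
  visit(entry);
  return order;
}

console.log(postOrder('main.js', deps));
```

The entry file always comes last, because everything it depends on has to be handled before it.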

Note that an export and the corresponding import both point to the same location in memory. Wiring up the exports first guarantees that every import can be connected to a matching export.

This differs from CommonJS modules. In CommonJS, the importing module ends up with a copy of any exported value (such as a number). This means that if the exporting module later changes that value, the importing module doesn’t see the change.

In contrast, ES modules use live bindings: both modules point to the same location in memory, so when the exporting module changes a value, the change shows up immediately in the importing module. (The importing module cannot assign to an imported binding, but if it imports an object, it can change the property values on that object.)

The reason for using live bindings is that all the modules can be wired up without running any code. This helps with static analysis and with evaluating modules that have circular dependencies.

This is the usual conclusion: CommonJS module exports are copies of values, and ES Modules are references to values.
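The difference can be imitated in plain JavaScript. This is an analogy, not how engines implement modules: the copy is modeled by assigning a number to an object property, and the live binding is modeled by a getter that reads the exporting module's variable each time. All names here are made up for the illustration.

```javascript
// Exporting "module": a counter with a private variable.
function createCounterModule() {
  let count = 0;
  function increment() { count += 1; }

  // CommonJS-like: the current value of count is copied into the object.
  const copied = { count, increment };

  // ESM-like: the importer reads the exporter's variable through a getter.
  const live = { increment };
  Object.defineProperty(live, 'count', { get: () => count, enumerable: true });

  return { copied, live };
}

const { copied, live } = createCounterModule();
copied.increment(); // both objects share the same increment function
console.log(copied.count); // still the value copied at export time
console.log(live.count); // reflects the exporter's current value
```

The getter has no setter, which loosely mirrors the rule that an importing module can read a binding but not assign to it.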

Next we start executing the code and populate the actual values into memory.

Evaluation

Remember we wired all the exports and imports through memory, but the memory didn’t have a value yet.

The JS engine adds values to these areas of memory by executing top-level code (code outside of functions).

In addition to populating values in memory, executing code may trigger some side effects, such as a module calling the server.

Because of these potential side effects, we only want to evaluate a module once. This is different from the linking done during instantiation, which can be repeated and always gives the same result, whereas evaluating code more than once can give different results.

This is one reason for having the module map. The module map caches modules by canonical URL, so there is only one Module Record per module. This ensures that each module is executed only once. Just as with instantiation, evaluation is done as a depth-first post-order traversal.
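The caching role of the module map can be sketched as follows. This is a simplification under stated assumptions: `getModule`, the `evaluations` counter, and the `https://example.com/app/` base URL are hypothetical, and "evaluation" is reduced to incrementing a counter.

```javascript
// Hypothetical module map: canonical URL -> module record.
const moduleMap = new Map();
let evaluations = 0;

function getModule(specifier) {
  // Resolve the specifier to a canonical URL so that different
  // spellings of the same file share one entry in the map.
  const canonical = new URL(specifier, 'https://example.com/app/').href;
  if (!moduleMap.has(canonical)) {
    evaluations += 1; // stands in for running the module's top-level code
    moduleMap.set(canonical, { url: canonical });
  }
  return moduleMap.get(canonical);
}

const first = getModule('./counter.js');
const second = getModule('https://example.com/app/counter.js');
const third = getModule('../app/./counter.js');
console.log(first === second, second === third, evaluations);
```

All three requests resolve to the same URL, so they return the same record and the simulated top-level code runs a single time.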

This also explains why, with ES modules, the modules at the far end of the dependency chain are evaluated first.

Circular dependencies

Many people wonder how modules handle circular dependencies.

In a circular dependency, you end up with a loop in the graph. In real code this is usually a long loop, but to illustrate the principle we’ll use an example with a short one.

Let’s look at how this works with CommonJS first. main.js runs up to the require() statement at the top and then goes off to load the counter.js module.

The counter.js module then tries to read the message variable from the exports object of main.js. But since main.js has not finished evaluating at this point, it gets undefined. The JS engine allocates space in memory for the local variable and sets its value to undefined.

Evaluation continues to the end of the top-level code of counter.js. We want to see whether we will eventually get the correct value for message (after main.js has been evaluated), so we set a timeout with setTimeout to read the value later.

Now back in main.js, the message variable is initialized and added to memory. But since CommonJS works with copies of values, there is no connection between the two. In the require(counter

The Node documentation describes this behavior as follows: “In order to prevent an infinite loop, an unfinished copy of the a.js exports object is returned to the b.js module.”

If the export used live bindings instead, as in ES modules, the counter.js module would eventually see the correct value: by the time the timeout runs, main.js has finished evaluating and has filled in the value in memory.

Support for circular dependencies is also an important rationale behind the design of ES modules.


References and image sources
  1. ES modules: A cartoon deep-dive