This article is shared by Zhaolang, the front-end team of Amoy Node architecture

Even for those of us who are familiar with JavaScript, the ECMAScript specification is not an easy read. It runs to nearly 600,000 words, and the page takes so long to load the first time a browser or reader opens it that many people are put off before they start. It’s also hard to stay motivated. So why do we need to read the ECMAScript specification?

In this article, ECMAScript refers to the ECMAScript Language Specification maintained by Ecma International Technical Committee 39 (TC39), and JavaScript refers to the programming language we use every day.

Why do we need to read the ECMAScript specification?

The ECMAScript specification is the standard that browsers and JavaScript engines, such as the one in Node.js, implement. To understand exactly how JavaScript behaves, we need to read the steps the specification defines. For example, we often use the built-in Array.prototype methods to make things easier, but some of their behavior is confusing:

> Array.prototype.push('foo')
1
> Array.isArray(Array.prototype)
true
> Set.prototype.add('foo')
Uncaught TypeError: Method Set.prototype.add called on incompatible receiver #<Set>
    at Set.add (<anonymous>)
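The session above can be reproduced without modifying the shared prototypes. The following is only a sketch; the names arrayLike, pushedLength, and setError are made up for this illustration. It relies on two statements in the specification: Array.prototype is itself an Array exotic object and push is intentionally generic, while Set.prototype is an ordinary object without a [[SetData]] internal slot.

```javascript
// Array.prototype is itself an array, so array methods accept it as a receiver.
const protoIsArray = Array.isArray(Array.prototype);

// push is generic: it only needs a length and writable indexed properties.
// An array-like object is used here so the real Array.prototype is not modified.
const arrayLike = { length: 0 };
const pushedLength = Array.prototype.push.call(arrayLike, 'foo');

// Set.prototype.add insists on a receiver that was created as a Set.
// Set.prototype is an ordinary object, so the call is rejected.
let setError = null;
try {
  Set.prototype.add.call(Set.prototype, 'foo');
} catch (error) {
  setError = error;
}

console.log(protoIsArray, pushedLength, arrayLike[0], setError instanceof TypeError);
// true 1 foo true
```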

If we run into this kind of problem in practice, Google may not be able to help, and neither may Stack Overflow. One of the most relevant sources of documentation is the text of the ECMAScript specification itself. It contains the detailed steps of the Array.prototype.push operation and the definition of the properties of the Array prototype object, so we can see exactly what happened in the code above and why it behaved the way it did.

Or suppose we want to understand the precise difference between == and ===. We can read the plain-language description on MDN, which may still leave us unsure what exactly differs. Or we can read the specification text that defines the two operators. In the runtime semantics section for the == operator, for example, we learn that its result is the result of the Abstract Equality Comparison operation, and we can follow that operation step by step to see exactly what makes == behave differently from ===. That may also explain why common lint rules forbid the == operator but usually allow == null as an exception.

The standardization of the specification, together with test262 (the test suite for the ECMAScript specification), lets the same JavaScript code produce the same expected results in different JavaScript runtimes. In addition, the specification records the semantic details of the language in standardized wording that avoids ambiguity, which makes the behavior of JavaScript easier for us to understand.
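As a small illustration of the == null exception, the sketch below shows a few cases where the two comparisons disagree. The helper name isNullish is made up for this example, and the snippet only demonstrates observable results; the exact steps are in the specification.

```javascript
// null and undefined are loosely equal to each other, but not strictly equal.
const looseNullish = null == undefined;   // true
const strictNullish = null === undefined; // false

// With ==, a String operand is converted to a Number before comparing.
const looseMixed = 1 == '1';   // true
const strictMixed = 1 === '1'; // false

// null is not loosely equal to other falsy values such as 0.
const zeroIsNull = 0 == null;  // false

// This is why `value == null` is a common shorthand for "null or undefined".
function isNullish(value) {
  return value == null;
}

console.log(isNullish(undefined), isNullish(null), isNullish(0), isNullish(''));
// true true false false
```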

What does the ECMAScript specification cover?

If someone asks us what JavaScript is, the answer “JavaScript is a programming language made up of JavaScript features” does not clear up the question. Which parts of JavaScript belong to ECMAScript? Which features can be attributed to JavaScript itself? Does the lengthy ECMAScript specification define everything that looks like a JavaScript language feature in our everyday use?

JavaScript is one of the few languages that strictly separates language features from the capabilities of the host environment. There are many well-known host environments today, such as browsers, servers, and embedded devices, each of which embeds a JavaScript implementation to provide an environment for computation and data manipulation. ECMAScript is designed as an object-oriented language for performing computations and manipulating data within a host environment, so the specification does not aim for a fully self-contained language: it defines no way to input external data or output computed results. Each host environment usually gives JavaScript access to its own APIs and I/O by adding globally accessible objects such as document, XMLHttpRequest, process, and require. The behavior of these objects is not covered by the ECMAScript specification, which only states that a host environment may provide additional features for JavaScript programs to access and use.
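The split between language and host can be observed from code. The sketch below is a rough heuristic, not a reliable environment detector; the function name describeHost and its return labels are made up for this illustration.

```javascript
// Host-provided globals differ between environments, so they must be probed with typeof.
function describeHost() {
  if (typeof window !== 'undefined' && typeof document !== 'undefined') {
    return 'browser-like';
  }
  if (typeof process !== 'undefined' && process.versions && process.versions.node) {
    return 'node-like';
  }
  return 'unknown';
}

// Built-ins defined by ECMAScript itself are present in every conforming host.
const languageBuiltinsPresent =
  typeof Array === 'function' &&
  typeof Promise === 'function' &&
  typeof JSON === 'object';

const host = describeHost();
console.log(host, languageBuiltinsPresent);
```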

| Feature | Within the scope of the ECMAScript specification? |
| --- | --- |
| Syntax definitions for syntax elements, such as how to write a for-in loop that complies with the specification | ⭕️ |
| Semantic definitions for syntax elements, such as the definition of the typeof operator, or the value that { foo: 'bar' } evaluates to | ⭕️ |
| Definitions of built-in objects such as Object, Array, and Proxy, and of their methods | ⭕️ |
| import a from 'a.mjs' | ⭕️, note 1 |
| console, setTimeout, clearTimeout | ❌, note 2 |
| Buffer, process, global | ❌, note 3 |
| module.exports, require(), __filename, __dirname | ❌, note 4 |
| window, alert, XMLHttpRequest, document, and other DOM objects | ❌, note 5 |

Note 1: The ECMAScript specification defines the syntax and meaning of ECMAScript modules, but not how a JavaScript runtime should load the modules that are depended on. For example, ECMAScript modules in Node.js and ECMAScript modules in browsers use different module resolution algorithms.

Note 2: These methods exist in both browsers and Node.js, but they, their I/O operations, and their semantics are provided by the JavaScript runtime environment and are not defined in the ECMAScript specification.

Note 3: These methods and objects, their I/O operations, and their semantics are provided by the Node.js environment and are not defined in the ECMAScript specification.

Note 4: These objects and methods are defined by the CommonJS module system and provided by the Node.js environment; they are not defined in the ECMAScript specification.

Note 5: These browser and DOM APIs, together with the I/O operations and semantics they provide, are defined in specifications from W3C working groups, not in the ECMAScript specification.

How do I obtain the ECMAScript specification?

The latest ECMAScript specification is available at tc39.es/ecma262. After ECMAScript 5 turned the de facto JavaScript standard into a written standard, ECMAScript 6 was the first edition for which Ecma TC39 adopted a yearly release cadence. The source text of the specification is hosted at github.com/tc39/ecma26…, and changes to it, such as bug fixes, editorial fixes (wording problems, for example), and other improvements, go through a public pull request process. Each year TC39 snapshots the specification at a certain point and tags it with a version number to be published as that year’s ECMAScript language standard. For example, the ECMAScript® 2019 Language Specification (ECMA-262, 10th Edition; also known as ES10 or ES2019) is the ECMA-262 text as it stood at tc39.es/ecma262 in June 2019, packaged and formatted as a PDF for permanent archiving.

Specification tour

The content of the ECMAScript specification can be broken down into the following sections:

  • Conventions and basics: for example, what the Number type is in ECMAScript, or what the phrase “throw a TypeError exception” means;
  • Syntax productions of the language: for example, how to write a for-in loop that conforms to the specification;
  • Static semantics of the language: for example, how a VariableDeclaration determines which variables it declares;
  • Runtime semantics of the language: for example, the steps that define how a for-in loop executes;
  • APIs: the definitions of built-in object methods such as String.prototype.substring.

Of course, the specification is not organized in exactly this order; these parts are often interleaved across different sections. Because the specification is so long, almost nobody reads it from top to bottom. Most people don’t need to, and it wouldn’t be very effective anyway. Instead, when we run into a specific problem while writing JavaScript, we can go back to the specification, work out which of the parts above the problem belongs to, and look up the relevant definition. If that is hard to determine, asking “at what stage does this happen?” makes it easier to locate the right passage. Let’s dive into the specification text with a concrete question and see how the JavaScript we write every day executes as the specification defines it.
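To make the question “at what stage does this happen?” concrete, the sketch below contrasts an error that belongs to static semantics (an early error, reported before any code runs) with one that belongs to runtime semantics. The Function constructor is used here only as a convenient way to parse source text on demand, and the variable names are made up for this illustration.

```javascript
// Static semantics: declaring the same let binding twice is an early error.
// The SyntaxError is thrown while the source is parsed; nothing in the body runs.
let earlyError = null;
try {
  new Function('let a; let a;');
} catch (error) {
  earlyError = error;
}

// Runtime semantics: this source parses fine and only fails when it is executed.
const parsedFine = new Function('return null.foo;');
let runtimeError = null;
try {
  parsedFine();
} catch (error) {
  runtimeError = error;
}

console.log(earlyError instanceof SyntaxError, runtimeError instanceof TypeError);
// true true
```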

Property access for primitive types

It is well known that member properties of JavaScript objects are looked up by walking the prototype chain. Take ({}).hasOwnProperty: the object literal does not define a hasOwnProperty member, but because the default prototype of an object literal is Object.prototype, the hasOwnProperty member of Object.prototype is also accessible through the object literal. We often access properties on primitive values too, but where are those properties defined? Do primitive types have a prototype?

'foobar'.substring(3);
// -> 'bar'
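The prototype chain lookup described above can be checked from code. This is a minimal sketch with made-up variable names; it only observes results and does not show how the lookup is specified.

```javascript
const plain = {};

// hasOwnProperty is not an own property of the literal...
const definedOnLiteral = Object.prototype.hasOwnProperty.call(plain, 'hasOwnProperty');
// ...but it is reachable, because Object.prototype sits on the prototype chain.
const reachable = 'hasOwnProperty' in plain;
const inheritedFrom = Object.getPrototypeOf(plain) === Object.prototype;

// A string primitive is not an object, yet the property lookup still succeeds.
const primitiveType = typeof 'foobar';
const sameFunction = 'foobar'.substring === String.prototype.substring;

console.log(definedOnLiteral, reachable, inheritedFrom, primitiveType, sameFunction);
// false true true string true
```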

Note: This article quotes the ECMAScript specification directly in many places, using the text as of March 10, 2020. The specification may have been updated since then, so refer to the latest version as you read.

Where is the syntax for member property access defined?

As the grammar productions for member expressions show, a MemberExpression has seven possible productions. It can be a single PrimaryExpression, or a MemberExpression followed by an Expression wrapped in square brackets, MemberExpression [ Expression ], as in obj['foo'].

MemberExpression:
    PrimaryExpression
    MemberExpression [ Expression ]
    MemberExpression . IdentifierName
    MemberExpression TemplateLiteral
    SuperProperty
    MetaProperty
    new MemberExpression Arguments

And 'foobar'.substring matches the production MemberExpression . IdentifierName. Back to our question: how are properties of primitive values accessed? Property access happens at runtime, so we can start with the runtime semantics of this member expression.

Learn more about context-free grammars: en.wikipedia.org/wiki/Contex…
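For orientation, here is one JavaScript expression for each of the MemberExpression productions listed above. All names (sample, Base, Derived, Widget) are made up for this sketch; the comments state which production each line is meant to illustrate.

```javascript
const sample = {
  foo: 'bar',
  tag(strings) { return strings[0].toUpperCase(); },
};

// `sample` on its own is a PrimaryExpression (an identifier reference).
const viaBrackets = sample['foo'];     // MemberExpression [ Expression ]
const viaDot = sample.foo;             // MemberExpression . IdentifierName
const viaTemplate = sample.tag`hello`; // MemberExpression TemplateLiteral

class Base {
  name() { return 'base'; }
}
class Derived extends Base {
  // `super.name` is a SuperProperty; the parentheses after it make it a call.
  name() { return 'derived:' + super.name(); }
}

function Widget() {
  // `new.target` is a MetaProperty.
  this.constructedWithNew = new.target === Widget;
}
const widget = new Widget();           // new MemberExpression Arguments

console.log(viaBrackets, viaDot, viaTemplate, new Derived().name(), widget.constructedWithNew);
// bar bar HELLO derived:base true
```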

Runtime semantics of member expressions

The runtime semantics of a syntax production define what that piece of syntax does when it is executed. For example, the runtime semantics of member expressions define how an expression such as foo.bar (the MemberExpression . IdentifierName production above) fetches the value of the member property bar on foo at runtime. Most runtime semantics in the ECMAScript specification are given as a series of algorithm steps which, unlike ordinary pseudocode, are written in a much more precise way.

MemberExpression : MemberExpression . IdentifierName
1. Let baseReference be the result of evaluating MemberExpression.
2. Let baseValue be ? GetValue ( baseReference ).
3. If the code matched by this MemberExpression is strict mode code , let strict be true; else let strict be false.
4. Return ? EvaluatePropertyAccessWithIdentifierKey ( baseValue , IdentifierName , strict ).

We can see that step 4 delegates the more concrete work to another abstract operation, EvaluatePropertyAccessWithIdentifierKey:

EvaluatePropertyAccessWithIdentifierKey ( baseValue, identifierName, strict )
1. Assert: identifierName is an IdentifierName.
2. Let bv be ? RequireObjectCoercible ( baseValue ).
3. Let propertyNameString be StringValue of identifierName.
4. Return a value of type Reference whose base value component is bv , whose referenced name component is propertyNameString, and whose strict reference flag is strict.

This algorithm returns a Reference and does nothing to the object itself. How does this property Reference get turned into a concrete value? Going back to our example, we can see that besides the 'foobar'.startsWith property access, the code also contains a function call:

'foobar'.startsWith('foo');

References are used by features such as delete, typeof, assignment, and the super keyword. For example, in obj.foo = 'bar', the left-hand side of the assignment evaluates to a Reference, and the Reference is only resolved when the assignment is finally performed. A Reference represents a resolved name or property binding. It consists of three components: the base value, the referenced name, and the strict reference flag. The base value is undefined, an Object, a Boolean, a String, a Symbol, a Number, a BigInt, or an Environment Record. If the base value is undefined, the Reference could not be resolved. The referenced name is a String or a Symbol, which are the key types we can use in JavaScript.
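References cannot be inspected from JavaScript, but their effect can be observed. In the sketch below (all names are made up for this illustration), the same function receives a different this value depending on whether the call goes through a property Reference or through a plain function value, and typeof is shown accepting a name that cannot be resolved.

```javascript
const counter = {
  label: 'counter',
  whoAmI() {
    'use strict';
    // In strict code `this` stays undefined when the call supplies no base object.
    return this === undefined ? 'no receiver' : this.label;
  },
};

// counter.whoAmI evaluates to a Reference whose base value is `counter`,
// so the call receives `counter` as its this value.
const viaReference = counter.whoAmI();

// The comma operator applies GetValue to its operands, so only the function
// value survives and the base value is lost before the call.
const detached = (0, counter.whoAmI)();

// typeof accepts an unresolvable Reference without throwing a ReferenceError.
const unresolved = typeof someNameThatIsNeverDeclared;

console.log(viaReference, detached, unresolved);
// counter no receiver undefined
```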

So let’s move on to the runtime semantics of the call expression:

CallExpression : CoverCallExpressionAndAsyncArrowHead
1. Let expr be CoveredCallExpression of CoverCallExpressionAndAsyncArrowHead.
2. Let memberExpr be the MemberExpression of expr.
3. Let arguments be the Arguments of expr.
4. Let ref be the result of evaluating memberExpr.
5. Let func be ? GetValue(ref).
6. If Type(ref) is Reference, IsPropertyReference(ref) is false, and GetReferencedName(ref) is "eval", then
    a. If SameValue(func, %eval%) is true, then
        i. Let argList be ? ArgumentListEvaluation of arguments.
        ii. If argList has no elements, return undefined.
        iii. Let evalArg be the first element of argList.
        iv. If the source code matching this CallExpression is strict mode code, let strictCaller be true. Otherwise let strictCaller be false.
        v. Let evalRealm be the current Realm Record.
        vi. Return ? PerformEval(evalArg, evalRealm, strictCaller, true).
7. Let thisCall be this CallExpression.
8. Let tailCall be IsInTailPosition(thisCall).
9. Return ? EvaluateCall(func, ref, arguments, tailCall).

In the runtime semantics of the call expression, step 5 uses GetValue to obtain the value of the Reference returned by the runtime semantics of the MemberExpression.
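Step 6 of the algorithm above is what separates a direct eval from every other call: it applies only when the callee is a non-property Reference named "eval". A minimal sketch of the observable difference, with made-up function and variable names:

```javascript
function callDirectEval() {
  const onlyVisibleHere = 'local';
  // The callee is the plain name `eval`, so step 6 applies and the code
  // is evaluated with access to the local scope of this function.
  return eval('typeof onlyVisibleHere');
}

function callIndirectEval() {
  const onlyVisibleHere = 'local';
  // (0, eval) yields the function value rather than a Reference named "eval",
  // so this is an ordinary call and the code is evaluated in the global scope.
  return (0, eval)('typeof onlyVisibleHere');
}

console.log(callDirectEval(), callIndirectEval());
// string undefined
```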

Abstract operations are written like functions, OperationName(arg1, arg2), and can take one or more arguments. They differ from ordinary JavaScript functions in that they cannot be accessed from JavaScript directly; they are a writing convention for reusing a series of steps and algorithms within the specification text.

Records & Completion Records

In the abstract operations above, some calls are prefixed with a ? mark. What does it mean? Like ECMAScript functions, some operations defined in the specification need to express different control-flow outcomes, such as interrupting execution with the throw keyword and an exception value, or interrupting execution with the return keyword and a return value. The specification uses the Completion Record type to express these outcomes and the values that accompany them.

The Record type is an abstract type that exists only inside the ECMAScript specification, where it is used to describe an aggregate of data. As with abstract operations, different JavaScript engines are free to represent Records however they like. A Record value consists of one or more key-value pairs (fields) whose values can be ordinary ECMAScript values or other abstract types defined in the specification. The specification text commonly uses the double square bracket notation [[Field]] to refer to a field of a Record. The Completion Record is one specific Record type; the following table shows the fields it defines.

| Field Name | Value | Meaning |
| --- | --- | --- |
| [[Type]] | One of normal, break, continue, return, or throw | The type of completion that occurred. |
| [[Value]] | any ECMAScript language value or empty | The value that was produced. |
| [[Target]] | any ECMAScript string or empty | The target label for directed control transfers. |

A Completion Record whose [[Type]] is normal is called a normal completion. Every other kind of Completion Record is called an abrupt completion. Most of the time the only abrupt completion we come across is the one whose [[Type]] is throw; the other three occur only when specific syntax elements are evaluated. The specification text has no equivalent of the try-catch block in JavaScript code, so every possible error condition (that is, every abrupt completion) must be handled explicitly. Without a convenient shorthand, handling errors from any abstract operation would take four steps: first, get the returned Completion Record; second, check whether it is an abrupt completion and, if so, return it as the result of the current operation; third, extract the wrapped value from the Completion Record; and only then, fourth, start working with the value. Written out, it looks like this:

1. Let resultCompletionRecord be AbstractOp().
2. If resultCompletionRecord is an abrupt completion, return resultCompletionRecord.
3. Let result be resultCompletionRecord.[[Value]].
4. result is the result we need. We can now do more things with it.
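The four explicit steps can be modelled in JavaScript with plain objects standing in for Completion Records. This is only a sketch: the record shape and every name in it (makeNormal, makeThrow, requireObjectCoercibleRecord, describeTypeLongForm) are assumptions made for illustration, not something an engine exposes.

```javascript
// Plain objects stand in for Completion Records (an assumption of this sketch).
const makeNormal = (value) => ({ type: 'normal', value });
const makeThrow = (value) => ({ type: 'throw', value });
const isAbrupt = (record) => record.type !== 'normal';

// Hypothetical abstract operation that completes abruptly for undefined and null.
function requireObjectCoercibleRecord(value) {
  if (value === undefined || value === null) {
    return makeThrow(new TypeError('Cannot convert undefined or null to object'));
  }
  return makeNormal(value);
}

function describeTypeLongForm(value) {
  // Step 1: get the Completion Record.
  const resultCompletionRecord = requireObjectCoercibleRecord(value);
  // Step 2: pass an abrupt completion straight on to our own caller.
  if (isAbrupt(resultCompletionRecord)) return resultCompletionRecord;
  // Step 3: unwrap the value.
  const result = resultCompletionRecord.value;
  // Step 4: only now do the actual work.
  return makeNormal(typeof result);
}

const okRecord = describeTypeLongForm('foobar');
const failedRecord = describeTypeLongForm(null);
console.log(okRecord.type, okRecord.value, failedRecord.type);
// normal string throw
```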

Since ES2016, several shorthands have been added to the specification. The text above can be written as the following three steps, where ReturnIfAbrupt in step 2 takes over the former steps 2 and 3: it returns early on any abrupt completion and otherwise unwraps the [[Value]] into result.

1. Let result be AbstractOp().
2. ReturnIfAbrupt(result).
3. result is the result we need. We can now do more things with it.

Going further, with the ? prefix the steps no longer need to mention the Completion Record at all, and result is already the unwrapped [[Value]].

1. Let result be ? AbstractOp().
2. result is the result we need. We can now do more things with it.

Similar to the ? mark, the specification text also uses a ! mark. It is equivalent to asserting that the operation always returns a normal completion.

1. Let val be ! OperationName().

// equivalent to ⬇️

1. Let val be OperationName().
2. Assert: val is never an abrupt completion.
3. If val is a Completion Record, set val to val.[[Value]].

You can learn more about this in the ReturnIfAbrupt shorthand notation in the ECMAScript specification.
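One possible way to mimic the ? and ! shorthands in JavaScript is to let a helper unwind with an exception and convert it back into a record at the boundary of each operation. Everything in this sketch (the record shape, Abrupt, questionMark, exclamationMark, abstractOperation, toObjectRecord) is a made-up illustration of the idea, not how the specification or any engine implements it.

```javascript
// Plain objects stand in for Completion Records (an assumption of this sketch).
const normalCompletion = (value) => ({ type: 'normal', value });
const throwCompletion = (value) => ({ type: 'throw', value });

// Internal signal used to leave the current "abstract operation" early.
class Abrupt {
  constructor(record) { this.record = record; }
}

// Models `? Op()`: return early on an abrupt completion, otherwise unwrap [[Value]].
function questionMark(record) {
  if (record.type !== 'normal') throw new Abrupt(record);
  return record.value;
}

// Models `! Op()`: an abrupt completion here would be a bug in the algorithm itself.
function exclamationMark(record) {
  if (record.type !== 'normal') throw new Error('Assertion failed: unexpected abrupt completion');
  return record.value;
}

// Runs the steps of an operation and turns the outcome back into a record.
function abstractOperation(steps) {
  try {
    return normalCompletion(steps());
  } catch (signal) {
    if (signal instanceof Abrupt) return signal.record;
    throw signal;
  }
}

// Hypothetical operation that completes abruptly for undefined and null.
function toObjectRecord(value) {
  if (value === undefined || value === null) {
    return throwCompletion(new TypeError('Cannot convert undefined or null to object'));
  }
  return normalCompletion(Object(value));
}

function prototypeOfRecord(value) {
  return abstractOperation(() => {
    const obj = questionMark(toObjectRecord(value)); // Let obj be ? ToObject(value).
    return Object.getPrototypeOf(obj);
  });
}

const found = prototypeOfRecord('foobar');
const failed = prototypeOfRecord(undefined);
const unwrapped = exclamationMark(toObjectRecord({ safe: true }));
console.log(found.value === String.prototype, failed.type, unwrapped.safe);
// true throw true
```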

Object internal slots

Back to property access: at runtime, the CallExpression needs to call a specific function, so step 5 uses GetValue to obtain the value that corresponds to the Reference returned by the MemberExpression:

GetValue ( V )
1. ReturnIfAbrupt(V).
2. If Type(V) is not Reference, return V.
3. Let base be GetBase(V).
4. If IsUnresolvableReference(V) is true, throw a ReferenceError exception.
5. If IsPropertyReference(V) is true, then
    a. If HasPrimitiveBase(V) is true, then
        i. Assert: In this case, base will never be undefined or null.
        ii. Set base to ! ToObject(base).
    b. Return ? base.[[Get]](GetReferencedName(V), GetThisValue(V)).
6. Else,
    a. Assert: base is an Environment Record.
    b. Return ? base.GetBindingValue(GetReferencedName(V), IsStrictReference(V)) (see 8.1.1).

For our example code, step 5.a applies: when the base value of a property Reference is a primitive value, GetValue performs the ToObject abstract operation on it. A primitive value is not a real object; it has no internal storage for properties or for methods that could be overridden. It therefore has to be converted into an object whose prototype is String.prototype before its properties can be read. ToObject does different things depending on the type of its argument. For the String primitive in our example, the definition says it creates a new String object that uses the argument string as its data source.

ToObject ( argument )

| Argument Type | Result |
| --- | --- |
| Undefined | Throw a TypeError exception. |
| Null | Throw a TypeError exception. |
| Boolean | Return a new Boolean object whose [[BooleanData]] internal slot is set to argument. See 19.3 for a description of Boolean objects. |
| Number | Return a new Number object whose [[NumberData]] internal slot is set to argument. See 20.1 for a description of Number objects. |
| String | Return a new String object whose [[StringData]] internal slot is set to argument. See 21.1 for a description of String objects. |
| Symbol | Return a new Symbol object whose [[SymbolData]] internal slot is set to argument. See 19.4 for a description of Symbol objects. |
| BigInt | Return a new BigInt object whose [[BigIntData]] internal slot is set to argument. See 20.2 for a description of BigInt objects. |
| Object | Return argument. |
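ToObject itself is not callable from JavaScript, but several built-ins apply it to their arguments, which makes the rows of the table observable. The sketch below uses Object(value) and Object.keys for that purpose; the variable names are made up for this illustration.

```javascript
// Called as a function, Object(value) applies ToObject to any value
// other than undefined and null.
const wrapped = Object('foobar');

const kinds = [typeof 'foobar', typeof wrapped];  // ['string', 'object']
const wrapperProto = Object.getPrototypeOf(wrapped) === String.prototype; // true
const stringData = wrapped.valueOf();             // 'foobar', the wrapped primitive

// Objects are returned unchanged.
const original = { already: 'an object' };
const sameObject = Object(original) === original; // true

// Object.keys applies ToObject to its argument, so undefined is rejected.
let nullishError = null;
try {
  Object.keys(undefined);
} catch (error) {
  nullishError = error;
}

console.log(kinds.join(' '), wrapperProto, stringData, sameObject, nullishError instanceof TypeError);
// string object true foobar true true
```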

Step 5.b of GetValue calls base.[[Get]]. What does this [[ ]] notation stand for? As mentioned earlier, the specification uses the [[ ]] notation to access a field of a Record. It uses the same notation for the internal slots and internal methods of objects; which one is meant depends on the context in which the notation appears. What is certain, though, is that anything accessed through [[ ]] cannot be accessed or observed directly from JavaScript. In ECMAScript, every object has a set of internal methods that are called from many of the other abstract operations in the specification. Common ones include:

  • [[Get]] gets a member property of an object (e.g. obj.prop);
  • [[Set]] assigns a value to a member property of an object (e.g. obj.prop = 42);
  • [[GetPrototypeOf]] gets the prototype of an object (e.g. Object.getPrototypeOf(obj));
  • [[GetOwnProperty]] gets the property descriptor of an object’s own property (e.g. Object.getOwnPropertyDescriptor(obj, "prop"));
  • [[Delete]] deletes a property of an object (e.g. delete obj.prop).
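Internal methods cannot be called directly, but a Proxy handler has one trap per internal method, which gives a way to watch them being invoked. The sketch below records which trap fires for each of the expressions in the list above; trapLog, target, and observed are names made up for this illustration.

```javascript
const trapLog = [];
const target = { prop: 1 };

// Each trap records the internal method it stands for, then forwards to the target.
const observed = new Proxy(target, {
  get(t, key, receiver) {
    trapLog.push('[[Get]]');
    return Reflect.get(t, key, receiver);
  },
  set(t, key, value) {
    trapLog.push('[[Set]]');
    // Forwarding without a receiver keeps the write on the target itself.
    return Reflect.set(t, key, value);
  },
  getPrototypeOf(t) {
    trapLog.push('[[GetPrototypeOf]]');
    return Reflect.getPrototypeOf(t);
  },
  getOwnPropertyDescriptor(t, key) {
    trapLog.push('[[GetOwnProperty]]');
    return Reflect.getOwnPropertyDescriptor(t, key);
  },
  deleteProperty(t, key) {
    trapLog.push('[[Delete]]');
    return Reflect.deleteProperty(t, key);
  },
});

observed.prop;
observed.prop = 42;
Object.getPrototypeOf(observed);
Object.getOwnPropertyDescriptor(observed, 'prop');
delete observed.prop;

console.log(trapLog.join(' '));
// [[Get]] [[Set]] [[GetPrototypeOf]] [[GetOwnProperty]] [[Delete]]
```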

Functions are objects that additionally have a [[Call]] internal method (and possibly a [[Construct]] internal method), which is why functions are also called callable objects. Besides internal methods, objects also have internal slots, which the specification uses to store an object’s data. Note the difference between the [[Prototype]] internal slot and the [[GetPrototypeOf]] internal method: most objects have a [[Prototype]] internal slot, but all objects implement the [[GetPrototypeOf]] internal method. For example, Proxy objects do not have a [[Prototype]] internal slot of their own, but they do implement [[GetPrototypeOf]], which forwards the call to the registered handler or to the [[GetPrototypeOf]] of the proxied target object.

You can learn more about object internal methods in Ordinary Object Internal Methods and Internal Slots.

You can learn more about the internal methods of Proxy exotic objects in Proxy Object Internal Methods and Internal Slots.

In addition, ECMAScript divides all objects into two kinds: ordinary objects and exotic objects. Most of the objects we use are ordinary objects, meaning that their internal methods are the default ones defined in Ordinary Object Internal Methods and Internal Slots. We also use many kinds of exotic objects, which override some of those default internal methods. For example, when we assign to an index or to length on an Array, as in arr[1] = 123 or arr.length = 100, the [[DefineOwnProperty]] internal method that Array exotic objects override performs additional work on the object, such as growing or shrinking the array.

You can learn more about the internal methods of Array exotic objects in Array Exotic Objects.
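The extra work done by Array exotic objects is easy to observe by comparing an array with an ordinary object that merely has a length property. A minimal sketch with made-up variable names:

```javascript
const numbers = [1, 2, 3];

// Writing past the end also updates length.
numbers[5] = 6;
const lengthAfterIndexWrite = numbers.length; // 6

// Shrinking length deletes the elements beyond the new length.
numbers.length = 2;
const remaining = [...numbers];               // [1, 2]
const hasIndexFive = 5 in numbers;            // false

// An ordinary object gets no such treatment: length is just another property.
const lookalike = { length: 0 };
lookalike[5] = 6;
const lookalikeLength = lookalike.length;     // 0

console.log(lengthAfterIndexWrite, remaining.join(','), hasIndexFive, lookalikeLength);
// 6 1,2 false 0
```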


We can better understand the relationship between these objects by looking at the following figure.

Figure source: timothygu.me/es-howto

Back to GetValue: after converting the primitive value to an object, GetValue reads the value of the property in step 5.b by calling the object’s [[Get]] internal method:

[[Get]] ( P, Receiver )
1. Return ? OrdinaryGet(O, P, Receiver).

We can see that the [[Get]] internal method of an ordinary object delegates the actual work to the OrdinaryGet abstract operation. OrdinaryGet walks the object and its prototype chain until it finds the requested property (step 3: if the object has no such own property, the [[Get]] method of its prototype is called).

OrdinaryGet ( O, P, Receiver )
1. Assert: IsPropertyKey(P) is true.
2. Let desc be ? O.[[GetOwnProperty]](P).
3. If desc is undefined, then
    a. Let parent be ? O.[[GetPrototypeOf]]().
    b. If parent is null, return undefined.
    c. Return ? parent.[[Get]](P, Receiver).
4. If IsDataDescriptor(desc) is true, return desc.[[Value]].
5. Assert: IsAccessorDescriptor(desc) is true.
6. Let getter be desc.[[Get]].
7. If getter is undefined, return undefined.
8. Return ? Call(getter, Receiver).

The Property Descriptor type is also a Record. In JavaScript we usually write one as an object literal, as in Object.defineProperty(obj, 'foo', { enumerable: false, value: 'bar' }).
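The OrdinaryGet steps can be approximated in user code with the Reflect API, which exposes JavaScript-level counterparts of several internal methods. This is a sketch, not engine code: the function name ordinaryGet and the sample objects are made up, and it assumes every object on the chain is ordinary enough for plain recursion to stand in for parent.[[Get]].

```javascript
function ordinaryGet(O, P, Receiver) {
  // Step 2: look for an own property first.
  const desc = Reflect.getOwnPropertyDescriptor(O, P);
  if (desc === undefined) {
    // Step 3: fall back to the prototype, or give up at the end of the chain.
    const parent = Reflect.getPrototypeOf(O);
    if (parent === null) return undefined;
    // The specification calls parent.[[Get]]; recursion is used as a stand-in here.
    return ordinaryGet(parent, P, Receiver);
  }
  // Step 4: a data property simply yields its value.
  if ('value' in desc) return desc.value;
  // Steps 5 to 8: an accessor property calls its getter with Receiver as this.
  const getter = desc.get;
  if (getter === undefined) return undefined;
  return Reflect.apply(getter, Receiver, []);
}

const parentObject = { get who() { return this.label; } };
const child = Object.create(parentObject);
child.label = 'child';

const fromGetter = ordinaryGet(child, 'who', child);
const missing = ordinaryGet(child, 'nope', child);
const substring = ordinaryGet(Object('foobar'), 'substring', 'foobar');

console.log(fromGetter, missing, substring === String.prototype.substring);
// child undefined true
```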

const it = 'foobar';
'foobar'.substring(3);
// -> 'bar'

At this point we can conclude that, when the substring property of 'foobar' is accessed, the string is converted into a String object, and String.prototype.substring is then found through its prototype, String.prototype. Finally, calling that function with the original string as the receiver and 3 as the argument produces the 'bar' of our original example.
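The whole walk-through can be replayed by hand with Object and Reflect. This is only an approximation of the specification steps using public APIs; the variable names are made up for this illustration.

```javascript
const primitive = 'foobar';

// GetValue step 5.a.ii: the primitive base value is converted with ToObject.
const boxed = Object(primitive);

// GetValue step 5.b: base.[[Get]](name, thisValue). The receiver stays the primitive.
const method = Reflect.get(boxed, 'substring', primitive);

// The call itself: the function runs with the original base value as its this value.
const result = Reflect.apply(method, primitive, [3]);

console.log(method === String.prototype.substring, result);
// true bar
```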

String.prototype.substring

Now that we have obtained the String.prototype.substring function, what happens if we call it with undefined as its receiver (the this value of the function call)?

String.prototype.substring.call(undefined, 2, 4)

Based on our past experience with JavaScript, we speculate that there are probably two possibilities:

  • String.prototype.substring() converts undefined to the string "undefined" and then takes the characters from index 2 up to index 4 (the index range [2, 4)), so the final result is "de".
  • String.prototype.substring() throws an error that rejects undefined as the receiver.

Unfortunately, MDN does not describe this case in detail, but if you are interested, you can look up the definition in the ECMAScript specification to find out what the result will be.

More links

  1. Timothy Gu, How to Read the ECMAScript Specification, timothygu.me/es-howto
  2. TC39, ECMAScript® 2020 Language Specification – Draft, March 10, 2020, tc39.es/ecma262
  3. Marja Hölttä, Understanding the ECMAScript Spec, Part 1, v8.dev/blog/unders…
  4. Marja Hölttä, Understanding the ECMAScript Spec, Part 2, v8.dev/blog/unders…