As a front-end developer with x years of experience, I have worked on a fair number of projects and touched most kinds of scenarios. But when an interviewer really wants to dig into underlying principles, I still panic a little. Seeing so many experts dissect the internals of various frameworks makes me a bit envious of their technical strength, but envy is no substitute for action, so the first step is to build a solid foundation. Well… today we’re going to talk about closures!

You may have read dozens of articles about closures, and you may have noticed that some of them (I did not say all) follow the same routine: they cover just two points, what a closure is and an example of one, and look suspiciously like copy-and-paste. After reading these articles, my main feeling was: if I had to explain closures to someone, would I be able to explain them clearly? What would I base the explanation on? How credible would it be? I doubted myself on all three counts.

Different stages call for different things. Once we have some foundation, it is worth studying the underlying principles instead of staying on the surface of a problem. So how can developers of average skill break through the clutter of these articles? I think one way is to look for clues in authoritative documents such as the ES specification, MDN, and Wikipedia.

There are always different interpretations of closures.

In the first interpretation, a closure is the combination of a function and the lexical environment in which it was declared. This definition comes from the MDN page on closures.

Another way of saying it is that a closure is a function that has access to a variable in the scope of another function.

From what I understand, I think the first statement is true. A closure is not a function, but a function and its lexical context. Is the second statement true? I think it’s half true. In the closure scenario, there is a function that has access to variables in the scope of another function, but the closure is not a function.
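To make the two descriptions concrete, here is a minimal sketch. The names `outer`, `inner`, and `secret` are made up for illustration. The returned function on its own is just a function; what lets it still read `secret` after `outer` has returned is the lexical environment it was created in, and the pair of the two is what the first description calls a closure.

```javascript
function outer() {
  var secret = "visible only inside outer";
  // inner is created inside outer, so its lexical environment includes secret.
  function inner() {
    return secret;
  }
  return inner;
}

var fn = outer();   // outer has finished executing at this point
var value = fn();   // inner can still resolve secret
console.log(value); // visible only inside outer
```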

Is that all? Obviously not! This time we’ll decode closures and get to the bottom of them!

This article looks directly at the ECMAScript 5 specification to interpret some of the internal logic of the JS engine, and takes a fresh look at closures based on those insights.

So again, what is the Lexical Environment that I mentioned earlier?

Lexical environment

Take a look at section 10.2, Lexical Environments, in Chapter 10 (Executable Code and Execution Contexts) of the ES5 specification.

A Lexical Environment is a specification type used to define the association of Identifiers to specific variables and functions based upon the lexical nesting structure of ECMAScript code.

That is, a lexical environment is a specification type that defines how identifiers are associated with specific variables and functions, based on the lexical nesting structure of ECMAScript code.

The question is, what is a specification type? According to the ES5 specification, types are divided into two categories: language types and specification types.

Language Types are the familiar types that programmers using ECMAScript can manipulate, including Undefined, Null, Number, String, Boolean, and Object.

A specification type is a more abstract meta-value that is used within algorithms to describe the semantics of ECMAScript language constructs and language types.

A specification type corresponds to meta-values that are used within algorithms to describe the semantics of ECMAScript language constructs and ECMAScript language types.

As for what a meta-value is, I think it can be understood as metadata. What is metadata? Why do we need it?

In general, metadata is data that describes data. By analogy, a high-level language always has to be described and expressed in terms of a lower-level language and data structures, and that is what the JS engine does.

With a general understanding of what a specification type is, we can’t help but ask: What does a specification type consist of?

The specification types are Reference, List, Completion, Property Descriptor, Property Identifier, Lexical Environment, and Environment Record.

At this point things start to make sense: both Lexical Environment and Environment Record are specification types, so they are indeed lower-level concepts.

So let’s set aside List, Completion, Property Descriptor, Property Identifier and the rest, and look at the Lexical Environment specification type.

The following sentence explains exactly what a lexical environment contains:

A Lexical Environment consists of an Environment Record and a possibly null reference to an outer Lexical Environment.

That is, a lexical environment contains an environment record and a reference to the outer lexical environment, and that reference may be null.

The structure of a lexical environment is as follows:

Lexical Environment
  + Outer Reference
  + Environment Record

The outer reference points to the outer lexical environment, which means that lexical environments form a linked-list structure. A simple structure diagram helps to picture this.
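The linked-list structure can be sketched with plain objects. This is a simplified model under my own assumptions, not engine code: `createLexicalEnvironment` and `describeChain` are hypothetical helpers, and the records are ordinary object literals standing in for environment records.

```javascript
// Simplified model: each lexical environment holds an environment record
// and a reference to its outer lexical environment (null at the top).
function createLexicalEnvironment(name, record, outer) {
  return { name: name, environmentRecord: record, outer: outer };
}

var globalEnv = createLexicalEnvironment("global", { test: "function" }, null);
var testEnv = createLexicalEnvironment("test", { a: 1 }, globalEnv);
var increaseEnv = createLexicalEnvironment("increase", { b: 2 }, testEnv);

// Follow the outer references until null, like traversing a linked list.
function describeChain(env) {
  var names = [];
  for (var current = env; current !== null; current = current.outer) {
    names.push(current.name);
  }
  return names.join(" -> ");
}

var chain = describeChain(increaseEnv);
console.log(chain); // increase -> test -> global
```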

Usually a Lexical Environment is associated with some specific syntactic structure of ECMAScript code such as a FunctionDeclaration, a WithStatement, or a Catch clause of a TryStatement and a new Lexical Environment is created each time such code is evaluated.

That is, a lexical environment is usually associated with some specific syntactic structure of ECMAScript code (such as a FunctionDeclaration, a WithStatement, or the Catch clause of a TryStatement), and a new lexical environment is created each time such code is evaluated.

PS: “evaluated” is the past participle of “evaluate”. Translated literally, “evaluated code” is not very easy to understand. My personal understanding is that it refers to JavaScript code being interpreted and executed by the JS engine.

As we know, executing a function creates a new lexical environment.

We also agree that the with statement “extends” the scope (essentially calling NewObjectEnvironment, creating a new lexical environment whose environment record is an object environment record).

These are the things we can understand. So what does the catch clause do to the lexical environment? Although try-catch is used a lot, the details of lexical environment are not noticed by many people, including me!

We know that the catch clause has an error object e:

function test(value) {
  var a = value;
  try {
    console.log(b);
    // A direct reference to a non-existent variable will result in a ReferenceError
  } catch(e) {
    console.log(e, arguments, this);
  }
}
test(1);

Printing arguments inside the catch clause is just a way to prove that the catch clause is not a function: if catch were a function, the arguments printed here would obviously not be the arguments of test. Since catch is not a function, why is there an error object e that can only be accessed within the catch clause?

The answer is that the catch clause uses NewDeclarativeEnvironment to create a new lexical environment (the outer lexical environment reference of the catch clause's environment points to the lexical environment of the function test). The identifier e is then bound in the environment record of the new lexical environment through CreateMutableBinding and SetMutableBinding.
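The effect of that separate environment can be observed from ordinary code. In this small sketch (the names `catchDemo` and `caughtName` are mine), `e` is only visible inside the catch clause, while a `var` declared inside the same clause still ends up in the environment of the enclosing function.

```javascript
function catchDemo() {
  try {
    undefinedVariable; // referencing an undeclared variable throws a ReferenceError
  } catch (e) {
    // e is bound in the new environment created for the catch clause.
    // The var below is not: it belongs to catchDemo's own environment.
    var caughtName = e.name;
  }
  return {
    typeOfE: typeof e,      // e is not visible outside the catch clause
    caughtName: caughtName, // but the var declared inside it is
  };
}

var catchResult = catchDemo();
console.log(catchResult.typeOfE, catchResult.caughtName); // undefined ReferenceError
```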

The initialization part of a for loop can also define variables with var. Is there any essential difference between the catch clause and the initialization part of a for loop? Note that before ES6 there was no block-level scope. Variables defined with var in a for loop belong to the lexical environment of the function in which the loop appears. If the for statement is not inside a function, the variables defined with var belong to the global environment.
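A short sketch of the for loop case, with hypothetical names `loopDemo` and `lastSeen`: both variables are still readable after the loop, which shows that the loop did not get an environment of its own for `var` bindings.

```javascript
function loopDemo() {
  for (var i = 0; i < 3; i++) {
    var lastSeen = i;
  }
  // i and lastSeen were bound in loopDemo's environment, not in a
  // per-loop environment, so they are still accessible here.
  return [i, lastSeen];
}

var loopResult = loopDemo();
console.log(loopResult); // [3, 2]
```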

The conclusion that the with statement and the catch clause each establish a new lexical environment (“a new Lexical Environment is created each time such code is evaluated”) can be checked against sections 12.10 The with Statement and 12.14 The try Statement.

Environment Record

Having understood Lexical Environment, let’s talk about the Environment Record in Lexical Environment. The environment record is closely related to the variables and functions we use, so to speak, the environment record is their underlying implementation.

The specification description of the environment record is too long and will not be copied here. Please go to section 10.2.1 of the ES5 specification.

There are two kinds of Environment Record values used in this specification: declarative environment records and object environment records.

From the specification, we can see that environment records are divided into two types:

  • Declarative environment records
  • Object environment records

The ECMAScript specification stipulates that both declarative and object environment records must implement some common abstract methods of the environment record class, even though they may differ in implementation algorithms.

These common abstract methods are:

  • HasBinding(N)
  • CreateMutableBinding(N, D)
  • SetMutableBinding(N,V, S)
  • GetBindingValue(N,S)
  • DeleteBinding(N)
  • ImplicitThisValue()

Declarative environment records should also implement two unique methods:

  • CreateImmutableBinding(N)
  • InitializeImmutableBinding(N,V)
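As a rough mental model, the methods listed above could be sketched like this. The method names mirror the specification, but the `Map`-based storage and the class itself are my own assumptions, and DeleteBinding and ImplicitThisValue are left out to keep the sketch short.

```javascript
// Simplified model of a declarative environment record.
class DeclarativeEnvironmentRecord {
  constructor() {
    this.bindings = new Map();
  }
  hasBinding(name) {
    return this.bindings.has(name);
  }
  createMutableBinding(name) {
    this.bindings.set(name, { value: undefined, mutable: true });
  }
  setMutableBinding(name, value, strict) {
    const binding = this.bindings.get(name);
    if (binding.mutable) {
      binding.value = value;
    } else if (strict) {
      // Changing an immutable binding only throws when the strict flag is true.
      throw new TypeError("Cannot assign to immutable binding " + name);
    }
  }
  getBindingValue(name) {
    return this.bindings.get(name).value;
  }
  createImmutableBinding(name) {
    this.bindings.set(name, { value: undefined, mutable: false });
  }
  initializeImmutableBinding(name, value) {
    this.bindings.get(name).value = value;
  }
}

const record = new DeclarativeEnvironmentRecord();
record.createMutableBinding("a");
record.setMutableBinding("a", 1, false);
record.createImmutableBinding("arguments");
record.initializeImmutableBinding("arguments", "argsObj");
record.setMutableBinding("arguments", "changed", false); // silently ignored
console.log(record.getBindingValue("a"), record.getBindingValue("arguments"));
// 1 argsObj
```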

Where are immutable bindings used? The specification uses one when it creates the arguments binding of a function:

If strict is true, then
  Call env’s CreateImmutableBinding concrete method passing the String "arguments" as the argument.
  Call env’s InitializeImmutableBinding concrete method passing "arguments" and argsObj as arguments.
Else,
  Call env’s CreateMutableBinding concrete method passing the String "arguments" as the argument.
  Call env’s SetMutableBinding concrete method passing "arguments", argsObj, and false as arguments.

That is, immutable binding is used for the arguments object of a function only in strict mode. A variable with ImmutableBinding means it cannot be reassigned. Here’s an example:

You can change arguments in non-strict mode:

function test(a, b) {
  arguments = [3, 4];
  console.log(arguments, a, b);
}
test(1, 2);
// [3, 4] 1 2

In strict mode, changing arguments will cause an error:

"use strict";
function test(a, b) {
  arguments = [3, 4];
  console.log(arguments, a, b);
}
test(1, 2);
// Uncaught SyntaxError: Unexpected eval or arguments in strict mode

Note that I am talking about reassigning arguments, not mutating the object it refers to. An operation such as arguments[2] = 3 does not raise an error in strict mode.

So ImmutableBinding restricts the immutability of the reference, not the immutability of the object to which the reference refers.
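The difference between the binding and the object it refers to can be checked with a small sketch (the function name `strictDemo` is made up). The `arguments` object is mutated in strict mode without any error.

```javascript
function strictDemo(a) {
  "use strict";
  // Mutating the object that arguments refers to is allowed...
  arguments[0] = 99;
  arguments[2] = 3;
  // ...and in strict mode it does not change the named parameter a.
  return [arguments[0], arguments[2], a];
}

var strictResult = strictDemo(1);
console.log(strictResult); // [99, 3, 1]
```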

Declarative environment records

When we use variable declarations, function declarations, and catch clauses, the JS engine creates the corresponding declarative environment records. They directly associate identifier bindings with ECMAScript language values.

Object environment records

Object environment records are used for Program code, WithStatement, and the global environment described below. They associate identifier bindings with the properties of some object.
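A rough way to picture an object environment record is a thin wrapper that forwards every operation to a binding object. The class below is a simplified model under my own assumptions (only three of the abstract methods are shown, and `settings` is a made-up binding object), not what an engine literally does.

```javascript
// Simplified model: an object environment record stores no values itself,
// it forwards every operation to its binding object.
class ObjectEnvironmentRecord {
  constructor(bindingObject) {
    this.bindingObject = bindingObject;
  }
  hasBinding(name) {
    return name in this.bindingObject;
  }
  setMutableBinding(name, value) {
    this.bindingObject[name] = value;
  }
  getBindingValue(name) {
    return this.bindingObject[name];
  }
}

const settings = { color: "red" };
const objectRecord = new ObjectEnvironmentRecord(settings);
objectRecord.setMutableBinding("color", "blue");
console.log(settings.color);                  // blue
console.log(objectRecord.hasBinding("size")); // false
```

Because the record only forwards to the object, a write through the record shows up as a property change on `settings`, which is the behaviour the with statement and the global object rely on.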

What are identifier bindings?

After reading the abstract methods of environment records described in the ES5 specification, I have a rough answer.

Let’s take a quick look at how a JavaScript variable is assigned and read:

var a = 1;
console.log(a);

In the JS engine, one of the steps in initializing the variable a and assigning it the value 1 is to execute CreateMutableBinding and SetMutableBinding.

When the value of a is read, the JS engine executes GetBindingValue (get the bound value). These operations involve a number of assertions and checks, including checks for strict mode. See 10.2.1.1 Declarative Environment Records for details.

Some steps are omitted here, such as GetIdentifierReference, GetValue(V), and PutValue(V, W).

As I understand it, identifier bindings are a set of associations maintained in the JS engine that link the identifiers in JavaScript code to their values.

The Global Environment

The global environment is a special lexical environment that is created before any ECMAScript code executes. Its environment record is an object environment record whose binding object is the global object. In a browser, the global object is the window object.

The global environment is the top-level lexical environment, so it has no outer lexical environment; in other words, its outer lexical environment reference is null.

Section 15.1, The Global Object, also explains some details of the global object, such as why you can’t call new Window() and why global objects differ so much between host environments.

Execution context

After looking at this, we still don’t have a complete understanding of closures, so let’s move on to the execution context. In my previous understanding, a context should be an environment that contains variables that the code can access. Of course, this is obviously not comprehensive. So what exactly is context?

When control is transferred to ECMAScript executable code, control is entering an execution context. Active execution contexts logically form a stack. The top execution context on this logical stack is the running execution context.

That is, when program control is transferred to ECMAScript executable code, it enters an execution context. The active execution contexts logically form a stack, and the execution context at the top of the stack is the one that is running.

Many people may wonder about the term executable code: isn’t all JavaScript executable code? No. Comments and white space, for example, are not executable code.
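The stack behaviour can be imitated with an array. This is only a toy model: the strings stand in for execution contexts, and `enter` is a hypothetical helper that pushes on entry and pops on exit.

```javascript
// Toy model of the execution context stack.
const contextStack = ["global"];
const snapshots = [];

function enter(name, body) {
  contextStack.push(name);                  // entering: becomes the running context
  snapshots.push(contextStack.join(" > "));
  const value = body();
  contextStack.pop();                       // leaving: the previous context resumes
  return value;
}

const finalValue = enter("test", () => enter("increase", () => 2));
console.log(snapshots);    // ["global > test", "global > test > increase"]
console.log(contextStack); // ["global"]
```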

An execution context contains whatever state is necessary to track the execution progress of its associated code.

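That is, an execution context contains whatever state is needed to track the execution progress of its associated code. Every execution context has the following state components:

  • LexicalEnvironment: the lexical environment used to resolve identifier references made by code within this execution context
  • VariableEnvironment: the lexical environment whose environment record holds the bindings created by variable statements and function declarations within this execution context
  • ThisBinding: the value associated with the this keyword in code associated with this execution context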

Execution context creation

As we know, evaluating global code, calling the eval function, or calling a function creates a new execution context, and these contexts form a stack.

When control enters an execution context, the execution context’s ThisBinding is set, its VariableEnvironment and initial LexicalEnvironment are defined, and declaration binding instantiation (10.5) is performed. The exact manner in which these actions occur depend on the type of code being entered.

When control enters an execution context, the following three actions occur:

  1. The value of the this keyword is set.
  2. Both the VariableEnvironment (which does not change) and the initial LexicalEnvironment (which may change, hence “initial”) are defined.
  3. Declaration binding instantiation is then performed.

The details of how these actions are carried out depend on the type of code (global code, eval code, function code).

PS: In general, the VariableEnvironment and LexicalEnvironment are initialized the same, the VariableEnvironment does not change, while the LexicalEnvironment may change during code execution.

So what happens to the execution context when control enters global code, eval code, or function code? See section 10.4, Establishing an Execution Context.

The linked list structure of lexical environments

As mentioned above, a lexical environment is a linked list structure.

As is well known, many people bring up the concept of the scope chain when explaining closures, which leads to VO (variable object) and AO (activation object). However, when I read the ECMAScript specification, I could not find these terms anywhere. I wondered: is the linked list of lexical environments what they mean by the scope chain? Are VO and AO outdated concepts? Yet these concepts seem to have become “authoritative”: search for related articles and they all talk about VO and AO. Do I really need to understand them?

In the ES5 specification, Table 9 (Internal Properties Only Defined for Some Objects) in section 8.6.2, Object Internal Properties and Methods, defines the [[Scope]] internal property of functions.

In this Table, we can clearly see that the Value Type Domain column of [[Scope]] is a Lexical Environment, which indicates that [[Scope]] is a Lexical Environment. Let’s move on to Description:

A lexical environment that defines the environment in which a Function object is executed. Of the standard built-in ECMAScript objects, only Function objects implement [[Scope]].

Reading carefully, [[Scope]] is the lexical environment in which the function object is executed, and only Function objects implement [[Scope]], which means that [[Scope]] is a function-specific property.

The scope chain is the chain of lexical environments that a function can access during execution. In a broad sense, the linked list of lexical environments covers not only the scope chain but also the lexical environments of WithStatement and catch clauses, and even the block-level lexical environments of ES6. So the ECMAScript specification is pretty rigorous!

Since there is no official explanation of VO and AO, these two relatively old concepts are a case of “a thousand readers, a thousand Hamlets”. I think they can be understood this way:

  • VO is a product of the lexical parsing phase
  • AO is a product of the code execution phase

Neither term appears in the ES5 or ES6 specification, so forget about VO and AO!

Closure

What is a closure?

At the beginning of this article, we mentioned that closures are made up of functions and lexical environments. Here is a reference to the closure explanation from Wikipedia.

In computer science, a closure, also known as a lexical closure or function closure, is a technique for implementing lexically scoped name binding in programming languages that support first-class functions. A closure is implemented as a structure that stores a function (usually its entry address) together with an associated environment (equivalent to a symbol lookup table). The environment is a mapping between symbols and values; it includes both bound variables (symbols bound within the function) and free variables (defined outside the function but referenced within it). Some functions may have no free variables. A major difference between closures and plain functions is that a closure's free variables are determined when the closure is captured, so it keeps working even when invoked outside the context in which it was captured.

This is a computer science explanation of what a closure is, and of course this applies to javascript as well!

It mentions the word “free variable,” which is the variable we focus on in the lexical context of closures.

How does Chrome define closures?

Chrome seems to have become the de facto standard on the front end, so how does Chrome identify closures? Let's explore!

function test() {
  var a = 1;
  function increase() {
    debugger;
    var b = 2;
    a++;
    return a;
  };
  increase();
}
test();

I put the debugger statement inside the inner function increase. When execution pauses there, the Scope pane of Chrome DevTools shows a Closure entry. The Closure is named after the outer function test and contains the variable a, which is defined in test. The variable b appears under Local as a local variable.

PS: For local variables, see localEnv.

What if I define another variable c in the outer function test, but don’t refer to it in the inner function increase?

function test() {
  var a = 1;
  var c = 3; // c is not in the closure
  function increase() {
    debugger;
    var b = 2;
    a++;
    return a;
  };
  increase();
}
test();

Checking in the debugger confirms that when the inner function increase executes, the variable c is not in the closure.

We can also verify that if the inner function increase does not refer to any variables in the outer function test, no closure will be generated.

So at this point, we can conclude that the necessary conditions for closures to occur are:

  1. There is function nesting;
  2. Nested inner functions must refer to variables defined in outer functions;
  3. Nested inner functions must be executed.

A favorite closure that interviewers ask about

During the interview, we are often asked about closure scenarios where the inner function references variables of the outer function and is the return value of the outer function. This is a special closure. Here’s an example:

function test() {
  var a = 1;
  function increase() {
    a++;
  };
  function getValue() {
    return a;
  }
  return {
    increase,
    getValue
  }
}
var adder = test();
adder.increase(); // a is incremented by 1
adder.getValue(); // 2
adder.increase();
adder.getValue(); // 3

In this example, we find that each time the adder.increase() method is called, the value of a is increased by one, that is, the variable a is kept in memory without being freed.
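One more observation follows from the fact that a new lexical environment is created each time a function is called: two calls of the outer function give two independent copies of a. The sketch below uses a hypothetical `createCounter` with the same shape as test above.

```javascript
function createCounter() {
  var a = 1;
  return {
    increase: function () { a++; },
    getValue: function () { return a; },
  };
}

var first = createCounter();
var second = createCounter();
first.increase();
first.increase();
// Each call of createCounter produced its own binding for a.
console.log(first.getValue(), second.getValue()); // 3 1
```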

So what’s behind this phenomenon?

Closure analysis

Since closures involve memory, V8’s GC (garbage collection) mechanism has to be mentioned.

The best-known GC strategy is reference counting, but modern mainstream VMs (including V8 and JVMs) do not use reference counting; they use reachability analysis instead.

Reference counting is easier to understand, so it often appears in textbooks, but it has a problem: objects that reference each other may never have their memory freed. Reachability analysis starts from the GC roots (such as the global object window) and searches for the live (reachable) objects. Unreachable objects are reclaimed, and the surviving objects go through further processing.
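The idea behind reachability analysis can be illustrated with a toy graph. This is only a sketch of the marking idea, not V8's actual algorithm; the `heap` object and the `markReachable` helper are invented for the example. Note how the two objects that reference each other are still found to be garbage.

```javascript
// Toy object graph: each key is an object, each value lists what it references.
const heap = {
  window: ["adder"],
  adder: ["increase", "getValue"],
  increase: ["testEnv"],
  getValue: ["testEnv"],
  testEnv: [],
  cycleA: ["cycleB"], // cycleA and cycleB reference each other,
  cycleB: ["cycleA"], // but nothing reachable from the root points at them
};

function markReachable(graph, roots) {
  const reachable = new Set();
  const pending = roots.slice();
  while (pending.length > 0) {
    const current = pending.pop();
    if (reachable.has(current)) continue;
    reachable.add(current);
    pending.push(...graph[current]);
  }
  return reachable;
}

const live = markReachable(heap, ["window"]);
const garbage = Object.keys(heap).filter((name) => !live.has(name));
console.log(garbage); // ["cycleA", "cycleB"]
```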

There is a very good article on the details of the V8 GC algorithm that I highly recommend; it is listed in the resources at the end of this article.

In the particular closure scenario we are concerned with, the reason the closure variables remain in memory is because the lexical context of the closure has not been freed. Let’s analyze the execution first.

function test() {
  var a = 1;
  function increase() {
    a++;
  };
  function getValue() {
    return a;
  }
  return {
    increase,
    getValue
  }
}
var adder = test();
adder.increase();
adder.getValue();

  1. Global code starts executing: the global execution context is created, the value of the this keyword is set to the window object, and the global environment is created. The global object holds variable and function declarations such as adder and test.
  2. The test function starts executing and control enters its execution context. During the execution of test, the variable a and the functions increase and getValue are declared. The return value is an object whose two properties refer to the functions increase and getValue.
  3. Control exits the execution context of test, the result of test is assigned to the variable adder, and the running execution context is restored to the global execution context.
  4. The increase method of adder is called, control enters the execution context of increase, and the code executes, incrementing the variable a by 1.
  5. Control exits the execution context of increase.
  6. The getValue method of adder is called; the process is similar to calling increase.

Even with some understanding of the whole execution process, it still seems hard to explain why the variable a in the closure is not collected by the GC. All that is clear is that increase and getValue depend on the variable a defined in test, and that alone is not a convincing reason.

So let’s ask the question, how does the code resolve the identifier a?

From the specification we know that identifiers are resolved through GetIdentifierReference(lex, name, strict), where lex is the lexical environment, name is the identifier name, and strict is the Boolean strict mode flag.
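The lookup can be sketched as a short recursive function. This is a simplified model of the algorithm in the specification: environments are plain objects of the form `{ record, outer }`, the reference it returns is an object literal, and the names `globalLex`, `testLex`, and `increaseLex` are assumptions made for the example.

```javascript
// Simplified model of GetIdentifierReference(lex, name, strict).
function getIdentifierReference(lex, name, strict) {
  if (lex === null) {
    // Unresolvable reference: the base value is undefined.
    return { base: undefined, name: name, strict: strict };
  }
  if (Object.prototype.hasOwnProperty.call(lex.record, name)) {
    return { base: lex.record, name: name, strict: strict };
  }
  // Not found here, so continue with the outer lexical environment.
  return getIdentifierReference(lex.outer, name, strict);
}

const globalLex = { record: { test: "function", adder: "object" }, outer: null };
const testLex = { record: { a: 1 }, outer: globalLex };
const increaseLex = { record: {}, outer: testLex };

const found = getIdentifierReference(increaseLex, "a", false);
console.log(found.base === testLex.record); // true
const missing = getIdentifierReference(increaseLex, "nope", false);
console.log(missing.base);                  // undefined
```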

So how is the identifier a resolved when the function increase executes? Let’s analyze it!

  1. First, let lex be the localEnv (the local environment) of the function increase, and resolve the identifier a in localEnv through GetIdentifierReference(lex, name, strict).
  2. According to the logic of GetIdentifierReference, the identifier a cannot be resolved in localEnv (a is obviously not declared in increase), so resolution moves on to the outer lexical environment of localEnv. That outer lexical environment is in fact the [[Scope]] internal property of the function increase (a conclusion I reached after reading the specification definitions many times), which is a “trimmed-down” version of the localEnv of the function test.
  3. Go back to the step that executed the function test. After test finishes executing, the bindings of all other variables in its localEnv can be released, but the binding of a cannot, because another lexical environment (the [[Scope]] internal property of increase) still references a.
  4. The lexical environment of the closure is not the same as the localEnv of a running test. Each time test executes, its localEnv is completely reinitialized. After the execution context of test exits, the lexical environment of the closure keeps only the part of the bindings in its environment record that other lexical environments still reference, which is why I call it a “trimmed-down” version.

Some of you may ask (as I asked myself): adder.increase() is called in the global execution context, so why is its outer lexical environment still the trimmed-down localEnv of test?

Which brings us back to the definition of an external lexical context reference, which refers to the lexical context that logically surrounds the internal lexical context!

The outer reference of a (inner) Lexical Environment is a reference to the Lexical Environment that logically surrounds the inner Lexical Environment.
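A small sketch shows that the call site plays no part in resolution. The names `lexicalDemo`, `callElsewhere`, and `label` are invented for the example; both functions declare a variable called `label`, and the inner function picks up the one from the environment it was created in.

```javascript
function lexicalDemo() {
  var label = "from lexicalDemo";
  return function whoAmI() {
    return label;
  };
}

function callElsewhere(fn) {
  var label = "from callElsewhere"; // never consulted when fn runs
  return fn();
}

var answer = callElsewhere(lexicalDemo());
console.log(answer); // from lexicalDemo
```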

Pros and cons of closures

The articles on the Internet cover this in detail, so I won’t give examples here. In general, closures have these advantages:

  • Variables stay in memory, which is useful for certain needs, such as counters.
  • They build a bridge that allows outside code to access the internal variables of a function.
  • Privatization: closures can implement private variables and, to some extent, solve the problem of naming conflicts.

Closures are a double-edged sword, with one obvious drawback:

  • Variables may stay resident in memory and cannot be reclaimed by the GC, which can lead to memory leaks.

Summary

This article starts with the ECMAScript specification and reveals the mystery of closures step by step. From the definition of closures, we learn the lexical environment, which leads to the concepts of environmental records, external lexical references, and execution context. After having doubts about VO, AO and other old concepts, I chose to look for clues from the specifications and finally got a clue. When interpreting closures, I searched for various materials, started from the general definition of closures in computer science, mapped some key concepts to javascript, and combined with some knowledge points of GC, finally got the answer.

It took a while to write this article because some descriptions have to be objective and rigorous when it comes to the ECMAScript specification. There must be subjective components in the process of interpretation. If there are any mistakes, please point them out!

Finally, I highly recommend reading the ECMAScript specification when you have time. Reading a language specification is a great way to understand the basics of the language better. For example, if we are not sure how an operator is evaluated, the safest thing is to look it up directly in the specification!

At the end is a picture that will help you understand the ECMAScript specification.


Resources

  • ECMAScript specification
  • Wikipedia: First-class objects
  • First-class functions
  • Wikipedia: Closures
  • Reading the V8 GC Log (2): The partition of memory in and out of the heap and the GC algorithm
  • What are the mainstream garbage collection mechanisms?
  • V8 Memory Analysis
  • How does reference counting maintain all object references in garbage collection?
  • A tour of V8: Garbage Collection