Generator from shallow to deep (1)

Preface

Generator functions are an asynchronous flow-control solution introduced in ES6. Before them, asynchronous programming relied on callback functions, event listeners, publish/subscribe, Promises, and so on. On closer inspection, though, those earlier solutions are all still built on callbacks and do not change how asynchronous code is written at the syntactic level.

Unlike a normal function, a generator function can be paused during execution and later resumed from the point where it paused. Typically the function gives up control while an asynchronous operation is in flight and resumes at the same place once the operation completes. This new syntax makes it easier to write asynchronous tasks in a synchronous style.

We have previously written about Promise-based solutions and their internal implementation. Following on from that, this article covers iterators, generator function syntax, the yield operator, use in asynchronous scenarios, common automatic executors, Babel transpilation, and related topics.

Note that "Generator Function" is referred to here as a generator function, and in some places is shortened to just "generator".

The iterator

Before looking at generator functions, it is worth looking at iterators. An iterator is a special object, designed specifically for iteration, that has a next() method. Each call to next() returns an object with value and done properties. The IteratorResult Interface section of the ECMAScript specification describes them as follows:

done (Boolean)
- true if the iterator has reached the end of the iteration sequence
- false if the iterator can produce further values in the sequence
value (any type)
- If done is false, value is the current iteration element
- If done is true, value is the return value of the iterator, if it supplied one
- undefined if no return value was supplied

A simple example of creating a compliant iterator interface using ECMAScript 5 syntax:

function createIterator (items) {
  var i = 0

  return {
    next: function () {
      var done = (i >= items.length)
      var value = !done ? items[i++] : undefined

      return {
        done: done,
        value: value
      }
    }
  }
}

var iterator = createIterator([1, 2])

console.log(iterator.next())    // {done: false, value: 1}
console.log(iterator.next())    // {done: false, value: 2}
console.log(iterator.next())    // {done: true, value: undefined}

A standard for loop uses variables such as i or j as an internal index, incrementing or decrementing it on each iteration to keep the index correct. Loop syntax is simpler than an iterator, but the code becomes much more complex when several index variables are needed to track nested loops. Iterators remove some of that complexity and reduce errors in loops.

In addition, iterators provide a consistent interface, conforming to the iterator protocol, that unifies how iterables are traversed. For example, the for...of statement can loop over any iterable that supplies an iterator (such as Array, Map, Set, and String).
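
As a small illustration of that shared interface, the sketch below walks an array once by calling its Symbol.iterator method by hand (that method is described in the protocol section near the end of this article) and once with for...of. The variable names are only illustrative.

```javascript
const letters = ['a', 'b']

// Obtain the built-in iterator and drive it manually.
const it = letters[Symbol.iterator]()
const manual = []
let step = it.next()
while (!step.done) {
  manual.push(step.value)
  step = it.next()
}

// for...of performs the same next() calls behind the scenes.
const viaForOf = []
for (const letter of letters) {
  viaForOf.push(letter)
}

console.log(manual)    // ['a', 'b']
console.log(viaForOf)  // ['a', 'b']

// Strings and Sets are iterable in the same way.
console.log([...'hi'])                // ['h', 'i']
console.log([...new Set([1, 1, 2])])  // [1, 2]
```

Both loops see the same values because for...of is defined in terms of the iterator's next() method.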

The generator

A generator is a function that returns an iterator, indicated by the function keyword followed by an asterisk (*), plus the new yield keyword. Rewrite the above example as a generator function.

function *createIterator (items) {
  for (let i = 0; i < items.length; i++) {
    yield items[i]
  }
}

const iterator = createIterator([1, 2])

console.log(iterator.next())    // {done: false, value: 1}
console.log(iterator.next())    // {done: false, value: 2}
console.log(iterator.next())    // {done: true, value: undefined}

In the code above, the asterisk (*) marks createIterator as a generator function, and the yield keyword specifies the values returned by the iterator's next() method and the order in which they are returned.

Calling a generator function does not run its body immediately; it returns an iterator object for the generator instead. The first call to the iterator's next() method runs the body up to the first yield expression and pauses there. Each further call to next() resumes from the current yield and runs until it pauses at the next one.

next() returns an object with value and done properties. value holds the value produced by the yield expression, and done indicates whether the generator function has finished executing, that is, whether no further yield remains.
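
The pause-and-resume order can be made visible with a small sketch. The steps generator and the log array are hypothetical names used only for this illustration.

```javascript
const log = []

function *steps () {
  log.push('start')
  yield 'first'
  log.push('resumed')
  yield 'second'
  log.push('end')
}

const it = steps()    // nothing inside steps() has run yet
log.push('created')

const r1 = it.next()  // runs up to the first yield
const r2 = it.next()  // resumes, runs up to the second yield
const r3 = it.next()  // resumes, runs to the end of the function

console.log(log)  // ['created', 'start', 'resumed', 'end']
console.log(r1)   // { value: 'first', done: false }
console.log(r2)   // { value: 'second', done: false }
console.log(r3)   // { value: undefined, done: true }
```

Note that 'created' is logged before 'start': the body only begins running on the first next() call.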

Generator related methods are as follows:

Generator.prototype.next(): returns a value produced by a yield expression
Generator.prototype.return(): returns the given value and finishes the generator
Generator.prototype.throw(): throws an error into the generator
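
A minimal sketch of return() and throw() in use; the counter generator is a hypothetical example, not taken from any library.

```javascript
function *counter () {
  try {
    yield 1
    yield 2
    yield 3
  } catch (err) {
    // An error injected with throw() surfaces at the paused yield.
    yield 'caught: ' + err.message
  }
}

// return() finishes the generator early with the given value.
const a = counter()
console.log(a.next())          // { value: 1, done: false }
console.log(a.return('stop'))  // { value: 'stop', done: true }
console.log(a.next())          // { value: undefined, done: true }

// throw() resumes the generator by throwing at the paused yield.
const b = counter()
console.log(b.next())                    // { value: 1, done: false }
console.log(b.throw(new Error('boom')))  // { value: 'caught: boom', done: false }
console.log(b.next())                    // { value: undefined, done: true }
```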

Generator functions inherit from Function and Object. Unlike normal functions, generator functions cannot be called as constructors and simply return generator objects. The complete generator object diagram looks like this:

The yield keyword

The yield keyword pauses and resumes a generator function. The value of the expression after yield is returned to the generator's caller, so yield can be thought of as the generator version of the return keyword. yield can be followed by any value or expression.

Once a yield expression is reached, the generator's code pauses until the generator's next() method is called. Each call to next() resumes execution at the statement immediately after the yield, and execution continues until the next yield, an exception thrown inside the generator, the end of the generator function, or a return statement is reached.

Note that yield can only be used inside a generator; using it anywhere else is a syntax error. This holds even for a function nested inside a generator.

function *createIterator (items) {
  items.forEach(item => {
    // Syntax error
    yield item + 1
  })
}
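
One way around this restriction, sketched here as a suggestion rather than the only option, is to replace the forEach callback with a for...of loop so that yield sits directly in the generator body.

```javascript
function *createIterator (items) {
  // yield is used directly in the generator body, so this is valid.
  for (const item of items) {
    yield item + 1
  }
}

console.log([...createIterator([1, 2])])  // [2, 3]
```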

In addition, yield* can be used to delegate to another generator, that is, to call another generator function from inside a generator function.
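
A short sketch of delegation follows; inner and outer are hypothetical names chosen for the example.

```javascript
function *inner () {
  yield 'a'
  yield 'b'
  return 'inner result'
}

function *outer () {
  yield 1
  // Delegate to inner(): its yields are passed through to our caller,
  // and its return value becomes the value of the yield* expression.
  const result = yield* inner()
  yield result
  // yield* also accepts any iterable, such as an array.
  yield* [2, 3]
}

console.log([...outer()])  // [1, 'a', 'b', 'inner result', 2, 3]
```

The return value of inner() is not yielded automatically; it only appears in the output because outer() yields it explicitly.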

The next method

Generator.prototype.next() returns an object with done and value properties, whose meanings are the same as described in the iterator section. More interesting is that next() can take an argument, which becomes the result of the yield expression at which the generator is currently paused. If no argument is passed, that yield expression evaluates to undefined. For example:

function *createIterator (items) {
  let first = yield 1
  let second = yield first + 2
  yield second + 3
}

let iterator = createIterator()

console.log(iterator.next())    // {value: 1, done: false}
console.log(iterator.next(4))   // {value: 6, done: false}
console.log(iterator.next())    // {value: NaN, done: false}
console.log(iterator.next())    // {value: undefined, done: true}

One exception is that whatever arguments are passed to the next() method are discarded the first time it is called. Because the argument passed to the next() method replaces the return value of the previous yield, and no yield statement is executed until the first call to the next() method, passing the argument on the first call is meaningless.

In fact, the ability to pass values inside an iterator is very important. For example, in an asynchronous process, the generator function suspends at the yield keyword, and after the asynchronous operation is complete, the current asynchronous value must be passed for use by subsequent iterator processes.

Asynchronous flow control

Generator functions can pause and resume execution, and next() can exchange data in and out of functions, making it a complete solution for asynchronous programming. Take an asynchronous scenario as an example:

function *gen () {
  const url = 'https://api.github.com/user/github'
  const result = yield fetch(url)
  console.log(result.bio)
}

In the above code, the Generator function encapsulates an asynchronous request operation. The above code looks a lot like a synchronous operation, except for the addition of the yield keyword. However, running the above code also requires a piece of executor code.

const g = gen()
const result = g.next()

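result.value.then(function (response) {
  // response.json() itself returns a Promise, so wait for it to resolve
  return response.json()
}).then(function (data) {
  g.next(data)
})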

The executor code calls the generator function to obtain the iterator object, uses next() to run the first stage of the asynchronous task, and then calls next() again inside the then callbacks of the Promise returned by fetch to run the second stage. As you can see, although generator functions express asynchronous operations concisely, managing the flow is inconvenient and requires extra hand-written executor code.

To avoid this manual flow management, an automatic executor function is usually introduced. If every operation after yield in a generator function is synchronous, it is easy to check recursively whether done is true and keep running until the function ends. Asynchronous operations are more complex: the iterator's next(data) method must be called only after the asynchronous operation completes, passing in its result and resuming execution. Knowing when to call next() therefore requires agreeing in advance on the form the asynchronous operation takes.
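
For the purely synchronous case, the recursive check on done can be sketched as follows. runSync is a hypothetical helper written for this illustration; it feeds each yielded value straight back into the generator.

```javascript
// Hypothetical helper: drives a generator whose yielded values are plain
// (synchronous) values, passing each one back in as the yield result.
function runSync (genFn) {
  const gen = genFn()

  function step (input) {
    const result = gen.next(input)
    // When done is true, result.value is the generator's return value.
    return result.done ? result.value : step(result.value)
  }

  return step()
}

const total = runSync(function *() {
  const a = yield 1
  const b = yield a + 2
  return a + b
})

console.log(total)  // 4
```

The asynchronous executors discussed next have the same shape; the only difference is that the recursive call is deferred until the asynchronous operation finishes.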

Common approaches to automatic flow management are the Thunk function pattern and the co module; co supports both Thunk functions and Promises as asynchronous operations. Before explaining automatic flow management, here is a brief look at Thunk functions.

In JavaScript, a Thunk function replaces a multi-argument function with a single-argument function that accepts only a callback. The conversion is similar to currying: a function that takes several arguments is turned into functions that each take one. Take Node's asynchronous file read as an example:

// Normal version of readFile (multiple arguments)
fs.readFile(fileName, callback)

// Thunk version of readFile (single argument)
const Thunk = function (fileName) {
  return function (callback) {
    return fs.readFile(fileName, callback)
  }
}

const readFileThunk = Thunk(fileName)
readFileThunk(callback)

Any function whose arguments include a callback can be written as a Thunk function. A simple Thunk converter, similar to currying, is shown below. For production use, the thunkify module is recommended because it handles more edge cases.

// Thunk converter
const Thunk = function (fn) {
  return function (...args) {
    return function (callback) {
      return fn.call(this, ...args, callback)
    }
  }
}

// Generate the Thunk version of fs.readFile and call it
const readFileThunk = Thunk(fs.readFile)
readFileThunk(fileA)(callback)

Automatic process management

To build automatic flow management on Thunk functions, we adopt a convention: the expression after every yield evaluates to a function that takes only a callback, that is, a Thunk function. A simple Thunk-based automatic executor for generators looks like this:

function run (fn) {
  var gen = fn()
  
  function next (err, data) {
    var result = gen.next(data)
    
    if (result.done) return

    result.value(next)
  }
  
  next()
}

In the executor above, the generator first runs to the first yield expression, which produces a function that takes only a callback; the executor calls that function with the recursive next() function as the callback. When the asynchronous operation completes and invokes the callback, execution of the generator function resumes.
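
To see the executor at work without touching the file system, here is a self-contained sketch. It repeats the run() function from above with one addition that is my own assumption (errors are passed into the generator with gen.throw()), and delay is a hypothetical Thunk built on setTimeout.

```javascript
// Same shape as the run() executor above, repeated so the sketch is
// self-contained; forwarding errors with gen.throw() is an addition.
function run (fn) {
  const gen = fn()

  function next (err, data) {
    const result = err ? gen.throw(err) : gen.next(data)
    if (result.done) return
    result.value(next)
  }

  next()
}

// Hypothetical Thunk: calls back with `value` after `ms` milliseconds.
function delay (ms, value) {
  return function (callback) {
    setTimeout(function () { callback(null, value) }, ms)
  }
}

const seen = []

run(function *() {
  const a = yield delay(10, 'A')
  seen.push(a)
  const b = yield delay(10, 'B')
  seen.push(b)
  console.log(seen)  // ['A', 'B']
})
```

The generator body reads top to bottom like synchronous code, while each yield actually waits for a timer callback.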

The other approach is an automatic execution mechanism based on Promise objects. The co module in fact supports both modes of automatic flow management. With Thunk functions, the asynchronous operation is wrapped as a Thunk and control is handed back to the generator in the callback; with Promises, the asynchronous operation is wrapped as a Promise and control is handed back in the then callback.

Following the example above, wrap the fs module's readFile method so that it returns a Promise.

const fs = require('fs')

const readFile = function (fileName) {
  return new Promise(function (resolve, reject) {
    fs.readFile(fileName, function (err, data) {
      if (err) return reject(err)

      resolve(data)
    })
  })
}

Whereas the Thunk pattern recurses in the callback, the Promise-based automatic executor recurses in the then method. A simple implementation:

function run (gen) {
  const g = gen()

  function next (data) {
    var result = g.next(data)

    if (result.done) return result.value

    result.value.then((function (data) {
      next(data)
    }))
  }

  next()
}

A glance through the co documentation shows that many kinds of values are supported after yield: Promises, Thunks, arrays (of Promises), objects (of Promises), generators, and generator functions. The implementation principle is broadly the same as above, so the co source code is not reproduced here. More information can be found at https://github.com/tj/co.
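
As a rough idea of what a slightly fuller Promise-based executor might look like, here is a simplified sketch. It is not the co source code: runAsync is a hypothetical name, and the sketch only handles Promises and plain values after yield. It returns a Promise for the generator's return value and throws rejections back into the generator.

```javascript
// Hypothetical simplified executor; this is not the co source code.
function runAsync (genFn) {
  return new Promise(function (resolve, reject) {
    const gen = genFn()

    function step (method, arg) {
      let result
      try {
        result = gen[method](arg)  // gen.next(value) or gen.throw(error)
      } catch (err) {
        return reject(err)         // uncaught error inside the generator
      }
      if (result.done) return resolve(result.value)

      // Promise.resolve() lets plain values be yielded as well.
      Promise.resolve(result.value).then(
        function (value) { step('next', value) },
        function (error) { step('throw', error) }
      )
    }

    step('next')
  })
}

const finished = runAsync(function *() {
  const a = yield Promise.resolve(1)
  let b
  try {
    yield Promise.reject(new Error('failed'))
  } catch (err) {
    b = err.message  // the rejection is thrown at the yield
  }
  return [a, b]
})

finished.then(function (value) {
  console.log(value)  // [1, 'failed']
})
```

Throwing rejections into the generator is what lets ordinary try/catch handle asynchronous failures.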

The state machine

Like a Promise, a generator instance is a finite state machine. According to the Properties of Generator Instances section of the ECMAScript specification, a generator instance is in one of five states: undefined, suspendedStart, suspendedYield, executing, or completed.

Semantically this is easy to understand: the internal state changes as the generator function runs. The details of how the generator's internal state changes are not covered here; they will be explained in the next article together with the ES5 runtime source code for generators.
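
Without going into the internals, some of these states can be observed from the outside. The sketch below is only an illustration: the state itself is an internal slot that code cannot read directly, so the comments name the state the specification would assign at each point.

```javascript
let self

function *gen () {
  try {
    // The generator is in the "executing" state here, so re-entering fails.
    self.next()
  } catch (err) {
    yield err instanceof TypeError
  }
}

self = gen()              // suspendedStart: no code has run yet
console.log(self.next())  // { value: true, done: false }      (now suspendedYield)
console.log(self.next())  // { value: undefined, done: true }  (now completed)
console.log(self.next())  // { value: undefined, done: true }  (stays completed)
```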

Regenerator converter

Because browser support is inconsistent and not every environment supports generator functions natively, the facebook/regenerator Babel plugin is generally used to compile them to ES5 syntax for compatibility with older browsers. Regenerator provides a transform package and a runtime package, for Babel transcoding and runtime support respectively.

Unlike Promise, which can run once a polyfill is included, generator functions are a new syntactic construct and cannot run in older environments just by adding runtime code. At compile time, the abstract syntax tree (AST) has to be transformed into ES5 constructs that match the runtime code; at run time, runtime functions are added to help execute the compiled statements.

The Regenerator website provides an interactive demo. A simple example is shown below: first the source code, then the output after AST transcoding.

function *gen() {
  yield 'hello world'
}

var g = gen()

console.log(g.next())

var _marked = regeneratorRuntime.mark(gen);

function gen() {
  return regeneratorRuntime.wrap(function gen$(_context) {
    while (1) {
      switch (_context.prev = _context.next) {
        case 0:
          _context.next = 2;
          return 'hello world';

        case 2:
        case "end":
          return _context.stop();
      }
    }
  }, _marked, this);
}

var g = gen();
console.log(g.next());

The regenerator-transform plugin handles the AST transformation, and the runtime package provides the regeneratorRuntime object used at run time. How Babel transcodes and how the runtime works will be covered in a later article. See the facebook/regenerator project for the source code.

A note on the difference between babel-polyfill, babel-runtime, and babel-plugin-transform-runtime. Babel itself is only responsible for converting ES syntax and does not convert new objects or methods such as Promise and Array.from. babel-polyfill or babel-runtime can be used to provide implementations of those objects.

The difference between babel-polyfill and babel-runtime is that the polyfill introduces new global objects, modifying and polluting the existing global scope, whereas the runtime extracts the built-in objects the developer depends on into separate modules that are brought in through module imports, avoiding pollution of the global scope.

The difference between babel-runtime and babel-plugin-transform-runtime is that the former is the functional module actually imported into project code, while the latter is a build-time transform that extracts runtime helper code and references what it needs from babel-runtime.

For more, see two articles: one on the use and performance optimization of babel-polyfill, and one on the use and performance optimization of babel-runtime.

Relationship between Generator and coroutine

Ruan's book describes this relationship; see its section on Generator functions. Front-end work rarely touches on processes, threads, and coroutines, so the topic is not repeated here.

Iterable protocol and iterator protocol

Iterators were covered above; this section adds an explanation of the iterable protocol and the iterator protocol.

The iterable protocol lets JavaScript objects define their own iteration behavior, such as which values are looped over in a for...of construct. Some common built-in types, such as Array and Map, are iterable and have a default iteration behavior. Note that a plain Object cannot be traversed with for...of by default.

To be iterable, an object must implement the @@iterator method: the object (or an object on its prototype chain) must have a Symbol.iterator property whose value is a zero-argument function that returns an object conforming to the iterator protocol.

The iterator protocol defines a standard way to produce a sequence of values: the iterator object must implement a next() method, and next() must return an object with done and value properties. Both properties have the meanings explained in detail earlier.

In short, an iterable must satisfy the iterable protocol by having a Symbol.iterator method, and that method returns an object that conforms to the iterator protocol by having a next method.
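
Because a generator object already has a conforming next() method, a generator method is a compact way to satisfy both protocols at once. The range object below is a hypothetical example.

```javascript
// Hypothetical iterable: a numeric range whose iterator is produced by a
// generator method, so next(), done and value come for free.
const range = {
  from: 1,
  to: 3,
  *[Symbol.iterator] () {
    for (let n = this.from; n <= this.to; n++) {
      yield n
    }
  }
}

console.log([...range])  // [1, 2, 3]

for (const n of range) {
  console.log(n)         // 1, then 2, then 3
}

// The object returned by the generator method satisfies the iterator protocol.
const it = range[Symbol.iterator]()
console.log(typeof it.next)  // 'function'
console.log(it.next())       // { value: 1, done: false }
```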

For example, Object has no iterator method by default and so cannot be traversed with for...of. We can modify the Object prototype to add an iterator method that exposes each key and value.

Object.prototype[Symbol.iterator] = function () {
  let i = 0 
  let done, value
  const items = Object.entries(this)

  return {
    next: function () {
      done = (i >= items.length)
      value = done ? undefined : {
        key: items[i][0],
        value: items[i][1]
      }
      i += 1

      return {
        done: done,
        value: value
      }
    }
  }
}

const obj =  {
  name: 'spurs',
  age: '23'
}

for (let item of obj) {
  console.log(item)
}

// {key: "name", value: "spurs"}
// {key: "age", value: "23"}
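
Modifying Object.prototype affects every object in the program, so in practice a less invasive variant may be preferable. The sketch below is one such alternative, assuming a hypothetical helper named pairs: a standalone generator function that produces the same { key, value } items.

```javascript
// Hypothetical helper: yields { key, value } pairs for an object's own
// enumerable properties without touching Object.prototype.
function *pairs (obj) {
  for (const [key, value] of Object.entries(obj)) {
    yield { key, value }
  }
}

const player = { name: 'spurs', age: '23' }

for (const item of pairs(player)) {
  console.log(item)
}
// { key: 'name', value: 'spurs' }
// { key: 'age', value: '23' }
```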

Conclusion

This has been a long-winded, entry-level article, and many points were only touched on briefly. It focuses solely on introducing generator syntax; an analysis of generator compilation and the runtime source code will follow.

Currently, the best solution for asynchronous flow is the async/await combination, which has clearer semantics and needs no additional automatic execution module. But async/await is essentially syntactic sugar over generator functions, and understanding generator functions well gives insight into how asynchronous flow control evolved from its roots.
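
The correspondence is easy to see side by side. In the sketch below, fetchValue is a hypothetical stand-in for a real asynchronous request; the generator version would still need an executor such as co to drive it, while the async version runs on its own.

```javascript
// Hypothetical async helper standing in for a real request.
function fetchValue (value) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(value) }, 10)
  })
}

// Generator style: needs an executor (such as co) to drive it.
function *viaGenerator () {
  const a = yield fetchValue(1)
  const b = yield fetchValue(2)
  return a + b
}

// async/await style: the engine drives it and returns a Promise.
async function viaAsync () {
  const a = await fetchValue(1)
  const b = await fetchValue(2)
  return a + b
}

const sum = viaAsync()
sum.then(function (value) {
  console.log(value)  // 3
})
```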

Finally, if there are mistakes, please correct them.

The appendix

Reference documentation

ECMA262 Generator
MDN Generator
MDN yield

Reference books

Deep Understanding of ES6
Introduction to ES6 Standards (3rd Edition)
