Background: Lazy evaluation?

Take a look at an example from the lazy.js home page:

var people = getBigArrayOfPeople();
var results = _.chain(people)
  .pluck('lastName')
  .filter(function(name) { return name.startsWith('Smith'); })
  .take(5)
  .value();Copy the code

In the example above, you want to find five lastnames starting with Smith among a very, very large number of people. If each step of the above procedures for pluck() and filter() produced a temporary array, that is, a looping, processing of the data returned from the previous step, the whole lookup process could take a long time.

Instead of using the above notation, for purely performance reasons, it can be handled like this:

var results = []; for (var i = 0; i < people.length; ++i) { var lastName = people[i].lastName; if (lastName.startsWith('Smith')) { results.push(value); if (results.length === 5) { break; }}}Copy the code

First, for raw data, only one loop is done, and all calculations are done in a single loop. Second, since only five final results are needed, the loop is terminated once five results are achieved.

This approach should result in significant performance gains for large arrays.

However, if you have to write this code by hand every time you encounter a situation like this for performance purposes, it is a bit cumbersome, and the code is not clear, easy to understand and maintain. The first example is better written, obviously easier to read, but there are some performance concerns.

So, wouldn’t it be nice if you could combine the writing in the first example with the performance improvement in the second example?

Now, lazy evaluation.

Lazy evaluation – wikipedia en.wikipedia.org/wiki/Lazy_e…

In programming language theory, lazy evaluation, or call-by-need is an evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations (sharing).

To use my term, lazy evaluation means that an expression does not evaluate a value when it is not needed, but only when it is.

JavaScript does not support such features at the language level, but rather through some technology to simulate and implement.

First of all, it cannot be an expression, which is evaluated immediately in JS. So, as in the first example, wrap the evaluation as a function that is called only when the value is required.

Then, latency calculations need to be implemented through “clever” designs and conventions. For the first example, the pluck(), filter(), take() methods, when called, do not evaluate the raw data directly, but rather defer the calculation until a method like value() is called.

Before we look at Lazy evaluation implementations of lazy.js, let’s summarize the goals discussed here with Lazy evaluation techniques:

  • Good code readability
  • Good code execution performance

For Lazy evaluation implementation in lazy.js, it can be summarized as:

  • Collecting computing requirements
  • Delay and optimize the execution of calculations

Analysis: Lazy evaluation implementation of lazy.js

Unlike Underscore and Lo-dash, lazy.js only provides chained, Lazy evaluation modes, which also makes Lazy evaluation implementations “pure.” The implementation of lazy.js is then analyzed.

Take a look at the code that uses lazy.js (from simply-lazy-demo) :

Lazy([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
  .map(i => i * 2)
  .filter(i => i <= 10)
  .take(3)
  .each(i => print(i))

// output:
// 2
// 4
// 6Copy the code

Note: Function definitions use ES6’s “=>” syntax for writing methods.

This is a slightly boring example of multiplying 1-10 by 2, filtering out values less than 10, and then printing out the first three results.

If you monitor the execution of the above code (see Proxy for callback function execution counts), you get the following results:





demo – simply-lazy

Both the map() and filter() procedures are executed only three times.

Focusing on the map() call, it is obvious that the calculation is not performed immediately, because the code does not foresee the filter() and take(3) that follow, so it is not smart enough to know that it only needs to perform three computations. If the calculation is not performed, what is being done here and what is being returned?

Lazy() returns an object of type Sequence, which contains the raw data. After the map() method is executed, a new Sequence object is generated that links to the previous Sequence object, records the calculation to be performed by the current step (I => I * 2), and then returns the new Sequence object. The subsequent filter() and take() procedures are similar.

The above code could also be written as:

var seq0 = Lazy([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) var seq1 = seq0. The map (I = > I * 2) var seq2 = seq1) filter (I = > I < = 10) var seq3 = seq2. Take (3) / / evaluated seq3. Each (I = > print(i))Copy the code

At the last evaluation, there is already a Sequence:

seq0 <- seq1 <- seq2 <- seq3Copy the code

None of the seQs on the chain performs the evaluation before SeQ3 calls the each() method to perform the evaluation. So how does that work? As in the first and second example, in a loop, starting with the last SEQ on the chain, the calculation of the current SEQ is applied before retrieving the value from the previous SEQ and returning it to the caller (that is, the next SEQ).

The process of obtaining the first final result value is as follows:

[1] Seq3 calls seq2 to get the first value [2] SeQ2 calls seq1 to get the first value [3] SeQ1 directly takes the original value corresponding to the source attribute of seq0 (the first value is 1) [4] SeQ1 gets 1, Apply the calculation (I => I * 2), get 2, return to the caller (seq2) [5] Seq2 gets 2, apply the calculation (I => I < 10), meet the condition, return 2 to the caller (seq3) [6] Seq3 gets 2, Apply the calculation (whether the number of returned values is less than 3) to satisfy the condition and return 2 to the caller (seq3.each)Copy the code

Finally, seq3.each() gets the first value (2), evaluates it (I => print(I)), and prints it. It then continues to get the next value until the call terminates in some procedure (in this case, when three values have been returned in the take() calculation).

In addition to map(), filter(), and take(), lazy.js provides more computing modes, but the Lazy calculation process is as follows:

  • Lazy() returns the initial Sequence object
  • The lazy calculation method returns a new Sequence object, internally recording the previous Sequence object and the current calculation logic
  • The evaluation method starts from the current Sequence object and obtains the value of the next Sequence object
  • Sequence objects apply their own calculation logic before returning the value obtained from the previous Sequence object to the next Sequence

The Sequence design of lazy. js makes the process of value and calculation form a long chain, and a single node on the chain only needs to complete the upload and transfer and apply its own calculation logic, which does not need to know the whole execution process. Each node performs its own duties and finally realizes lazy calculation.

Code can be more than a thousand words. Here’s a closer look at lazy.js Lazy computing by analyzing the simplified version of lazy.js (simply-lazy).

Implementation: a simplified version of lazy.js

Lazy.js supports data that can be evaluated, not just arrays, but also strings and objects. Simply-lazy only supports arrays, and only three lazy evaluation methods are supported:

  • map()
  • filter()
  • take()

Look at the implementation of these three methods separately.

(a) a map

The Sequence object directly returned by Lazy() is a special one. Unlike the other Sequence objects on the chain, it is already the root node that records its own raw data and contains no computational logic. So, iterating over this object is essentially iterating over the raw data, and it doesn’t involve lazy computing.

The simple-lazy retains the lazy.js naming, even though it only supports arrays, and gives this type ArrayWrapper:

function ArrayWrapper(source) {
  var seq = ArrayLikeSequence()
  seq.source = source
  seq.get = i => source[i]
  seq.length = () => source.length
  seq.each = fn => {
    var i = -1
    var len = source.length
    while (++i < len) {
      if (fn(source[i], i) === false) {
        return false
      }
    }
    return true
  }
  seq.map = mapFn => MappedArrayWrapper(seq, mapFn)
  seq.filter = filterFn => FilteredArrayWrapper(seq, filterFn)
  return seq
}Copy the code

In order to simplify the code, the lazy.js pattern of constructing different classes for different types of Sequence objects is not adopted. Sequence objects can be treated as ordinary objects, but need to support several interface methods by convention. Like ArrayWrapper() here, the returned SEQ object, even though it comes from ArrayLikeSequence(), already implements most of the interfaces itself.

The Lazy function returns an ArrayWrapper object like this:

function Lazy(source) {
  if (source instanceof Array) {
    return ArrayWrapper(source);
  }
  throw new Error('Sorry, only array is supported in simply-lazy.')
}Copy the code

If you evaluate between the objects returned by Lazy(), you can see that you are actually performing the process of traversing the raw data defined in ArrayWrapper.

Let’s look at seq.map(). Map () of ArrayWrapper returns a Sequence object of type MappedArrayWrapper:

function MappedArrayWrapper(parent, mapFn) { var source = parent.source var length = source.length var seq = ArrayLikeSequence() seq.parent = parent seq.get  = i => (i < 0 || i >= length) ? undefined : mapFn(source[i]) seq.length = () => length seq.each = fn => { var i = -1; while (++i < length) { if (fn(mapFn(source[i], i), i) === false) { return false } } return true } return seq }Copy the code

This is also a special Sequence (hence no Sequence in the name), because it knows that its upper-level Sequence object is the original Sequence object with no computational logic, so it can get the original data directly from parent. Source.

The evaluation is performed by applying the passed mapFn directly to the raw data.

It is up to you to apply map() to the resulting object if the other lazy methods are performed first. There are two related types of simply-lazy:

  • MappedSequence
  • IndexedMappedSequence

MappedSequence is a more general type, corresponding to the type of the Sequence obtained by map() :

function MappedSequence(parent, mapFn) {
  var seq = new Sequence()
  seq.getIterator = () => {
    var iterator = parent.getIterator()
    var index = -1
    return {
      current() { return mapFn(iterator.current(), index) },
      moveNext() {
        if (iterator.moveNext()) {
          ++index
          return true
        }
        return false
      }
    }
  }
  seq.each = fn => parent.each((e, i) => fn(mapFn(e, i), i))
  return seq
}Copy the code

The evaluation (each) is performed indirectly by calling each() of its parent Sequence. The getIterator() interface is also defined so that its descendant Sequence can get an iterator from it and iterate over the Sequence. Iterators are also a convention protocol in lazy.js, which implements objects that support both current() and moveNext() interface methods. From the iterator logic, it can be seen that when the MappedSequence object is the parent of other sequences, if “upload down” is implemented.

IndexedMappedSequence is simpler to implement, and its main functions come from inheritance:

  function IndexedMappedSequence(parent, mapFn) {
    var seq = ArrayLikeSequence()
    seq.parent = parent
    seq.get = (i) => {
      if (i < 0 || i >= parent.length()) {
        return undefined;
      }
      return mapFn(parent.get(i), i);
    }
    return seq
  }Copy the code

The IndexedMappedSequence provides only the ability to get elements at a specific index position; the rest is handled by ArrayLikeSequence.

ArrayLikeSequence, in turn, inherits from Sequence:

function ArrayLikeSequence() {
  var seq = new Sequence()
  seq.length = () => seq.parent.length()
  seq.map = mapFn => IndexedMappedSequence(seq, mapFn)
  seq.filter = filterFn => IndexedFilteredSequence(seq, filterFn)
  return seq
}Copy the code

The Sequence implements processing in a more general sense.

The map-related Sequence types described above have different implementations and are indeed somewhat convoluted. However, no matter how it is implemented, the core logic remains the same, which is to apply mapFn to the value of its superior Sequence and then return the resulting value to the subordinate Sequence for use. This is a specific mode of map computation.

(2) of the filter

The calculation mode of filter is different from that of Map. Filter applies filterFn to judge the value returned by the upper-level Sequence and then passes the value to the lower-level Sequence if the condition is satisfied. Otherwise, a new value is obtained from the upper-level Sequence for calculation. Until no value is found or one is found that satisfies the condition.

There are also multiple filter related Sequence types, and we will only look at one of them here, because the essence of the calculation mode is the same despite the different implementation methods:

function FilteredSequence(parent, filterFn) {
  var seq = new Sequence()
  seq.getIterator = () => {
    var iterator = parent.getIterator()
    var index = 0
    var value
    return {
      current() { return value },
      moveNext() {
        var _val
        while (iterator.moveNext()) {
          _val = iterator.current()
          if (filterFn(_val, index++)) {
            value = _val
            return true
          }
        }
        value = undefined
        return false
      }
    }
  }
  seq.each = fn => {
    var j = 0;
    return parent.each((e, i) => {
      if (filterFn(e, i)) {
        return fn(e, j++);
      }
    })
  }
  return seq
}Copy the code

The calculation mode of filter can be seen in getIterator() and each() of FilteredSequence. As mentioned above, the value of the upper-level Sequence can be judged and whether to return it to the lower-level Sequence or continue to obtain it according to the result.

No further elaboration.

(3) take

The calculation mode of take is to obtain the value from the upper-level Sequence and terminate the calculation when the specified number is reached:

function TakeSequence(parent, count) { var seq = new Sequence() seq.getIterator = () => { var iterator = parent.getIterator() var _count = count return { current() { return iterator.current() }, moveNext() { return (--_count >= 0) && iterator.moveNext() } } } seq.each = (fn) => { var _count = count var i = 0 var result parent.each(e => { if (i < count) { result = fn(e, i++); } if (i >= count) { return false; } return result }) return i === count && result ! == false } return seq }Copy the code

Just look at the TakeSequence type, which is similar to the FilteredSequence in that its calculation mode is extracted in both getIterator() and each(). Once a specified number of values are obtained, the calculation is terminated (via return false).

conclusion

Simply-lazy implements lazy.js’s three lazy calculation methods (map, filter, take), but the design pattern of lazy.js can be seen from them. Different calculation methods reflect different calculation modes, which are realized by different Sequence types.

A specific Sequence type contains a specific calculation pattern, as can be seen from its type name. For example, MappedArrayWrapper, MappedSequence, and IndexedMappedSequence are map calculation modes.

The process of evaluation, or the mode of evaluation, is unified, which is based on the Sequence chain. Each Sequence node in the chain is only responsible for:

  • Upload: Obtain the value from the upper-level Sequence
  • Transfer: Transfer to a subordinate Sequence
  • Evaluation: Applies the calculation mode corresponding to the type to calculate or filter data

Lazy evaluation is achieved by a chain of sequences with different computation modes embedded.

Of course, this is just an implementation of Lazy evaluation in lazy.js, and it’s no surprise that Lazy evaluation is equal to the implementation here, or that Lazy evaluation is all about lazy.js here. Moreover, simply-lazy, which this article focuses on, is only a modal description of lazy evaluation of lazy.js and does not contain all of lazy.js’s features (such as generating infinite lists).

After all, it will give you something of value. I’d be glad if it was just a little.

The Lazy evaluation of lazy.js is just my opinion.

Finally, thanks for reading!