preface

I always hear people say what to read Vue, React… XXX source code, but it’s really hard to read the source code of these frameworks, and there’s a lot of algorithms, design patterns, and data structures involved in these good frameworks. At first read so difficult source code, you really read into it? In the absence of a certain basis, blindly reading the source code of various frameworks will only wear people’s willpower. When I was a child to read the teacher said, learning knowledge to step by step, the examination to do when the difficult first. Well, reading source code is the same thing. We should also start with the easier things before the harder ones. I have always heard of Lodash, but to be honest, I have never looked at the source code of Lodash. Recently I did, and the source code of Lodash is amazing! The extreme details, the various boundary cases, the performance considerations are all worth learning. So I decided to take a close look at lodash’s source code and write a few essays documenting the learning process.

Without further ado, the first lodash method I read was the difference method

Difference function analysis

Creates an array with unique array values, each of which is not contained in any other given array. (Note: create a new array whose values exclude the values in the given array for the first number (array parameter).) This method uses SameValueZero for equality comparison. The order of the resulting values is determined by the order in the first array.

The source code

function difference(array, ... values) {
  return isArrayLikeObject(array)
    ? baseDifference(array, baseFlatten(values, 1, isArrayLikeObject, true)) : []}Copy the code

The lodash code is highly self-annotated, and you can probably read the author’s intent based on the name of the function. Difference itself means:

Is array a class array? If it is a class array, then… If it is not a baseDifference, return an empty array.

The lodash source code is very simple. The methods exposed by Lodash are not complicated. They are all made up of basic functions like isArrayLike, baseDifference, baseFlatten, etc. On the one hand, lodash also has a high level of reusability and abstraction inside. And then we’re going to take it from there.

The whole

isArrayLikeObject

It literally means “Is it an array-like object?”

Array-like: An array-like is not really an array, but an array-like object. An array-like object should conform to two things:

  1. Use a number as the property name
  2. The length attribute is required
function isArrayLikeObject(value) {
  return isObjectLike(value) && isArrayLike(value)
}
Copy the code

This function is also simple, if a value is an object and an array-like object then the value is an array-like object. Let’s take a quick look at the source code for isObjectLike and isArrayLike

function isObjectLike(value) {
  return typeof value === 'object'&& value ! = =null
}

function isArrayLike(value) {
  returnvalue ! =null && typeofvalue ! = ='function' && isLength(value.length)
}
Copy the code

IsObjectLike involves an old JavaScript BUG where typeof null values are object(why is typeof(null) value “object” in JavaScript?)

IsArrayLike excludes the case where value is a function. Notice the isLength method.

const MAX_SAFE_INTEGER = 9007199254740991
function isLength(value) {
  return typeof value === 'number' &&
    value > -1 && value % 1= =0 && value <= MAX_SAFE_INTEGER
}
Copy the code

The isLength limits the value to be an integer and less than the maximum safe integer. At this point, the magic number 9007199254740991 succeeded in attracting my attention. Where did this number come from? 9007199254740991 is exactly 2 times 53-1. Why is it equal to this number? Simply because js is the use of IEEE754 standard unified representation of floating point numbers and integers. Js does not fully use all bits of a 64-bit floating point number to represent integers.

MAX_VALUE and MAX_SAFE_INTEGER in JS are the same as those in JS

OK, to this isArrayLikeObject source code has been analyzed almost, just read this little bit of code has to let people sigh loDash is really full details pull.

baseFlatten

Literally, this is a function to expand a multidimensional object/array.

function baseFlatten(array, depth, predicate, isStrict, result) {
  predicate || (predicate = isFlattenable)
  result || (result = [])

  if (array == null) {
    return result
  }

  for (const value of array) {
    if (depth > 0 && predicate(value)) {
      if (depth > 1) {
      	// There is a risk of stack overflow
        // Recursively flatten arrays (susceptible to call stack limits).
        baseFlatten(value, depth - 1, predicate, isStrict, result)
      } else{ result.push(... value) } }else if(! isStrict) { result[result.length] = value } }return result
}
Copy the code

What do you mean by arguments to a function

  • Array: indicates the array to be expanded
  • Depth: How many levels do I need to expand?
  • Predicate: A function that “checks” for each item in an array
  • IsStrict: Specifies whether each item must pass the predicate function check
  • Result: The array into which to put the expanded result

The above code simply iterates over the array, recursing if the level of expansion is greater than one, otherwise using… The array expander expands. Also note that in strict mode, values that do not pass the predicate function are not put into the result array.

baseDifference

Finally, let’s look at the baseDifference function

import SetCache from './SetCache.js'
import arrayIncludes from './arrayIncludes.js'
import arrayIncludesWith from './arrayIncludesWith.js'
import map from '.. /map.js'
import cacheHas from './cacheHas.js'
const LARGE_ARRAY_SIZE = 200

function baseDifference(array, values, iteratee, comparator) {

  // includes changes depending on the usage
  let includes = arrayIncludes
  let isCommon = true
  const result = []
  const valuesLength = values.length

  if(! array.length) {return result
  }
  if (iteratee) {
    values = map(values, (value) = > iteratee(value))
  }
  // arrayIncludesWith if there is a Comparator
  if (comparator) {
    includes = arrayIncludesWith
    isCommon = false
  }
  else if (values.length >= LARGE_ARRAY_SIZE) {
    includes = cacheHas
    isCommon = false
    values = new SetCache(values)
  }
  The above part is about adjusting some of the functions used and changing some of the identifiers for different situations
  
  outer:
  for (let value of array) {
    const computed = iteratee == null? value : iteratee(value) value = (comparator || value ! = =0)? value :0
    if (isCommon && computed === computed) {
      let valuesIndex = valuesLength
      while (valuesIndex--) {
        if (values[valuesIndex] === computed) {
          continue outer
        }
      }
      result.push(value)
    }
    else if(! includes(values, computed, comparator)) { result.push(value) } }return result
}
Copy the code

Let’s start with the argument list of the function:

  • Array: indicates the array to be checked
  • Values: The array to be compared with array
  • Iteratee: The iteratee function is called for each element as the array is traversed
  • Comparator: A function that determines whether two values are equal

Let’s read through the logic and then dig into the details

The above part of the for loop is to adjust some of the functions used and change some of the identifiers for different situations. The part of the for loop iterates through the array array, then uses a loop to compare the values in the values array and, if different, to put them in the Result array.

The general logic is fairly simple, just a double loop. The important thing to note is the comparator and LARGE_SIZE sections. When we pass in the Comparator or values array with a size greater than 200, isCommon === false, this is a little different.

comparator

If we provide a Comparator, the includes function will change to arrayIncludesWith

ArrayIncludes source code

function arrayIncludes(array, value) {
  const length = array == null ? 0 : array.length
  return!!!!! length && baseIndexOf(array, value,0) > -1
}

function baseIndexOf(array, value, fromIndex) {
  // NaN ! == NaN
  return value === value
    ? strictIndexOf(array, value, fromIndex)
    : baseFindIndex(array, baseIsNaN, fromIndex)
}

function strictIndexOf(array, value, fromIndex) {
  let index = fromIndex - 1
  const { length } = array

  while (++index < length) {
    if (array[index] === value) {
      return index
    }
  }
  return -1
}

function baseFindIndex(array, predicate, fromIndex, fromRight) {
  const { length } = array
  let index = fromIndex + (fromRight ? 1 : -1)

  while ((fromRight ? index-- : ++index < length)) {
    if (predicate(array[index], index, array)) {
      return index
    }
  }
  return -1
}


function baseIsNaN(value) {
  returnvalue ! == value }Copy the code
ArrayIncludesWith source code
function arrayIncludesWith(array, target, comparator) {
  if (array == null) {
    return false
  }

  for (const value of array) {
    if (comparator(target, value)) {
      return true}}return false
}
Copy the code

ArrayIncludesWith and arrayIncludes have the same function to determine whether a value is included in an array, but arrayIncludesWith requires us to provide a comparator function to compare the two values for equality. The arrayIncludes function provided in Lodash is also used to fill out details, taking full account of boundary conditions such as empty arrays and NaN values.

LARGE_SIZE

When the size of our values array exceeds LARGE_SIZE, an includes will become a cacheHas and the values array will become a SetHash object. Now let’s look at how cacheHas and SetHash objects work.

cacheHas

function cacheHas(cache, key) {
  return cache.has(key)
}
Copy the code

The source code is very simple, the cache object has a has method that returns whether or not the key value exists in the object.

SetCache

The SetCache source code is relatively more, here is another article to introduce the SetCache object. You can think of it here as a Set object provided in ES6.

Having said all that, why use a Set when the values are large? So let’s go back to how do we know that there are different values in two arrays? We use a double loop, time complexity is O(n^2). For objects like Set, the time complexity of the query data is O(1). When we add the outermost loop, our time complexity becomes O(n).

If our array is small, the overhead of creating the SetCache object may be greater than the performance degradation caused by the double loop, but if our array is large, the performance problem caused by the double loop is much greater than the overhead of creating the SetCache.

conclusion

At this point, the whole difference method source reading also basically come to an end. Let’s summarize some of the highlights and details we learned:

  1. typeof null === 'object'This is a BUG in the JS design. (233333333).
  2. NaN ! == NaNIt’s important to note that this is also determining whether a value isNaNMethods.
  3. In JS, ieEE-754 standard is used to express integer number and floating point number uniformly, so the maximum safe number in JS is 2^ 53-1.
  4. In the case of large arrays, using hash table for array query is faster than double loop because the time complexity of hash table query is O(1).

If you find the article helpful, you can click “like” before you go ~