Why write this article

Every once in a while a similar topic comes up in nuggets, such as interviews, handwriting functions, and this topic is no exception — array de-duplication. I have seen the forum appeared again and again of the array to redo the article, but personally feel that may not be too perfect, so, I expect to try to JS array to redo a “systematic” summary.

data

  • Four JavaScript equality rules
  • The default deduplication function of LodashJS
  • Lodashjs removes the value comparison rules adopted by functions
  • The Map of ES6

The premise

The test case

This use case contains all the simple basic types and objects, and can cover almost all the common data possibilities

let a = {};
let b = { a: 1 };
let testArr = [1.1."1"."1"."a"."a", {}, {}, { a: 1 }, a, a, b, [], [], [1].undefined.undefined.null.null.NaN.NaN];
Copy the code

To heavy standard

The following code is based on the result of the default deduplication function in LoDash, and the result of the uniq function in LoDash

/ / [email protected]
_.uniq(testArr);
// [1, "1", "a", {}, {}, { a: 1 }, {}, { a: 1 }, [], [], [1], undefined, null, NaN];
Copy the code

Go heavy 4 big types

For JS array deduplication, in fact, all changes are part of it. I simply summarize the following four types of deduplication, and around various types, due to the rich functions and tool capabilities of JS, there can be a variety of implementation methods, whether it is using forEach,reduce,filter, etc., to cycle and assemble new arrays, Loop through and assemble new arrays using the basic for loop and push loop, which can be summarized into the following four types for now

Array element comparison type

This method takes the value of the array and compares it with other values and modifies the array

Double-layer for loop

Take an element and compare it with all subsequent elements. If an element is found equal after it, delete it

function uniq(arr) {
    for (let i = 0; i < arr.length; i++) {
        for (let j = i + 1; j < arr.length; j++) {
            if (arr[i] === arr[j]) {
                arr.splice(j, 1); j--; }}}return arr;
}

// Run the result
// [1, "1", "a", {}, {}, { a: 1 }, {}, { a: 1 }, [], [], [1], undefined, null, NaN, NaN]
// NaN is not removed compared to the lodash result
Copy the code

Since NaN === NaN equals false, repeated nans are not removed

Sort and compare adjacent

If the array does not have complex objects (complex objects are difficult to sort), we can try this method

function uniq(arr) {
    arr.sort();
    for (let i = 0; i < arr.length - 1; i++) {
        arr[i] === arr[i + 1] && arr.splice(i + 1.1) && i--;
    }
    return arr;
}
// Run the result
//[[], [], 1, "1", [1], NaN, NaN, {}, {}, { a: 1 }, {}, { a: 1 }, "a", null, undefined];
// NaN is not removed compared to the lodash result, and the object is de-duplicated
Copy the code

Also because NaN === NaN equals false, duplicate nans are not removed, and because sort does not order objects well, there are some de-repetition failures in the object section

Find the array element position type

This type searches for the first occurrence position of each element in the array. If the first occurrence position is equal to the index of the element at that time, the element is collected

indexOf

IndexOf is used to find the first occurrence position of an element in an array, or if the position is equal to the current element’s position

function uniq(arr) {
    let res = [];
    for (let i = 0; i < arr.length; i++) {
        if(arr.indexOf(arr[i]) === i) { res.push(arr[i]); }}return res;
}
// Run the result
// [1, "1", "a", {}, {}, { a: 1 }, {}, { a: 1 }, [], [], [1], undefined, null];
// NaN is missing compared to the lodash result
Copy the code

IndexOf uses the same value equality rule as ===, so no element in the array is equal to NaN, including itself, so no NaN is left behind

findIndex

Use the findIndex method to find the first occurrence of an element in an array

function uniq(arr) {
    let res = [];
    for (let i = 0; i < arr.length; i++) {
        if (arr.findIndex(item= >item === arr[i]) === i) { res.push(arr[i]); }}return res;
}
// Run the result
// [1, "1", "a", {}, {}, { a: 1 }, {}, { a: 1 }, [], [], [1], undefined, null];
// NaN is missing compared to the lodash result
Copy the code

The resulting principle is the same as indexOf, since the === rule is used to determine whether elements are equal, but this method is equivalent to a double-layer for loop

Finds if the element has type

This method basically relies on the includes method to determine whether the corresponding element exists in the new array, if not, the collection

function uniq(arr) {
    let res = [];
    for (let i = 0; i < arr.length; i++) {
        if (!res.includes(arr[i])) {
            res.push(arr[i]);
        }
    }
    return res;
}
// Run the result
// [1, "1", "a", {}, {}, { a: 1 }, {}, { a: 1 }, [], [], [1], undefined, null, NaN];
// Same result as lodash
Copy the code

The includes method uses the SameValueZero judgment rule, so NaN can be determined and reweighted

Depending on the data type characteristic type

This scheme relies on the non – duplication property of data types to achieve the effect of deduplication

Map

Data of Map type can be similar to that of Object. When setting key-value pairs of elements, the uniqueness of keys can be guaranteed, and the type of keys can be extended to all basic elements, including objects. After setting a Map data type of unique keys, Keys () = map.prototype.keys () = map.prototype.keys (

function uniq(arr) {
    let map = new Map(a);for (let i = 0; i < arr.length; i++) { ! map.has(arr[i]) && map.set(arr[i],true);
    }
    return [...map.keys()];
}
// Run the result
// [1, "1", "a", {}, {}, { a: 1 }, {}, { a: 1 }, [], [], [1], undefined, null, NaN];
// Same result as lodash
Copy the code

Set

Similar to Map, it is the most popular method to use data type properties to accomplish deduplication

function uniq(arr) {
    return [...new Set(arr)];
}
// Run the result
// [1, "1", "a", {}, {}, { a: 1 }, {}, { a: 1 }, [], [], [1], undefined, null, NaN];
// Same result as lodash
Copy the code

The performance test

The test case

Here we simply (and loosely) splice the test example we just used 1 million times

let a = {};
let b = { a: 1 };
let testArr = [1.1."1"."1"."a"."a", {}, {}, { a: 1 }, a, a, b, [], [], [1].undefined.undefined.null.null.NaN.NaN];
let arr = [];
for (let i = 0; i < 1000000; i++) { arr.push(... testArr); } uniq(arr);Copy the code

Starting with the results, indexOf’s solution was the fastest in a simple test case with about 20 million data points

I also wrote two small tools before, I hope to help you to learn and progress together.

  • A regular 15 minute crawl of nuggets trending articles – Nuggets trending
  • Nuggets of advanced search tools – nuggets of search
  • Nuggets search introduction