A while back Lucifer was asked a question: "What is ImmutableJS? What is it for?" My answer at the time was: ImmutableJS is trie + structural sharing, which solves the data-mutability problem and optimizes performance along the way. Today I will explain that sentence in detail.

Background

Let's start with an example. Here are a few plain assignment statements:

a = 1;
b = 2;
c = 3;
d = {
  name: "lucifer",
  age: 17,
  location: "West lake",
};
e = ["Front end of imagination", "Force buckle plus plus"];

The memory structure of the above code looks something like this:


Note: the variable names (a, b, c, d, e) are just aliases for memory addresses.

Since the "values" of d and e are reference types, their data length is uncertain, so the data region actually points to a region on the heap. a, b, and c can simply be stored on the stack because their lengths are known in advance.

Lucifer tip: the data length of d and e is uncertain, but the length of a pointer is fixed. Therefore the pointer can be stored on the stack, and the pointer points to memory on the heap.

In practice, we often do various assignments, such as:

const ca = a;
const cb = b;
const cc = c;
const cd = d;
const ce = e;

After the above operation, the memory structure diagram at this time:


For ca, cb, cc, cd, and ce, the memory addresses are new, but the values are unchanged. The reason is that variable names are just aliases for memory, and assignment passes the value (for objects, that value is the pointer).
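The "assignment passes the value" distinction can be seen directly; here is a minimal sketch (variable names are illustrative):

```javascript
// Primitives are copied by value; objects are copied by pointer.
let b = 2;
let cb = b;      // the value 2 itself is copied into cb
b = 99;
console.log(cb); // 2 — cb kept its own copy

const d = { name: "lucifer" };
const cd = d;            // only the pointer is copied
console.log(cd === d);   // true — both names alias the same heap object
```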

Since JS object operations are mutable, bugs like this are possible:

cd.name = "azl397985856";
console.log(cd.name); // azl397985856
console.log(d.name); // azl397985856

The `cd.name = ...` above "modifies" the name of cd in place, which affects every reference to the same object.

For example, if an object is referenced by three pointers, all three are affected when the object is modified.


You can think of pointers as threads and the object as a resource shared by those threads. Multiple pointers are like multiple threads, and problems can arise when several of them read and write the same object at the same time.

To avoid this, many people prefer to copy (shallow or deep). That way the objects behind the different pointers are distinct, which you can think of as multiple processes.

Next we do a copy operation.

const sa = a;
const sb = b;
const sc = c;
const sd = { ...d };
const se = [...e];
// Some people still don't trust the spread
const sxbk = JSON.parse(JSON.stringify(e));

Bystander: "Why is there so much copying in your code?" Client: "I don't know why I copy, but it makes me feel at ease."

All the reference-type values have changed, and the memory map now looks like this:


The above “bug” was resolved successfully.

Lucifer tip: a shallow copy only copies one layer, so inner objects are still shared. If you then modify an inner object through a chain like a.b.c, the same "bug" appears.
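That tip is easy to demonstrate; this sketch (with an assumed `inner` property) shows the shallow-copy pitfall:

```javascript
// A shallow copy only copies the first layer; nested objects stay shared.
const d = { name: "lucifer", inner: { score: 1 } };
const sd = { ...d };        // sd.inner still points at d.inner

sd.inner.score = 100;       // mutate through the copy...
console.log(d.inner.score); // 100 — the original changed too
console.log(sd.inner === d.inner); // true — the inner object is shared
```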

Complete memory map:


(If you can’t see clearly, try zooming in)

The problem

Shallow copy is cheap because you only copy one layer, but performance still degrades significantly as the number of keys grows.

According to rough measurements:

  • Shallow-copying an object with 10,000 properties takes about 10 ms.
  • Deep-copying a three-level object with 10,000 properties takes about 50 ms.
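You can reproduce this kind of measurement yourself; here is a rough benchmark sketch (the actual numbers will vary by machine and engine):

```javascript
// Build an object with 10,000 nested properties and time both copy styles.
const big = {};
for (let i = 0; i < 10000; i++) big["k" + i] = { v: i };

const t0 = Date.now();
const shallow = { ...big };                   // one layer of 10k pointers copied
const t1 = Date.now();
const deep = JSON.parse(JSON.stringify(big)); // naive full deep copy
const t2 = Date.now();

console.log("shallow:", t1 - t0, "ms");
console.log("deep:", t2 - t1, "ms");
```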

ImmutableJS can help reduce this time (and memory) overhead, which we'll cover later.

These numbers are for reference only; you can measure with your own project.

Since normal projects rarely reach this order of magnitude, the basic conclusion is: if your project's objects are not very large, there is no need to reach for an optimization like ImmutableJS; just copy manually to keep data immutable.

What if my project really is that big? Then consider an immutable library. ImmutableJS is one of many immutable libraries. Let's look at how ImmutableJS solves this performance challenge.

What is ImmutableJS

When you manipulate data through the API that ImmutableJS provides, each operation returns a new reference. The effect is similar to a deep copy, but it performs much better.
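That contract, "every write returns a new reference and never mutates the original", can be sketched in plain JS without the library (the real API looks like `Map({ name: "lucifer" }).set("name", "x")` and scales far better than this naive helper):

```javascript
// Plain-JS sketch of the ImmutableJS contract: a "write" never mutates;
// it returns a brand-new reference instead.
function set(obj, key, value) {
  return { ...obj, [key]: value }; // new object every time
}

const d = { name: "lucifer", age: 17 };
const d2 = set(d, "name", "azl397985856");

console.log(d.name);   // "lucifer" — the original is untouched
console.log(d2.name);  // "azl397985856"
console.log(d === d2); // false — a new reference signals "something changed"
```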

ImmutableJS is trie + structural sharing, which solves the data-mutability problem while keeping performance. The tree here is similar to a trie; if you are not familiar with tries, take a look at my earlier article on prefix trees [1].

ImmutableJS implements "structural sharing" through trees. Here's an example:

const words = ["lucif", "luck"];

I build a prefix tree from words; the nodes don't store data, the data lives on the paths. The root node represents the reference address of the object.


So we put two words lucif and luck on the tree:


Now I want to change lucif to lucie. The naive way is to make a full copy and then modify it:

const newWords = [...words];
newWords[0] = "lucie"; // "lucif" is at index 0

(Note that the whole tree is new, and the memory address of the root node has changed.)

Structural sharing, by contrast, looks like this:

(Note that the entire tree is old except for a new node. You can see that the memory address of the root node is unchanged.)

As you can see, we "just added one node and changed one pointer; nothing else changed. This is called structural sharing."

There’s still a problem

A closer look reveals a problem: with the approach above, the words and newWords references are equal (both 1fe2ab), i.e. words === newWords, so consumers cannot tell that anything changed.

So we also need to walk back from the changed node to the root and copy every node along the way (the green ones). In this example that is only two extra nodes. But as the number of nodes in the tree grows, so does the shared common prefix, and the performance win becomes apparent.
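The whole "copy only the path, share the rest" idea can be sketched in a few lines of plain JS. This is a minimal path-copying helper, not ImmutableJS's actual implementation, and the `setIn` name and state shape are illustrative:

```javascript
// Path copying: only the nodes along the modified path are recreated;
// every untouched subtree is shared with the old tree.
function setIn(obj, path, value) {
  if (path.length === 0) return value;
  const [key, ...rest] = path;
  const copy = Array.isArray(obj) ? obj.slice() : { ...obj };
  copy[key] = setIn(obj[key], rest, value); // recreate this level only
  return copy;
}

const state = { user: { name: "lucifer" }, posts: [1, 2, 3] };
const next = setIn(state, ["user", "name"], "lucie");

console.log(next.user.name);             // "lucie"
console.log(state.user.name);            // "lucifer" — old tree intact
console.log(next.posts === state.posts); // true — untouched subtree is shared
console.log(next === state);             // false — the root was recreated
```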


The whole process is similar to the GIF below:


Trade-offs

I mentioned earlier that we trace the path back to the root and copy every node along the way. For a fixed total number of nodes, a very tall tree means each node has few children, and the reuse rate drops. Imagine the extreme case where every node has exactly one child: the tree degenerates into a linked list, and each modification costs O(P), where P is the number of ancestor nodes. If you modify a leaf node, P equals N, the total number of nodes in the tree.

If the tree is very short, each node has more children, so each step of the backtrack has to copy more pointers. With four children per node, two more pointers need to be copied than in the two-child tree above.


Imagine the other extreme, a tree with only one layer. To change lucif to lucie we can only create a brand-new lucie node from scratch; no existing node can be reused, and there is no advantage over a deep copy.


Therefore, choosing the branching factor of the tree is a genuine trade-off, and the answer is definitely not a simple binary tree. This choice usually takes a lot of experimentation to arrive at a reasonable value.
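A very rough cost model makes the trade-off concrete. One update copies the path (about log_B(N) nodes for N leaves and branching factor B), and each copied node carries B child pointers. The model below deliberately ignores constant factors like cache locality, which in practice push real libraries toward wider nodes (ImmutableJS, for instance, uses 32-way tries):

```javascript
// Rough model: pointers touched per update = B * ceil(log_B(N)).
// Tall trees (small B) mean long paths; wide trees (large B) mean
// expensive per-node copies.
function costPerUpdate(n, b) {
  const depth = Math.ceil(Math.log(n) / Math.log(b)); // path length
  return b * depth;
}

for (const b of [2, 4, 16, 32, 64]) {
  console.log("branching", b, "=> cost ~", costPerUpdate(1e6, b));
}
```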

React

One of the biggest differences between React and Vue is that React favors immutable data, while Vue does the opposite. If you happen to use both frameworks, you know what I mean.

One advantage of immutability is that "future operations do not affect previously created objects." So you can easily persist your application's state for debugging, analysis, or time travel (thanks to the predictable one-way data flow).

Combined with a state-management framework such as Redux, ImmutableJS can play an even bigger role. At that point your entire state tree is an ImmutableJS object rather than a plain JavaScript object, and all operations go through the ImmutableJS API. Thanks to this, we can detect changes with a simple strict-equality check (===), which also makes writing an SCU much easier.

SCU is short for shouldComponentUpdate.
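With immutable data, "did my props change?" collapses into a reference check. Here is a plain-function sketch of the SCU idea (the `shouldUpdate` name and `user` prop are illustrative, not React's actual API surface):

```javascript
// Immutable updates always produce new references, so !== means "changed".
function shouldUpdate(prevProps, nextProps) {
  return prevProps.user !== nextProps.user; // new reference ⇒ re-render
}

const user = { name: "lucifer" };
console.log(shouldUpdate({ user }, { user }));                         // false — same reference
console.log(shouldUpdate({ user }, { user: { ...user, name: "x" } })); // true — new reference
```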


In my years of experience, a library like ImmutableJS brings uneven performance improvements. Adding a library also raises debugging costs, and there is a real learning curve. My suggestion: learn the technique first; if the project genuinely needs it and the team members can cover the technology, then adopt it. Don't optimize prematurely.

Conclusion

Because data is mutable, when multiple pointers point to the same reference and one of them modifies the data, "weird" effects follow, and this becomes more common as a project grows. Moreover, because future operations may modify previously created objects, intermediate states cannot be recovered, which makes debugging hard. Immutable data means "future operations will not affect previously created objects", which reduces these inscrutable phenomena and makes debugging easier, since any intermediate state can be inspected.

Keeping data immutable by hand handles most cases; only in extreme cases does it run into performance problems. ImmutableJS is trie + structural sharing, which solves the data-mutability problem and optimizes performance along the way. It not only fixes the performance cost of manual copying, but also lets you compare an object across time to see whether it changed, so writing an optimized React SCU becomes easy.

Finally, I recommend two other immutable libraries: seamless-immutable [2] and Immer [3].

Follow me

You can also follow my public account "Imagination Front End" for more fresh, hardcore front-end articles that introduce you to the front end you don't know.


Lucifer – Zhihu [4]

Pay attention, don’t get lost!

Reference

[1] Prefix tree article: https://github.com/azl397985856/leetcode/blob/master/thinkings/trie.md
[2] seamless-immutable: https://github.com/rtfeldman/seamless-immutable
[3] Immer: https://github.com/immerjs/immer
[4] Lucifer – Zhihu: https://www.zhihu.com/people/lu-xiao-13-70