For code to run on a computer, it needs a set of resources: memory, network ports, open files, and so on. A running program together with the resources allocated to it is called a process.

The operating system records these resources in a dedicated data structure called the process control block (PCB).

The most important of these resources is memory, which is requested from the operating system when a process is started.

If memory were unlimited, we could store data and code without ever worrying about running out. Unfortunately memory is limited, so unused memory has to be reclaimed promptly and reused for something else so that the code can keep running properly.

Memory is divided into a code area, a global data area, a heap area, a stack area, and so on. This is the memory layout of a native executable under the operating system. Languages that run on their own runtime, such as JavaScript and Java, add some divisions of their own, but in general memory is divided into these parts.

The code area holds the program’s instructions, and its content basically does not change while the program runs.

The stack area stores the local variables of function calls. Each call gets its own stack frame. The stack has an upper size limit, so if calls are nested too deeply, the stack overflows.
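The stack limit is easy to observe. The following is a minimal sketch, assuming a V8-based runtime such as Chrome or Node.js, where stack exhaustion surfaces as a RangeError (other engines may report it differently). The function name measureMaxDepth is made up for illustration.

```javascript
// Recurse without a base case and count how deep the calls get
// before the engine refuses to push another stack frame.
function measureMaxDepth() {
  let depth = 0;
  function recurse() {
    depth += 1;
    recurse(); // each call adds one more frame to the stack
  }
  try {
    recurse();
  } catch (err) {
    // Assumption: V8 reports stack exhaustion as a RangeError.
    if (!(err instanceof RangeError)) throw err;
  }
  return depth;
}

const maxDepth = measureMaxDepth();
console.log(`stack overflowed after ${maxDepth} nested calls`);
```

The exact depth depends on the engine, its settings, and how large each frame is, so the number printed will vary from run to run and machine to machine.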

The global data area holds global variables.

Large objects used by local or global variables are stored on the heap; the stack or global data area keeps only a reference to them.

The heap area stores dynamically allocated objects and usually occupies the most memory. Memory management is therefore mainly about managing heap memory.
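The split between a reference and the object it points to can be seen from JavaScript itself. This is a small sketch of the conceptual model; a real engine is free to optimize where values actually live, and the name makeBuffer is only illustrative.

```javascript
function makeBuffer() {
  // The array is a dynamically allocated object; `local` only holds
  // a reference to it.
  const local = new Array(1000).fill(0);
  return local; // the stack frame goes away, the object lives on
}

const a = makeBuffer();
const b = a; // copies the reference, not the 1000 elements
b[0] = 42;
console.log(a[0]); // 42, because a and b refer to the same object
```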

Different languages manage heap memory in different ways, each one smarter than the last. Let’s see who’s smartest:

C, C++

In C and C++, memory is managed manually by the programmer. In C++, for example, a class constructor acquires memory and the destructor releases it.

How well this works depends entirely on the skill of the programmer.

Tencent used C++ for server-side development on a large scale, but later gradually moved to Go and Java. With manual memory management, if a programmer forgets to release some memory, the result is a memory leak: memory that is no longer used stays allocated, so the available memory keeps shrinking. Servers run for a long time, so even a small leak accumulates and can eventually crash the process.
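To make the accumulation concrete, here is a toy simulation written in JavaScript rather than C++. ToyHeap is a made-up stand-in for malloc and free, and handleRequest is a hypothetical handler with a deliberately missed release on one code path.

```javascript
// Toy stand-in for malloc/free: every live allocation is tracked by id.
class ToyHeap {
  constructor() {
    this.live = new Map();
    this.nextId = 1;
  }
  alloc(size) {
    const id = this.nextId++;
    this.live.set(id, size);
    return id;
  }
  free(id) {
    this.live.delete(id);
  }
  bytesInUse() {
    let total = 0;
    for (const size of this.live.values()) total += size;
    return total;
  }
}

const heap = new ToyHeap();

// Hypothetical request handler with a bug: the early return skips free().
function handleRequest(request) {
  const buffer = heap.alloc(1024);
  if (request.invalid) {
    return "rejected"; // BUG: buffer is never freed on this path
  }
  heap.free(buffer);
  return "ok";
}

// Simulate a long-running server: 1 request in 100 is invalid.
for (let i = 0; i < 10000; i++) {
  handleRequest({ invalid: i % 100 === 0 });
}
console.log(heap.bytesInUse()); // 102400 bytes leaked by 100 bad requests
```

Each bad request leaks only 1 KB, but nothing ever gives that memory back, so the total grows for as long as the server keeps running.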

It’s hard for programmers to make sure they free up unused memory. It would be nice if the program could recycle the garbage memory itself, freeing the programmer and making the code more reliable. As a result, later high-level languages almost all have automatic garbage collection mechanisms.

Java, JavaScript

Manual memory management in C++ is cumbersome, so Java and JavaScript were designed to keep programmers from manipulating memory directly. Instead, they provide a garbage collector that periodically frees unused memory.

How do you detect which memory is no longer needed? The original idea is to keep a count of the references to each object and reclaim the object when the count drops to zero. This idea is called reference counting.

The problem with this idea is that if two objects refer to each other and nothing else refers to either of them, their counts never reach zero, so this kind of circular reference is never reclaimed.
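Both the idea and its weakness fit in a short sketch. This is a toy model, not how any real engine is implemented; Counted, retain, release, and link are names invented for the example.

```javascript
// Toy reference-counted objects. `freed` records which ones were reclaimed.
const freed = [];

class Counted {
  constructor(name) {
    this.name = name;
    this.count = 0;   // number of references pointing at this object
    this.fields = []; // objects this one refers to
  }
}

function retain(obj) {
  obj.count += 1;
  return obj;
}

function release(obj) {
  obj.count -= 1;
  if (obj.count === 0) {
    freed.push(obj.name);
    // Reclaiming an object also drops the references it held.
    for (const child of obj.fields) release(child);
    obj.fields = [];
  }
}

function link(from, to) {
  from.fields.push(retain(to));
}

// Case 1: a simple chain is reclaimed once the last reference goes away.
const a = retain(new Counted("a")); // one reference from a variable
link(a, new Counted("b"));          // a -> b
release(a);                         // the variable goes away
console.log(freed);                 // [ 'a', 'b' ]

// Case 2: two objects that refer to each other.
const x = retain(new Counted("x"));
const y = retain(new Counted("y"));
link(x, y); // x -> y
link(y, x); // y -> x
release(x); // both variables go away...
release(y);
console.log(x.count, y.count); // 1 1, so neither is ever freed
```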

That’s not smart enough. How do you optimize it?

Start from the root objects, such as global objects, mark every object that can be reached by following references, and then reclaim everything left unmarked. This catches objects that are not referenced at all, as well as objects that reference each other in a cycle but cannot be reached from a root. The idea is called mark-and-sweep.

Mark-and-sweep is the smarter idea, so today’s JS engines basically all use it.
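The two phases can be sketched on a toy heap. Production collectors are far more elaborate; this only shows the core idea, and allocate, mark, sweep, and collectGarbage are names made up for the example.

```javascript
// Toy heap: every allocated object is registered so the sweep can visit it.
const heapObjects = new Set();

function allocate(name) {
  const obj = { name, refs: [], marked: false };
  heapObjects.add(obj);
  return obj;
}

// Mark phase: flag everything reachable from the roots.
function mark(roots) {
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (obj.marked) continue; // already visited; also keeps cycles from looping
    obj.marked = true;
    pending.push(...obj.refs);
  }
}

// Sweep phase: remove unmarked objects and reset marks for the next run.
function sweep() {
  const collected = [];
  for (const obj of heapObjects) {
    if (obj.marked) {
      obj.marked = false;
    } else {
      heapObjects.delete(obj);
      collected.push(obj.name);
    }
  }
  return collected;
}

function collectGarbage(roots) {
  mark(roots);
  return sweep();
}

const globalObj = allocate("global");
const kept = allocate("kept");
const x = allocate("x");
const y = allocate("y");
globalObj.refs.push(kept); // reachable from the root
x.refs.push(y);            // x and y form a cycle...
y.refs.push(x);            // ...but nothing reachable points at them

const collected = collectGarbage([globalObj]);
console.log(collected); // [ 'x', 'y' ]
```

Because reachability is decided from the roots, the cycle between x and y does not keep either object alive.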

There is still a problem with this approach: if an object that is no longer needed remains reachable from a global, it will never be reclaimed. This is also a memory leak.

Only the programmer can catch this, by using tools to find variables that should not be global.

JS memory leaks are usually detected with the Memory tool of Chrome DevTools. Take a memory snapshot at some point in time, perform some operations, and then take another snapshot. Comparing the two snapshots shows which objects were added, and from there you can locate the leaking code.
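A common shape of this leak is a long-lived collection that only ever grows. The sketch below is hypothetical; responseCache and handle are invented names.

```javascript
// Hypothetical module-level cache: it stays reachable for the whole
// lifetime of the program, so the collector can never reclaim its entries.
const responseCache = [];

function handle(requestId) {
  const result = { requestId, payload: "x".repeat(1024) };
  responseCache.push(result); // kept "just in case", never read or removed
  return result.payload.length;
}

for (let i = 0; i < 1000; i++) handle(i);
console.log(responseCache.length); // 1000 entries still reachable

// Dropping the references makes the entries collectable again.
responseCache.length = 0;
console.log(responseCache.length); // 0
```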

For example, this code:
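A minimal sketch of such code, assuming it runs in a browser page or in Node.js. The function name leakGlobal and the pattern /leak/ are illustrative choices.

```javascript
// Assigning to globalThis makes the value reachable from a root object,
// so the garbage collector can never reclaim it.
function leakGlobal() {
  globalThis.aaa = /leak/; // illustrative pattern
}

// Wait 5 seconds so that a first snapshot can be taken before the leak.
const timer = setTimeout(leakGlobal, 5000);
```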

After 5 seconds, the code creates a global variable aaa that holds a regular expression.

We took two separate snapshots using the Memory tool of Chrome DevTools.

The tool offers several views; we chose the comparison view to compare the two snapshots:

In the delta column you can see +1 for regular expression objects, which is the global variable created in our timer.

Comparing memory snapshots in this way lets you find which operations leak memory and, from there, the code responsible.

Automatic garbage collection avoids leaks caused by programmers forgetting to release memory, but leaks caused by keeping useless objects in globals remain. It is clever, but it still has problems.

Rust

Rust also does not require the programmer to manage memory manually, yet it has no garbage collector. It manages memory better and avoids the vast majority of memory leaks. How does it do that?

Rust’s view is that objects on the heap are hard to manage because they are referenced from too many places. If each object is restricted to a single owner, such as a variable in a particular function, and anyone else who needs it must work with a copy, then every heap object a function owns can be reclaimed when the function call ends, leaving no garbage behind. This idea is called the ownership mechanism.
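The single-owner rule can be imitated at runtime to show its effect. This is only a loose sketch in JavaScript: Rust enforces these rules at compile time, whereas here a stale handle throws when it is used. Owned, scope, and dropped are names invented for the example.

```javascript
// Runtime imitation of single ownership. A handle that has been moved
// from or dropped throws when it is used.
const dropped = [];

class Owned {
  #value;
  #alive = true;
  constructor(value) {
    this.#value = value;
  }
  get() {
    if (!this.#alive) throw new Error("use after move or drop");
    return this.#value;
  }
  // Transfer ownership to a new handle and invalidate this one.
  move() {
    const next = new Owned(this.get());
    this.#alive = false;
    this.#value = undefined;
    return next;
  }
  // An explicit copy gives the caller an independent value to own.
  // (Shallow copy; enough for the flat object used here.)
  clone() {
    return new Owned({ ...this.get() });
  }
  drop() {
    if (this.#alive) {
      dropped.push(this.#value.name);
      this.#alive = false;
      this.#value = undefined;
    }
  }
}

// Run `body` with the given owned values and drop them when it returns,
// the way a function's owned values are released at the end of the call.
function scope(owned, body) {
  try {
    return body(...owned);
  } finally {
    for (const handle of owned) handle.drop();
  }
}

const config = new Owned({ name: "config" });
const copy = config.clone();                 // independent copy
scope([config.move()], (c) => c.get().name); // ownership moves into the scope

console.log(dropped); // [ 'config' ], released when the scope ended
let error = null;
try {
  config.get(); // the original handle no longer owns anything
} catch (err) {
  error = err.message;
}
console.log(error);           // use after move or drop
console.log(copy.get().name); // config
```

Because exactly one handle owns the value at any moment, the end of the owning scope is a safe point to release it, with no need to search for other references.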

The ownership mechanism manages memory without a garbage collector by restricting how objects can be referenced. It also avoids the JS problem of leaking memory by accidentally leaving objects in globals.

Rust’s ownership mechanism is a smarter way to manage memory, and it is one reason Rust is becoming increasingly popular.

Conclusion

The available memory of a process is limited, so it is necessary to release the memory of variables that are no longer needed. Different languages have different ways of managing memory, with varying degrees of intelligence:

C and C++ rely on programmers to manage memory themselves; if a block of memory is accidentally left unreleased, it leaks.

Java and JavaScript, on the other hand, do not leave memory management to the programmer. A garbage collector finds unused memory and frees it. The earliest idea was reference counting; today’s engines use mark-and-sweep.

However, an object that stays reachable from a global cannot be reclaimed, which is a memory leak. To find it, record two snapshots with the Memory tool of Chrome DevTools and diff them to locate the code that caused the leak.

Rust also doesn’t require the programmer to manage memory manually, but it doesn’t have a garbage collector either. It limits each object to a single owner, so objects can be reclaimed at the end of a function call, leaving no garbage behind, and it avoids memory leaks like leaving useless objects in globals (because each object has only one owner).

Languages evolve so that programmers have less to do and programs become more robust. This requires smarter language design and more powerful compilers and interpreters.