A few weeks ago, we started a series that aims to take a closer look at JavaScript and how it actually works: We think that by understanding the building blocks of JavaScript and how they work together, you’ll be able to write better code and applications.

The first article in this series focused on an overview of the engine, runtime, and call stack. The second article takes a closer look at the insides of Google’s V8 JavaScript engine and also offers some advice on how to write better JavaScript code.

In this third article, we’ll discuss another important topic, memory management, which developers increasingly neglect as the programming languages we use every day grow more mature and complex. We’ll also provide some tips on how to handle memory leaks in JavaScript that we follow at SessionStack, because we need to make sure that SessionStack causes no memory leaks and doesn’t increase the memory consumption of the web application it is integrated into.

Overview

Languages like C have low-level memory management primitives such as malloc() and free(). Developers use these primitives to explicitly allocate memory from the operating system and to free it back to the operating system.

In contrast, JavaScript allocates memory when things (objects, strings, and so on) are created and “automatically” frees it when they are no longer used, a process known as garbage collection. This seemingly “automatic” release of resources is a source of confusion and gives JavaScript (and other high-level language) developers the false impression that they can choose not to care about memory management. This is a big mistake.

Even when using high-level languages, developers should understand memory management (or at least understand the basics). Sometimes there are problems with automatic memory management (such as bugs or implementation limitations in the garbage collector) that developers must understand in order to deal with correctly (or find an appropriate alternative to achieve minimal compromises and code changes).

Memory life cycle

No matter which programming language you use, the memory life cycle is almost always the same:

Here’s an overview of what happens at each step of the cycle:

  • Allocate memory – Memory is allocated by the operating system, which allows your program to use it. In low-level languages such as C, this is an explicit operation that you, as a developer, have to handle. In high-level languages, however, this is taken care of for you.

  • Use memory – This is when your program actually makes use of the previously allocated memory. Read and write operations take place as you use the allocated variables in your code.

  • Release memory – Now is the time to release the memory you no longer need so that it becomes free and available again. As with the allocate step, this operation is explicit in low-level languages.
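
As a rough illustration of these three steps in JavaScript, the sketch below allocates an array, reads and writes it, and then drops the only reference so that the garbage collector is allowed to reclaim it. The variable names are made up for this example.

```javascript
// 1. Allocate: creating the array makes the engine reserve memory for it.
let samples = new Array(3).fill(0);

// 2. Use: reading and writing the allocated values.
samples[0] = 42;
const firstSample = samples[0];

// 3. Release: JavaScript has no free(); dropping the last reference
//    only makes the array eligible for garbage collection.
samples = null;
```

Note that the last step does not free anything by itself. It only removes the reference, and the collector decides when the memory is actually reclaimed.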

For a quick understanding of the concepts of call stacks and memory heaps, read our first article on this topic.

What is memory?

Before jumping straight to memory in JavaScript, we’ll briefly discuss what memory in general is and how it works.

At the hardware level, computer memory consists of a large number of flip-flops. Each flip-flop contains a few transistors and can store one bit. Individual flip-flops are addressable by a unique identifier, so they can be read and overwritten. Thus, conceptually, we can think of the entire computer memory as one huge array of bits that can be read and written.

Since we humans are not very good at doing all of our thinking and arithmetic in bits, we organize them into larger groups that together can represent numbers. Eight bits are called a byte. Beyond bytes, there are words (sometimes 16 bits, sometimes 32 bits).

A lot of things are stored in memory:

  1. All variables and other data used by all programs.

  2. Program code, including operating system code.

The compiler works with the operating system to handle most of the memory management for you, but we recommend that you take a look at what’s really going on underneath.

When you compile your code, the compiler can examine primitive data types and calculate ahead of time how much memory they will need. The required amount is then allocated to the program in the call stack space. The space in which these variables are allocated is called the stack space because, as functions get called, their memory is added on top of the existing memory. As they return, it is removed in last-in, first-out (LIFO) order. For example, consider the following declarations:

int n; // 4 bytes
int x[4]; // array of 4 elements, each 4 bytes
double m; // 8 bytes

The compiler can immediately see that the code needs: 4 + 4×4 + 8 = 28 bytes.

That’s how it works with the current sizes of integers and double-precision floating-point numbers. About 20 years ago, integers were typically 2 bytes and doubles 4 bytes. Your code should never depend on the size of the basic data types at a given moment.

The compiler inserts code that interacts with the operating system to request the number of stack bytes needed to store the variable.

In the example above, the compiler knows the exact memory address of each variable. In fact, whenever we write to the variable n, this gets translated internally into something like “memory address 4127963”.

Note that if we try to access x[4] here, we will access the data associated with m. That is because we are accessing an element of the array that doesn’t exist: it lies 4 bytes beyond the last actually allocated element, x[3], and may end up reading (or overwriting) some of m’s bits. This will almost certainly have very undesirable consequences for the rest of the program.
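
JavaScript does not expose raw addresses, but its typed arrays use the same fixed element sizes, so they can serve as a rough stand-in for the arithmetic above. In this sketch, Int32Array and Float64Array are assumed to play the roles of C’s int and double. Unlike C, an out-of-bounds read here does not touch neighboring memory.

```javascript
// Element sizes, in bytes, of the typed arrays standing in for C types.
const intSize = Int32Array.BYTES_PER_ELEMENT;      // plays the role of "int"
const doubleSize = Float64Array.BYTES_PER_ELEMENT; // plays the role of "double"

// int n; int x[4]; double m;
const totalBytes = intSize + 4 * intSize + doubleSize;

// The array itself occupies 4 elements of 4 bytes each.
const x = new Int32Array(4);

// Reading one element past the end is bounds-checked in JavaScript:
// it yields undefined instead of whatever happens to be stored next.
const pastTheEnd = x[4];

console.log(totalBytes, x.byteLength, pastTheEnd); // 28 16 undefined
```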

When functions call other functions, each one gets its own chunk of the stack when it is called. That chunk holds all of its local variables, as well as a program counter that remembers where in its execution it was. When the function finishes, its memory block is made available again for other purposes.

Dynamic allocation

Unfortunately, things are not so easy when we don’t know how much memory a variable requires at compile time. Suppose we want to do the following:

int n = readInput(); // reads input from the user
// create an array with "n" elements

Here, at compile time, the compiler does not know how much memory the array requires, because it is determined by the user-supplied value.

It therefore cannot allocate room for the array on the stack. Instead, our program needs to explicitly ask the operating system for the right amount of space at run time. This memory is assigned from the heap space. The differences between static and dynamic memory allocation are shown in the following table:

Memory allocation in JavaScript

Now, we’ll explain how the first step (allocating memory) works in JavaScript.

JavaScript relieves developers of the responsibility of handling memory allocation: JavaScript does it by itself, at the same time as values are declared.

var n = 374; // allocates memory for a number
var s = 'sessionstack'; // allocates memory for a string
var o = {
  a: 1,
  b: null
}; // allocates memory for an object and its contained values
var a = [1, null, 'str'];  // (like object) allocates memory for the
                           // array and its contained values
function f(a) {
  return a + 3;
} // allocates a function (which is a callable object)
// function expressions also allocate an object
someElement.addEventListener('click', function() {
  someElement.style.backgroundColor = 'blue';
}, false);

Some function calls also cause an object to be allocated:

var d = new Date(); // allocates a Date object
var e = document.createElement('div'); // allocates a DOM element

Methods can also allocate new values or objects:

var s1 = 'sessionstack';
var s2 = s1.substr(0, 3); // s2 is a new string
// Since strings are immutable,
// JavaScript may decide to not allocate memory,
// but just store the [0, 3] range.
var a1 = ['str1', 'str2'];
var a2 = ['str3', 'str4'];
var a3 = a1.concat(a2);
// new array with 4 elements being
// the concatenation of a1 and a2 elements

Using memory in JavaScript

Basically, using allocated memory in JavaScript means reading and writing to it.

This can be done by reading or writing the value of a variable or an object property, or even by passing an argument to a function.

Release memory when it is no longer needed

Most memory management problems occur at this stage.

The most difficult task here is to determine when allocated memory is no longer needed. It typically requires the developer to determine which blocks of memory in the program are no longer needed and release them.

High-level languages embed software called a garbage collector, whose job is to keep track of memory allocation and usage in order to discover when allocated memory is no longer needed, in which case it automatically frees it.

Unfortunately, this process is an approximation, because the general problem of knowing whether a piece of memory is still needed is undecidable (it cannot be solved by an algorithm).

Most garbage collectors work by collecting memory that can no longer be accessed, for example because all variables pointing to it have gone out of scope. That is, however, an under-approximation of the memory that could be collected, because at any point a memory location may still have a variable in scope pointing to it and yet never be accessed again.
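
A small sketch of this limitation, with hypothetical names: the history array stays reachable through settings, so the collector has to keep it even though no code ever reads it again. Only the developer knows it is safe to let go.

```javascript
const settings = {
  theme: 'dark',
  // Large data that was needed once, during start-up.
  history: new Array(100000).fill('entry'),
};

function currentTheme() {
  // From here on, the program only ever reads settings.theme.
  return settings.theme;
}

// settings is still in scope, so settings.history is reachable and
// cannot be collected, although nothing will access it again.
// Clearing the reference by hand makes the array collectible.
settings.history = null;
```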

Garbage collection

Because finding out whether some memory is “no longer needed” is undecidable, garbage collectors implement a restricted solution to the general problem. This section explains the notions needed to understand the main garbage collection algorithms and their limitations.

Memory references

The main concept that garbage collection algorithms rely on is that of a reference.

In the context of memory management, an object is said to reference another object if the former has access to the latter (either implicitly or explicitly). For example, a JavaScript object has a reference to its prototype (an implicit reference) and to its property values (explicit references).

In this context, the notion of an “object” is broader than regular JavaScript objects and also includes function scopes (or the global lexical scope).

The lexical scope defines how variable names are resolved in nested functions: the inner function scope can access the scope of the parent function, even if the parent function has returned.
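
For instance, in the following sketch (makeCounter is a name invented for this example), the inner function still reads and updates count after makeCounter has returned. This means the scope holding count must stay alive for as long as the inner function is reachable.

```javascript
function makeCounter() {
  let count = 0; // lives in makeCounter's scope

  // The returned function keeps a reference to that scope.
  return function increment() {
    count += 1;
    return count;
  };
}

const counter = makeCounter(); // makeCounter has returned at this point
const first = counter();
const second = counter();

console.log(first, second); // 1 2
```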

Reference counting garbage collection

This is the simplest garbage collection algorithm. An object is considered “garbage collectible” if there are zero references pointing to it.

Look at the following code:

var o1 = {
  o2: {
    x: 1
  }
};
// 2 objects are created.
// 'o2' is referenced by 'o1' object as one of its properties.
// None can be garbage-collected

var o3 = o1; // the 'o3' variable is the second thing that 
            // has a reference to the object pointed by 'o1'. 
                                                       
o1 = 1;      // now, the object that was originally in 'o1' has a         
            // single reference, embodied by the 'o3' variable

var o4 = o3.o2; // reference to 'o2' property of the object.
                // This object has now 2 references: one as
                // a property. 
                // The other as the 'o4' variable

o3 = '374'; // The object that was originally in 'o1' has now zero
            // references to it. 
            // It can be garbage-collected.
            // However, what was its 'o2' property is still
            // referenced by the 'o4' variable, so it cannot be
            // freed.

o4 = null; // what was the 'o2' property of the object originally in
           // 'o1' has zero references to it. 
           // It can be garbage collected.

Cycles create problems

Reference counting has a limitation when it comes to cycles. In the following example, two objects are created and reference one another, thus creating a cycle. After the function call they go out of scope, so they are effectively useless and could be freed. However, the reference-counting algorithm considers that, since each of the two objects is referenced at least once, neither can be garbage collected.

function f() {
  var o1 = {};
  var o2 = {};
  o1.p = o2; // o1 references o2
  o2.p = o1; // o2 references o1. This creates a cycle.
}

f();
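
The counting itself happens inside the engine and cannot be observed from JavaScript, so the following is only a toy model with explicit counters. Counted, addRef, release, and setField are all invented for this sketch; it replays the function above by hand and shows that neither count ever drops to zero.

```javascript
// Toy model: every object carries an explicit reference count.
class Counted {
  constructor(name) {
    this.name = name;
    this.refCount = 0;
    this.fields = {};
  }
}

// Something new points at obj.
function addRef(obj) {
  obj.refCount += 1;
  return obj;
}

// Something stopped pointing at obj; free it when the count hits zero.
function release(obj, freed) {
  obj.refCount -= 1;
  if (obj.refCount === 0) {
    freed.push(obj.name);
    for (const child of Object.values(obj.fields)) {
      release(child, freed);
    }
  }
}

function setField(owner, key, target) {
  owner.fields[key] = addRef(target);
}

const freed = [];

// Inside f(): two local variables, each holding one reference.
const o1 = addRef(new Counted('o1'));
const o2 = addRef(new Counted('o2'));
setField(o1, 'p', o2); // o1 references o2
setField(o2, 'p', o1); // o2 references o1

// f() returns: both local variables go away.
release(o1, freed);
release(o2, freed);

// Each object is still referenced by the other, so neither count
// reaches zero and nothing was freed.
console.log(o1.refCount, o2.refCount, freed.length); // 1 1 0
```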

Mark-and-sweep algorithm

To decide whether an object is needed, this algorithm determines whether the object is reachable.

Mark-and-sweep goes through these three steps:

  1. Roots: In general, roots are global variables that are referenced in the code. In JavaScript, for example, a global variable that can act as a root is the “window” object. The equivalent object in Node.js is called “global”. The garbage collector builds a complete list of all roots.

  2. The algorithm then inspects all roots and their children and marks them as active (meaning they are not garbage). Anything that cannot be reached from a root is marked as garbage.

  3. Finally, the garbage collector frees all pieces of memory that are not marked as active and returns that memory to the operating system.
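
The three steps can be sketched over a toy heap of plain objects. This is a simplified model rather than how any real engine is implemented: collectGarbage, the heap array, and the root object are assumptions made for illustration, and "freeing" is reduced to returning the list of unmarked objects.

```javascript
// Toy heap: plain objects whose object-valued properties are references.
function collectGarbage(heap, roots) {
  // Steps 1 and 2: start from the roots and mark everything reachable.
  const marked = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (marked.has(obj)) continue; // already visited (this handles cycles)
    marked.add(obj);
    for (const value of Object.values(obj)) {
      if (value !== null && typeof value === 'object') {
        pending.push(value);
      }
    }
  }
  // Step 3: everything on the heap that was not marked is garbage.
  return heap.filter((obj) => !marked.has(obj));
}

const root = { name: 'root' };
const kept = { name: 'kept' };
const a = { name: 'a' };
const b = { name: 'b' };
root.child = kept; // reachable from the root
a.p = b;           // a and b only reference each other
b.p = a;

const garbage = collectGarbage([root, kept, a, b], [root]);
console.log(garbage.map((obj) => obj.name));
```

In this model the two objects that only reference each other end up in the garbage list, because reachability is computed from the root rather than from reference counts.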

Since 2012, all modern browsers have shipped with a mark-and-sweep garbage collector. All of the improvements made to JavaScript garbage collection over the last few years (generational, incremental, concurrent, and parallel garbage collection) are improvements to the implementation of this algorithm (mark-and-sweep), not to the garbage collection algorithm itself or to its goal of deciding whether an object is reachable.

A dedicated article on tracing garbage collection covers the topic in greater detail, including mark-and-sweep and its optimizations.

Cycles are not a problem anymore

In the cycle example above, after the function call returns, the two objects are no longer referenced by anything reachable from the global object. Consequently, the garbage collector finds them unreachable, even though they still reference each other.

Counterintuitive behavior of the garbage collector

As convenient as garbage collectors are, they come with their own set of trade-offs. One of them is non-determinism. In other words, garbage collectors are unpredictable: you cannot really tell when a collection will be performed. This means that in some cases a program uses more memory than it actually needs. In other cases, short pauses may be noticeable in particularly sensitive applications. Although non-determinism means we cannot be certain when a collection will run, most garbage collector implementations share the common pattern of doing collection passes during allocation. If no allocations are performed, most garbage collectors stay idle. Consider the following scenario:

  1. A sizable set of allocations is performed.

  2. Most (or all) of these elements are marked as unreachable (suppose we null a reference pointing to a cache we no longer need).

  3. No further allocations are performed.

In this scenario, most garbage collectors will not run any further collection passes. In other words, even though there are unreachable references available for collection, the collector does not reclaim them. These are not strictly leaks, but they still result in higher-than-usual memory usage.

What is a memory leak?

As the name implies, a memory leak is a piece of memory that the application used in the past and no longer needs, but that has not yet been returned to the operating system or to the pool of free memory.

An undecidable problem

Programming languages favor different ways of managing memory, yet whether a given piece of memory is still in use is an undecidable problem. Some languages provide features that help developers with memory allocation and reclamation. Others expect developers to be completely explicit about when a piece of memory is unused. Wikipedia has good articles on manual and automatic memory management.

Four common JavaScript memory leaks

1: Global variables

JavaScript handles undeclared variables in an interesting way: when you assign to an undeclared variable, a new variable is created on the global object. In the browser, the global object is window, which means that this code:

function foo(arg) {
    bar = "some text";
}

is equivalent to:

function foo(arg) {
    window.bar = "some text";
}

Let’s say bar was only meant to reference a variable inside the scope of the foo function. If you don’t use var to declare it, however, a redundant global variable is created. In this case the impact is not significant, but you can certainly imagine a more damaging scenario.

You can also accidentally create a global variable by using this:

function foo() {
    this.var1 = "potential accidental global";
}
// Foo called on its own, this points to the global object (window)
// rather than being undefined.
foo();

You can avoid all of this by adding “use strict” at the beginning of your JavaScript files. It switches on a much stricter mode of parsing JavaScript that prevents the accidental creation of global variables.
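
To see what strict mode changes, here is a minimal sketch that applies the directive per function instead of per file. The function names and the undeclaredBar identifier are made up for this example.

```javascript
function assignUndeclared() {
  'use strict';
  try {
    undeclaredBar = 'some text'; // no var/let/const anywhere
    return 'created a global';
  } catch (err) {
    return err.name;
  }
}

function readThis() {
  'use strict';
  return this; // plain call: there is no receiver object
}

const assignmentResult = assignUndeclared();
const thisValue = readThis();

console.log(assignmentResult, thisValue); // ReferenceError undefined
```

In strict mode the accidental assignment throws instead of silently creating a global, and this is undefined in a plain function call, so this.var1 = ... would throw a TypeError rather than write to the global object.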

Unexpected globals are certainly an issue, but more often than not your code is also full of explicitly declared global variables which, by definition, cannot be collected by the garbage collector. Pay special attention to global variables used to temporarily store and process large amounts of information. If you must use a global variable to store data, make sure to assign null to it or reassign it once you are done with it.

2: Forgotten timers or callbacks

Let’s take setInterval as an example, because it is used often in JavaScript.

Libraries that provide observers and other tools accepting callbacks usually make sure that all references to those callbacks become unreachable once their own instances become unreachable. Still, code like the following is not uncommon:

var serverData = loadData();
setInterval(function() {
    var renderer = document.getElementById('renderer');
    if (renderer) {
        renderer.innerHTML = JSON.stringify(serverData);
    }
}, 5000); // This will be executed every ~5 seconds.

The code snippet above shows the consequences of using timers to reference nodes or data that are no longer needed.

The renderer object may be replaced or removed at some point, which makes the block wrapped by the interval handler redundant. If that happens, neither the handler nor its dependencies are collected, because the interval (remember, it is still active) would need to be stopped first. It follows that serverData, which surely stores and processes a lot of data, is not collected either.
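
One way to avoid this is to keep the timer id and clear the interval once the rendering target is gone. The sketch below assumes a hypothetical startPolling helper and replaces the DOM update with a stand-in so that it runs outside the browser.

```javascript
// Starts the periodic work and returns a function that stops it.
function startPolling(task, intervalMs) {
  const timerId = setInterval(task, intervalMs);
  return function stop() {
    // Once cleared, the timer no longer holds on to task or anything
    // task closes over, so all of it becomes collectible.
    clearInterval(timerId);
  };
}

let serverData = { rows: new Array(1000).fill('row') };
let renderCount = 0;

const stopPolling = startPolling(function () {
  renderCount += 1;
  JSON.stringify(serverData); // stands in for updating the page
}, 5000);

// Later, when the renderer is removed from the page:
stopPolling();
serverData = null;
```

Returning a stop function keeps the timer id private and gives the caller a single, obvious place to release the handler and its data.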

When using observers, make sure you explicitly remove them once you are done with them (either the observer is no longer needed, or the object will become unreachable).

Fortunately, most modern browsers do this for you: once the observed object becomes unreachable, they automatically collect the observer handlers, even if you forgot to remove the listener. In the past, some browsers could not handle these cases (good old Internet Explorer 6, for example).

However, it is best practice to remove the observer as soon as the object becomes obsolete. Look at the following example:

var element = document.getElementById('launch-button');
var counter = 0;
function onClick(event) {
   counter++;
   element.innerHTML = 'text ' + counter;
}
element.addEventListener('click', onClick);
// Do stuff
element.removeEventListener('click', onClick);
element.parentNode.removeChild(element);
// Now when element goes out of scope,
// both element and onClick will be collected even in old browsers
// that don't handle cycles well.

Nowadays you no longer need to call removeEventListener before making a node unreachable, because modern browsers ship garbage collectors that can detect these cycles and handle them appropriately.

If you use the jQuery APIs (other libraries and frameworks support this too), you can also have the listeners removed before a node becomes obsolete. The library also makes sure there are no memory leaks, even when the application runs in older browser versions.

3: Closures

Closures are a key aspect of JavaScript development: an inner function has access to the variables of its outer (enclosing) function. Due to implementation details of the JavaScript runtime, it is possible to leak memory in the following way:

var theThing = null;
var replaceThing = function () {
  var originalThing = theThing;
  var unused = function () {
    if (originalThing) // a reference to 'originalThing'
      console.log("hi");
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log("message");
    }
  };
};
setInterval(replaceThing, 1000);

Once replaceThing is called, theThing gets a new object that consists of a large string and a new closure (someMethod). However, originalThing is referenced by the closure held in the unused variable (originalThing being the value theThing had after the previous call to replaceThing). The thing to remember is that once a scope is created for closures that live in the same parent scope, that scope is shared.

In this case, the scope created for the closure someMethod is shared with unused, and unused holds a reference to originalThing. Even though unused is never called, someMethod can be called through theThing from outside the scope of replaceThing (for example, somewhere globally). And because someMethod shares its closure scope with unused, the reference that unused holds to originalThing forces it to stay active, which prevents it from being garbage collected.

All of this can add up to a considerable memory leak. When the snippet above runs over and over, you can expect to see memory usage climb, and it does not shrink when the garbage collector runs. A linked list of closures is created (in this case, its root is the theThing variable), and each closure scope carries an indirect reference to the large string.
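
One commonly suggested way to break the chain, and as far as we know the fix described in the Meteor write-up mentioned below, is to clear originalThing at the end of replaceThing. The sketch calls the function in a loop instead of on a timer so that it terminates; the effect on memory is an assumption to verify with a heap profiler rather than something this code can show.

```javascript
let theThing = null;

function replaceThing() {
  let originalThing = theThing;
  const unused = function () {
    if (originalThing) console.log('hi');
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log('message');
    },
  };
  // The shared closure scope still exists, but it no longer points at
  // the previous object, so that object can be collected.
  originalThing = null;
}

// Run a few rounds directly instead of on a timer.
for (let i = 0; i < 5; i += 1) {
  replaceThing();
}

console.log(theThing.longStr.length);
```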

The problem was discovered by the Meteor team, who have an excellent article describing the problem in detail.

4: Out-of-DOM references

In some cases, developers store DOM nodes inside data structures. Suppose you want to quickly update the contents of several rows in a table. If you store a reference to each DOM row in a dictionary or an array, there are two references to the same DOM element: one in the DOM tree and one in the dictionary. If you decide to get rid of these rows, you need to remember to make both references unreachable.

var elements = {
    button: document.getElementById('button'),
    image: document.getElementById('image')
};
function doStuff() {
    elements.image.src = 'http://example.com/image_name.png';
}
function removeImage() {
    // The image is a direct child of the body element.
    document.body.removeChild(document.getElementById('image'));
    // At this point, we still have a reference to #button in the
    // global elements object. In other words, the button element is
    // still in memory and cannot be collected by the GC.
}

There is an additional consideration when it comes to references to inner or leaf nodes of a DOM tree. If your code keeps a reference to a table cell (a <td> tag) and you decide to remove the table from the DOM while keeping the reference to that particular cell, you can expect a major memory leak. You might think the garbage collector would free everything except that cell. That is not the case, however. Because the cell is a child node of the table, and child nodes keep references to their parents, that single reference to a table cell keeps the whole table in memory.
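
The effect of those parent references can be sketched without a browser. The plain objects below are stand-ins for DOM nodes, and makeNode, buildTable, and ancestorTags are hypothetical helpers; the point is only that everything reachable by walking up from the cached cell stays alive with it.

```javascript
// Minimal stand-ins for DOM nodes: each child points back to its parent.
function makeNode(tag, parentNode) {
  const node = { tag, parentNode, children: [] };
  if (parentNode) parentNode.children.push(node);
  return node;
}

function buildTable() {
  const table = makeNode('table', null);
  const row = makeNode('tr', table);
  return makeNode('td', row); // only the cell escapes this function
}

// Lists what a node keeps reachable through its parent chain.
function ancestorTags(node) {
  const tags = [];
  for (let cur = node.parentNode; cur; cur = cur.parentNode) {
    tags.push(cur.tag);
  }
  return tags;
}

let cachedCell = buildTable();
const keptAlive = ancestorTags(cachedCell);
console.log(keptAlive);

// Dropping the cached cell makes the row and the table unreachable too.
cachedCell = null;
```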

We at SessionStack try to follow these best practices to write code that handles memory allocation correctly for the following reasons:

Once you integrate SessionStack into your web application, it starts logging everything: all DOM changes, user interactions, JavaScript exceptions, stack traces, failed network requests, debug messages, and so on.

Using SessionStack, you can replay problems in your web application as a video and see everything that happened to the user. All of this must be done without impacting the performance of the Web application.

Since the user can reload the page or navigate around the application, all observers, interceptors, variable allocations, and so on must be handled properly so that they cause no memory leaks and do not increase the memory consumption of the web application we are integrated into.

Resources

  • Ideas from www-bcf.usc.edu/~dkempe/CS1…

  • Ideas from David Glasser at blog.meteor.com/an-interest…

  • Ideas from Sebastian Peyrott at auth0.com/blog/four-t…

  • Concepts section from the MDN web docs at developer.mozilla.org/en-US/docs/…
