Most Java programmers have heard, and believe, that “unused objects should be manually assigned to null”. Asked why, they usually answer “to help the GC reclaim memory earlier and reduce the memory footprint”, but few can explain it any further.

Because there is so much misinformation about this issue on the Internet, this article uses examples to analyze what “assigning null when the object is no longer in use” actually accomplishes. It tries to avoid jargon, but you still need a basic understanding of the JVM.

The sample code

Let’s look at a very simple piece of code:

    public static void main(String[] args) {
        if (true) {
            byte[] placeHolder = new byte[64 * 1024 * 1024];
            System.out.println(placeHolder.length / 1024);
        }
        System.gc();
    }

We instantiate an array placeHolder inside the if block and then, outside the if scope, manually trigger the GC with System.gc(). The intent is for the GC to reclaim placeHolder, because placeHolder is no longer accessible. Take a look at the output:

    65536
    [GC 68239K->65952K, 0.0014820 secs]
    [Full GC 65952K->65881K, 0.0093860 secs]

The line Full GC 65952K->65881K shows that the memory footprint barely changed, going from 65952K to 65881K. The placeHolder has not been reclaimed by the GC.

Now let’s look at the case where the advice “unused objects should be manually assigned to null” is followed:

    public static void main(String[] args) {
        if (true) {
            byte[] placeHolder = new byte[64 * 1024 * 1024];
            System.out.println(placeHolder.length / 1024);
            placeHolder = null;
        }
        System.gc();
    }

Its output is:

    65536
    [GC 68239K->65952K(125952K), 0.0014910 secs]
    [Full GC 65952K->345K(125952K), 0.0099610 secs]

After this GC, the memory footprint drops to 345K, i.e., placeHolder has been successfully reclaimed! Comparing the two pieces of code, simply assigning null to placeHolder solves the GC problem, seemingly thanks to “unused objects should be manually assigned to null”.
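If you want to repeat the comparison yourself, the following is a minimal sketch rather than the article's original setup. The article does not say how its GC log was produced (a standard option such as -verbose:gc prints similar lines on HotSpot), so this version reads the used heap through Runtime instead. The class name NullAssignmentProbe, the helper usedKb() and the parameters megabytes and assignNull are all made up for illustration. Keep in mind that System.gc() is only a request and that the numbers depend on the JVM, the collector and whether the method has been JIT-compiled.

```java
final class NullAssignmentProbe {
    /** Approximate heap currently in use, in KB. */
    static long usedKb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / 1024;
    }

    /**
     * Mirrors the article's example: allocate inside the if block, optionally
     * assign null, then request a GC while this method's frame is still live.
     */
    static long usedKbAfterGc(int megabytes, boolean assignNull) {
        if (true) {
            byte[] placeHolder = new byte[megabytes * 1024 * 1024];
            System.out.println(placeHolder.length / 1024);
            if (assignNull) {
                placeHolder = null; // the manual "assign null" step
            }
        }
        System.gc(); // a request only; the JVM may ignore it
        return usedKb();
    }

    public static void main(String[] args) {
        // 64 MB, as in the article. Results vary between JVMs and settings.
        System.out.println("without null: " + usedKbAfterGc(64, false) + "K");
        System.out.println("with null:    " + usedKbAfterGc(64, true) + "K");
    }
}
```

On a setup like the article's, the first figure would be expected to stay around 64 MB higher than the second, but treat any single run as an observation, not a proof.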

But wait: why, in the first example, does the GC fail to notice that placeHolder should be reclaimed unless it is assigned null? That is the crux of the matter.

The runtime stack

A typical runtime stack

If you know anything about compilers, or the underlying mechanics of program execution, you know that when a method executes, its local variables are allocated on the stack. Of course, in Java, an object created with new lives in the heap, but the stack still holds a pointer to that object, stored there just like an int.

For example, for this code:

    public static void main(String[] args) {
        int a = 1;
        int b = 2;
        int c = a + b;
    }

The state of the runtime stack can be pictured as:

    Index   Variable
    1       a
    2       b
    3       c

The “index” is the position of a variable in the stack; variables are placed on the stack in the order in which the code within the method is executed.

Another example:

    public static void main(String[] args) {
        if (true) {
            int a = 1;
            int b = 2;
            int c = a + b;
        }
        int d = 4;
    }

The runtime stack is:

    Index   Variable
    1       a
    2       b
    3       c
    4       d

Easy to understand? In fact, if you think about this example, there is room to optimize the runtime stack.

Java stack optimization

In the example above, the main() method occupies four stack indexes when it runs, but it doesn’t need to. Once the if block has finished executing, variables a, b, and c can never be accessed again, so the indexes 1 to 3 that they occupy can be reused, for example:

    Index   Variable
    1       a
    2       b
    3       c
    1       d

Variable d reuses the stack index of variable a, which saves memory.
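The reuse rule can be sketched as a toy allocator. This is only a model of the simplified picture above, not how javac is actually implemented: the class SlotAllocatorSketch and its methods enterScope(), exitScope() and declare() are hypothetical names, indexes start at 1 as in the tables, and every variable is assumed to take exactly one index.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of scope-based index reuse; all names here are illustrative. */
final class SlotAllocatorSketch {
    private int nextIndex = 1; // the tables in this article start at index 1
    private final Deque<Integer> scopeStarts = new ArrayDeque<>();
    private final Map<String, Integer> assigned = new LinkedHashMap<>();

    /** Entering a block: remember where its variables begin. */
    void enterScope() {
        scopeStarts.push(nextIndex);
    }

    /** Leaving a block: the indexes it used become free for later variables. */
    void exitScope() {
        nextIndex = scopeStarts.pop();
    }

    /** Declaring a variable takes the next free index. */
    int declare(String name) {
        assigned.put(name, nextIndex);
        return nextIndex++;
    }

    Map<String, Integer> assigned() {
        return assigned;
    }

    public static void main(String[] args) {
        SlotAllocatorSketch stack = new SlotAllocatorSketch();
        stack.enterScope();   // if (true) {
        stack.declare("a");
        stack.declare("b");
        stack.declare("c");
        stack.exitScope();    // }
        stack.declare("d");   // reuses the index freed by a
        System.out.println(stack.assigned()); // {a=1, b=2, c=3, d=1}
    }
}
```

Note that the allocator only decides which index a later variable gets. Nothing in it clears the old contents of an index when a scope ends, which matters for the rest of the article.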

Note

The terms “runtime stack” and “index” above were deliberately invented to make the introduction easier; in the JVM they are actually called the “local variable table” and the “Slot”, respectively. Moreover, the local variable table is determined at compile time; it does not have to wait until run time.

A glance at GC

Here’s a quick look at one very simple part of mainstream GC design: how it determines that an object can be reclaimed or, put another way, how it determines that an object is alive.

In the Java world, objects are related to other objects: we can get from one object to another by following references, as shown in the figure.

Taken together, the reference relationships between these objects form a large graph; more specifically, a lot of trees.

If we find all the roots, we can walk down from them and find all the living objects; the objects that are never reached are dead, and the GC can reclaim them.

Now the question is, how do you find the roots? The JVM has rules for this, one of which is: objects referenced from the stack. That is, as long as an object in the heap has a reference in the stack, it is considered alive.
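The idea of walking down from the roots can be sketched in a few lines. This is a toy illustration and not the JVM's collector: the Node class stands in for a heap object, and the roots list stands in for the references held on the stack; both are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class ReachabilitySketch {
    /** Hypothetical heap object: a name plus its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Returns every node that can be reached by following references from the roots. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        a.refs.add(b); // a -> b
        c.refs.add(d); // c -> d, but no root leads to c

        Set<Node> live = mark(Arrays.asList(a)); // only a is referenced from the "stack"
        System.out.println(live.contains(b)); // true
        System.out.println(live.contains(d)); // false
    }
}
```

Objects c and d reference each other's subtree but are never reached from a root, so in this model they would be reclaimable.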

Note

The algorithm described above for determining which objects can be reclaimed is called the “reachability analysis algorithm”.

The JVM “bug”

Let’s go back to our original example:

    public static void main(String[] args) {
        if (true) {
            byte[] placeHolder = new byte[64 * 1024 * 1024];
            System.out.println(placeHolder.length / 1024);
        }
        System.gc();
    }

Take a look at the runtime stack:

    LocalVariableTable:
      Start  Length  Slot  Name         Signature
          0      21     0  args         [Ljava/lang/String;
          5      12     1  placeHolder  [B

The first index in the stack holds args, the parameter passed to the method, which is of type String[]; the second index holds placeHolder, which is of type byte[].

Now we can see why placeHolder was not reclaimed: when System.gc() triggers the GC, the runtime stack of the main() method still holds references to args and placeHolder. The GC determines that both objects are alive and does not collect them. That is, although the code has left the scope of the if, nothing has read from or written to the runtime stack since then, and the index occupied by placeHolder has not been reused by another variable, so the GC still considers the array alive.

To verify this inference, we declare a variable before System.gc() that will reuse placeHolder’s index, as described earlier in “Java stack optimization.”

    public static void main(String[] args) {
        if (true) {
            byte[] placeHolder = new byte[64 * 1024 * 1024];
            System.out.println(placeHolder.length / 1024);
        }
        int replacer = 1;
        System.gc();
    }

Take a look at the runtime stack:

    LocalVariableTable:
      Start  Length  Slot  Name         Signature
          0      23     0  args         [Ljava/lang/String;
          5      12     1  placeHolder  [B
         19       4     1  replacer     I

Not surprisingly, replacer reuses placeHolder’s index. Let’s take a look at the GC output:

    65536
    [GC 68239K->65984K(125952K), 0.0011620 secs]
    [Full GC 65984K->345K(125952K), 0.0095220 secs]

The placeHolder has been successfully reclaimed, and our inference is confirmed.

From the point of view of the runtime stack, int replacer = 1; and placeHolder = null; do the same thing: each one cuts the reference from the stack to the array in the heap, allowing the GC to determine that the array is dead.
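The equivalence of the two statements can be illustrated with a toy model of a single stack frame. This is a sketch under simplifying assumptions and says nothing about HotSpot internals: FrameSlotSketch, store() and roots() are hypothetical names, and a primitive value in a slot is modelled as a boxed Integer.

```java
final class FrameSlotSketch {
    // Each slot holds either a reference to a heap object or a primitive
    // value (modelled here as a boxed Integer); null means "no reference".
    private final Object[] slots;

    FrameSlotSketch(int slotCount) {
        slots = new Object[slotCount];
    }

    /** Writes a value into a slot, overwriting whatever was there. */
    void store(int slot, Object value) {
        slots[slot] = value;
    }

    /** True while some slot still refers to heapObject, i.e. the frame keeps it alive. */
    boolean roots(Object heapObject) {
        for (Object s : slots) {
            if (s == heapObject) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] placeHolder = new byte[16]; // stands in for the 64 MB array
        FrameSlotSketch frame = new FrameSlotSketch(2);
        frame.store(0, args);
        frame.store(1, placeHolder);       // byte[] placeHolder = new byte[...];

        // Leaving the if block writes nothing to the frame, so slot 1 is unchanged.
        System.out.println(frame.roots(placeHolder)); // true

        frame.store(1, 1);                 // int replacer = 1; reuses slot 1
        System.out.println(frame.roots(placeHolder)); // false

        frame.store(1, placeHolder);       // start over
        frame.store(1, null);              // placeHolder = null;
        System.out.println(frame.roots(placeHolder)); // false
    }
}
```

In both cases the only thing that changed is the content of slot 1; the array itself was never touched.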

Now the reasoning behind “unused objects should be manually assigned to null” is clear: it all stems from a “bug” in the JVM, namely that the reference from the stack to the heap is not automatically cut when the code leaves the variable’s scope. Why does this “bug” persist? Don’t you think the odds of this situation arising are small? It’s a tradeoff.

Conclusion

Hopefully by this point you understand the reasoning behind the saying that unused objects should be manually assigned to null. I tend to agree with the author of Understanding the Java Virtual Machine: feel free to assign null to unused objects where it helps, but don’t rely on it too much, and don’t promote it as a general rule.