Java has four kinds of references (plus a few internal ones such as FinalReference): strong, soft, weak, and phantom (sometimes translated as "virtual").

A strong reference is the ordinary form, for example Object a = new Object();. Java has no Reference class for it.

This article analyzes the implementation of soft, weak, and phantom references. All three types extend the Reference class, which also holds most of the logic.

Questions

Before diving in, here are a few questions to keep in mind.

1. Most articles online describe soft references like this: a soft reference is reclaimed only when memory is insufficient. How exactly is "insufficient memory" defined?

2. Most articles online describe phantom references like this: a phantom reference is as good as nonexistent and does not affect the life cycle of its referent; it is mainly used to track when an object is collected by the garbage collector. Is that really so?

3. Where are phantom references used in the JDK?

Reference

Let's take a look at a few fields in Reference.java:

public abstract class Reference<T> {
    // The referenced object
    private T referent;
    // The notification queue, specified by the caller in the Reference constructor
    volatile ReferenceQueue<? super T> queue;
    // When the Reference is added to the queue, this field is set to the next
    // element in the queue, forming a linked list
    volatile Reference next;
    // During GC, the JVM maintains a linked list called DiscoveredList that
    // contains Reference objects; discovered points to the next element in
    // that list. Set by the JVM
    transient private Reference<T> discovered;
    // Lock object used for thread synchronization
    static private class Lock { }
    private static Lock lock = new Lock();
    // Reference objects waiting to be enqueued; set by the JVM during GC.
    // A Java-level thread (ReferenceHandler) keeps taking elements from
    // pending and adding them to their queue
    private static Reference<Object> pending = null;
}

The lifecycle of a Reference object is split between the native layer and the Java layer.

During GC, the native layer adds the Reference objects that need processing to the DiscoveredList (see the process_discovered_references method in referenceProcessor.cpp). It then moves the elements of the DiscoveredList to the PendingList (see the enqueue_discovered_ref_helper method in referenceProcessor.cpp). The head of the PendingList is the pending field of the Reference class.

Now take a look at the Java layer code:

private static class ReferenceHandler extends Thread {
    ...
    public void run() {
        while (true) {
            tryHandlePending(true);
        }
    }
}

static boolean tryHandlePending(boolean waitForNotify) {
    Reference<Object> r;
    Cleaner c;
    try {
        synchronized (lock) {
            if (pending != null) {
                r = pending;
                c = r instanceof Cleaner ? (Cleaner) r : null;
                // Unlink r from the PendingList
                pending = r.discovered;
                r.discovered = null;
            } else {
                // If pending is null, wait. The JVM calls notify when an
                // object is added to the PendingList
                if (waitForNotify) {
                    lock.wait();
                }
                // retry if waited
                return waitForNotify;
            }
        }
    }
    ...

    // If the reference is a Cleaner, call its clean method to release resources
    if (c != null) {
        c.clean();
        return true;
    }

    // Add the Reference to its ReferenceQueue; developers can learn that the
    // referent was reclaimed by polling elements from the ReferenceQueue
    ReferenceQueue<? super Object> q = r.queue;
    if (q != ReferenceQueue.NULL) q.enqueue(r);
    return true;
}

The process is fairly simple: the thread keeps taking elements from the PendingList and adds each one to its ReferenceQueue. Developers can detect that an object has been reclaimed by polling elements from the ReferenceQueue.

Note the extra handling for Cleaner objects (Cleaner extends PhantomReference): when the object a Cleaner refers to is reclaimed, its clean method is called to release the associated resources. DirectByteBuffer uses a Cleaner to free off-heap memory, which is the typical use of phantom references in Java.

Having looked at the implementation of Reference, let's see how its subclasses differ.
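The Cleaner idea described above can be sketched in user code. This is a minimal illustration, not the JDK's Cleaner: ResourceTracker, TrackedRef, register, and processQueue are hypothetical names, and the call to enqueue() in main only stands in for the GC enqueueing the reference.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ResourceTracker {
    /** Hypothetical Cleaner-like reference that carries the cleanup action for its referent. */
    static final class TrackedRef extends PhantomReference<Object> {
        private final Runnable cleanup;

        TrackedRef(Object referent, ReferenceQueue<Object> queue, Runnable cleanup) {
            super(referent, queue);
            this.cleanup = cleanup;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps the TrackedRef objects themselves reachable until they are processed.
    private final Set<TrackedRef> live = ConcurrentHashMap.newKeySet();

    /** Registers a cleanup action to run after owner has been reported through the queue. */
    public Reference<Object> register(Object owner, Runnable cleanup) {
        TrackedRef ref = new TrackedRef(owner, queue, cleanup);
        live.add(ref);
        return ref;
    }

    /** Runs the cleanup of every reference currently in the queue; returns how many ran. */
    public int processQueue() {
        int processed = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            TrackedRef ref = (TrackedRef) r;
            if (live.remove(ref)) {
                ref.cleanup.run();
                processed++;
            }
            ref.clear();
        }
        return processed;
    }

    public static void main(String[] args) {
        ResourceTracker tracker = new ResourceTracker();
        Object owner = new Object();
        Reference<Object> ref = tracker.register(owner, () -> System.out.println("released"));
        ref.enqueue(); // stands in for the GC adding the reference to the queue
        System.out.println(tracker.processQueue());
    }
}
```

In real use, processQueue would be driven by a background thread or called periodically, much like ReferenceHandler drains the PendingList.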

SoftReference

public class SoftReference<T> extends Reference<T> {

    static private long clock;

    private long timestamp;

    public SoftReference(T referent) {
        super(referent);
        this.timestamp = clock;
    }

    public SoftReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
        this.timestamp = clock;
    }

    public T get() {
        T o = super.get();
        if (o != null && this.timestamp != clock)
            this.timestamp = clock;
        return o;
    }
}

Soft references are simple to implement; they add just two fields, clock and timestamp. clock is a static variable that is set to the current time on every GC. timestamp is set to the value of clock each time get is called (if the two differ and the referent has not been reclaimed).

So what are these two fields for? And what do they have to do with soft references being reclaimed when memory runs low?

To answer that we have to read the JVM source, because the decision whether an object should be reclaimed is made during GC.

size_t
ReferenceProcessor::process_discovered_reflist(
  DiscoveredList               refs_lists[],
  ReferencePolicy*             policy,
  bool                         clear_referent,
  BoolObjectClosure*           is_alive,
  OopClosure*                  keep_alive,
  VoidClosure*                 complete_gc,
  AbstractRefProcTaskExecutor* task_executor)
{
  ...
  // Remember the DiscoveredList mentioned above? refs_lists is the DiscoveredList.
  // The DiscoveredList is processed in several phases; SoftReference is handled in phase 1
  ...
      for (uint i = 0; i < _max_num_q; i++) {
        process_phase1(refs_lists[i], policy,
                       is_alive, keep_alive, complete_gc);
      }
  ...
}

// The main purpose of this phase is to remove SoftReferences from refs_list
// when memory is sufficient.
void
ReferenceProcessor::process_phase1(DiscoveredList&    refs_list,
                                   ReferencePolicy*   policy,
                                   BoolObjectClosure* is_alive,
                                   OopClosure*        keep_alive,
                                   VoidClosure*       complete_gc) {

  DiscoveredListIterator iter(refs_list, keep_alive, is_alive);
  // Decide which softly reachable refs should be kept alive.
  while (iter.has_next()) {
    iter.load_ptrs(DEBUG_ONLY(!discovery_is_atomic() /* allow_null_referent */));
    // Check whether the referent is still alive
    bool referent_is_dead = (iter.referent() != NULL) && !iter.is_referent_alive();
    // If the referent is dead, ask the ReferencePolicy whether it should be cleared
    if (referent_is_dead &&
        !policy->should_clear_reference(iter.obj(), _soft_ref_timestamp_clock)) {
      if (TraceReferenceGC) {
        gclog_or_tty->print_cr("Dropping reference (" INTPTR_FORMAT ": %s"  ") by policy",
                               (void *)iter.obj(), iter.obj()->klass()->internal_name());
      }
      // Remove Reference object from list
      iter.remove();
      // Make the Reference object active again
      iter.make_active();
      // keep the referent around
      iter.make_referent_alive();
      iter.move_to_next();
    } else {
      iter.next();
    }
  }
  ...
}

refs_lists holds the references of one type (phantom, soft, weak, and so on) discovered by this GC. The job of process_discovered_reflist is to remove from refs_lists the references that do not need to be reclaimed. Whatever remains at the end needs to be reclaimed, and the first remaining element is eventually assigned to the Reference.java#pending field mentioned above.

There are four implementations of ReferencePolicy: NeverClearPolicy, AlwaysClearPolicy, LRUCurrentHeapPolicy and LRUMaxHeapPolicy.

NeverClearPolicy always returns false, meaning a SoftReference is never cleared (this class does not appear to be used inside the JVM). AlwaysClearPolicy always returns true, meaning a SoftReference is always cleared; the setup_policy method in referenceProcessor.hpp can switch the policy to AlwaysClearPolicy.

LRUCurrentHeapPolicy and LRUMaxHeapPolicy have the same should_clear_reference method:

bool LRUMaxHeapPolicy::should_clear_reference(oop p,
                                             jlong timestamp_clock) {
  jlong interval = timestamp_clock - java_lang_ref_SoftReference::timestamp(p);
  assert(interval >= 0, "Sanity check");

  // The interval will be zero if the ref was accessed since the last scavenge/gc.
  if(interval <= _max_interval) {
    return false;
  }

  return true;
}

timestamp_clock is the static clock field of SoftReference, and java_lang_ref_SoftReference::timestamp(p) reads the timestamp field. If SoftReference#get was called after the last GC, interval is 0; otherwise it is the time spanned by the GCs that have run since the last call.

_max_interval represents a critical value, and its value is different between LRUCurrentHeapPolicy and LRUMaxHeapPolicy.

void LRUCurrentHeapPolicy::setup() {
  _max_interval = (Universe::get_heap_free_at_last_gc() / M) * SoftRefLRUPolicyMSPerMB;
  assert(_max_interval >= 0,"Sanity check");
}

void LRUMaxHeapPolicy::setup() {
  size_t max_heap = MaxHeapSize;
  max_heap -= Universe::get_heap_used_at_last_gc();
  max_heap /= M;

  _max_interval = max_heap * SoftRefLRUPolicyMSPerMB;
  assert(_max_interval >= 0,"Sanity check");
}

SoftRefLRUPolicyMSPerMB defaults to 1000. LRUCurrentHeapPolicy derives _max_interval from the heap that was free after the last GC, while LRUMaxHeapPolicy derives it from the maximum heap size minus the heap in use at the last GC.

This tells us exactly when a SoftReference is reclaimed: it depends on the policy in use (LRUCurrentHeapPolicy in the client VM; server VM builds default to LRUMaxHeapPolicy), the amount of available heap, and the last time get was called on the SoftReference.
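To make the formula concrete, here is a small Java sketch of the same arithmetic. It is only a model of the C++ logic quoted above, not JVM code; the class and method names are made up for illustration, and the heap sizes and timestamps in main are arbitrary example values.

```java
public class SoftRefPolicySketch {
    private static final long M = 1024L * 1024L;
    // Default of SoftRefLRUPolicyMSPerMB, as stated in the article.
    public static final long SOFT_REF_LRU_POLICY_MS_PER_MB = 1000L;

    /** LRUCurrentHeapPolicy: based on the heap that was free after the last GC. */
    public static long currentHeapMaxInterval(long heapFreeAtLastGcBytes) {
        return (heapFreeAtLastGcBytes / M) * SOFT_REF_LRU_POLICY_MS_PER_MB;
    }

    /** LRUMaxHeapPolicy: based on the maximum heap minus the heap used at the last GC. */
    public static long maxHeapMaxInterval(long maxHeapBytes, long heapUsedAtLastGcBytes) {
        return ((maxHeapBytes - heapUsedAtLastGcBytes) / M) * SOFT_REF_LRU_POLICY_MS_PER_MB;
    }

    /** Models should_clear_reference: clear only if idle for longer than maxIntervalMs. */
    public static boolean shouldClear(long clockMs, long timestampMs, long maxIntervalMs) {
        long interval = clockMs - timestampMs;
        return interval > maxIntervalMs;
    }

    public static void main(String[] args) {
        // 100 MB free after the last GC gives a threshold of 100 * 1000 ms.
        long maxInterval = currentHeapMaxInterval(100 * M);
        System.out.println(maxInterval);                                 // 100000
        // Referent last read 50 s before the latest GC: kept.
        System.out.println(shouldClear(500_000, 450_000, maxInterval));  // false
        // Referent last read 150 s before the latest GC: cleared.
        System.out.println(shouldClear(500_000, 350_000, maxInterval));  // true
    }
}
```

Put differently, with the default setting each free megabyte of heap buys a softly reachable object roughly one more second of life after its last get.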

WeakReference

public class WeakReference<T> extends Reference<T> {

    public WeakReference(T referent) {
        super(referent);
    }

    public WeakReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }
}

As you can see, at the Java level WeakReference simply extends Reference without changing anything. So when is the referent field set to null? To find out, let's look again at the process_discovered_reflist method mentioned earlier:

size_t
ReferenceProcessor::process_discovered_reflist(
  DiscoveredList               refs_lists[],
  ReferencePolicy*             policy,
  bool                         clear_referent,
  BoolObjectClosure*           is_alive,
  OopClosure*                  keep_alive,
  VoidClosure*                 complete_gc,
  AbstractRefProcTaskExecutor* task_executor)
{
  ...

  // Phase 1: remove from refs_lists all soft references whose referent is dead
  // but must not be reclaimed yet (policy is non-NULL only when refs_lists
  // holds soft references)
  if (policy != NULL) {
    if (mt_processing) {
      RefProcPhase1Task phase1(*this, refs_lists, policy, true /*marks_oops_alive*/);
      task_executor->execute(phase1);
    } else {
      for (uint i = 0; i < _max_num_q; i++) {
        process_phase1(refs_lists[i], policy,
                       is_alive, keep_alive, complete_gc);
      }
    }
  } else { // policy == NULL
    assert(refs_lists != _discoveredSoftRefs,
           "Policy must be specified for soft references.");
  }

  // Phase 2:
  // Remove all references whose referent is still alive
  if (mt_processing) {
    RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() /*marks_oops_alive*/);
    task_executor->execute(phase2);
  } else {
    for (uint i = 0; i < _max_num_q; i++) {
      process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc);
    }
  }

  // Phase 3:
  // Decide, based on clear_referent, whether the dead referents are reclaimed
  if (mt_processing) {
    RefProcPhase3Task phase3(*this, refs_lists, clear_referent, true /*marks_oops_alive*/);
    task_executor->execute(phase3);
  } else {
    for (uint i = 0; i < _max_num_q; i++) {
      process_phase3(refs_lists[i], clear_referent,
                     is_alive, keep_alive, complete_gc);
    }
  }

  return total_list_count;
}

void
ReferenceProcessor::process_phase3(DiscoveredList&    refs_list,
                                   bool               clear_referent,
                                   BoolObjectClosure* is_alive,
                                   OopClosure*        keep_alive,
                                   VoidClosure*       complete_gc) {
  ResourceMark rm;
  DiscoveredListIterator iter(refs_list, keep_alive, is_alive);
  while (iter.has_next()) {
    iter.update_discovered();
    iter.load_ptrs(DEBUG_ONLY(false /* allow_null_referent */));
    if (clear_referent) {
      // NULL out referent pointer
      // Set the Reference's referent field to null; the object is then reclaimed by GC
      iter.clear_referent();
    } else {
      // keep the referent around
      // Mark the referent as alive; it will not be reclaimed in this GC
      iter.make_referent_alive();
    }
    ...
  }
  ...
}

For weak references and every other reference type, setting the referent field to null happens in process_phase3, and the behavior is controlled by clear_referent, whose value depends on the reference type.

ReferenceProcessorStats ReferenceProcessor::process_discovered_references(
  BoolObjectClosure*           is_alive,
  OopClosure*                  keep_alive,
  VoidClosure*                 complete_gc,
  AbstractRefProcTaskExecutor* task_executor,
  GCTimer*                     gc_timer) {
  NOT_PRODUCT(verify_ok_to_handle_reflists());
  ...
  // The third argument of process_discovered_reflist is clear_referent
  // Soft references
  size_t soft_count = 0;
  {
    GCTraceTime tt("SoftReference", trace_time, false, gc_timer);
    soft_count =
      process_discovered_reflist(_discoveredSoftRefs, _current_soft_ref_policy, true,
                                 is_alive, keep_alive, complete_gc, task_executor);
  }

  update_soft_ref_master_clock();

  // Weak references
  size_t weak_count = 0;
  {
    GCTraceTime tt("WeakReference", trace_time, false, gc_timer);
    weak_count =
      process_discovered_reflist(_discoveredWeakRefs, NULL, true,
                                 is_alive, keep_alive, complete_gc, task_executor);
  }

  // Final references
  size_t final_count = 0;
  {
    GCTraceTime tt("FinalReference", trace_time, false, gc_timer);
    final_count =
      process_discovered_reflist(_discoveredFinalRefs, NULL, false,
                                 is_alive, keep_alive, complete_gc, task_executor);
  }

  // Phantom references
  size_t phantom_count = 0;
  {
    GCTraceTime tt("PhantomReference", trace_time, false, gc_timer);
    phantom_count =
      process_discovered_reflist(_discoveredPhantomRefs, NULL, false,
                                 is_alive, keep_alive, complete_gc, task_executor);
  }
  ...
}

For soft and weak references, clear_referent is passed as true, which matches expectations: once the referent is unreachable, the referent field is set to null and the object is reclaimed. (For soft references, if memory is sufficient the reference is removed from refs_list in phase 1, so it is no longer in refs_list by phase 3.)

For final and phantom references, however, clear_referent is passed as false. This means that an object referenced by either of these types is not reclaimed as long as the Reference object itself is alive, unless something else intervenes. Final references are tied to whether the object overrides the finalize method, which is beyond the scope of this article. Let's look at phantom references next.
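The three phases and the effect of clear_referent can be summarized in a toy Java model. This is a simplified sketch of the logic described above, not a port of the HotSpot code: RefProcessingModel, Kind, Ref, and process are invented names, and each reference is reduced to a few booleans.

```java
import java.util.ArrayList;
import java.util.List;

public class RefProcessingModel {
    public enum Kind { SOFT, WEAK, FINAL, PHANTOM }

    /** A discovered Reference, reduced to the facts the three phases look at. */
    public static final class Ref {
        public final Kind kind;
        public final boolean referentAlive;    // referent is still strongly reachable
        public final boolean policyKeepsAlive; // soft only: policy says "do not clear yet"
        public boolean referentCleared;        // models the referent field being set to null

        public Ref(Kind kind, boolean referentAlive, boolean policyKeepsAlive) {
            this.kind = kind;
            this.referentAlive = referentAlive;
            this.policyKeepsAlive = policyKeepsAlive;
        }
    }

    /** Processes one discovered list; returns the references that would become pending. */
    public static List<Ref> process(List<Ref> discovered, Kind kind) {
        // As in process_discovered_references: true for soft and weak, false for final and phantom.
        boolean clearReferent = (kind == Kind.SOFT || kind == Kind.WEAK);
        List<Ref> pending = new ArrayList<>();
        for (Ref r : discovered) {
            // Phase 1 (soft only): a dead referent is rescued if the policy keeps it.
            if (kind == Kind.SOFT && !r.referentAlive && r.policyKeepsAlive) {
                continue;
            }
            // Phase 2: references whose referent is still alive are dropped from the list.
            if (r.referentAlive) {
                continue;
            }
            // Phase 3: clear the referent, or keep it alive, depending on clear_referent.
            r.referentCleared = clearReferent;
            pending.add(r);
        }
        return pending;
    }

    public static void main(String[] args) {
        List<Ref> weak = List.of(new Ref(Kind.WEAK, false, false));
        List<Ref> phantom = List.of(new Ref(Kind.PHANTOM, false, false));
        System.out.println(process(weak, Kind.WEAK).get(0).referentCleared);       // true
        System.out.println(process(phantom, Kind.PHANTOM).get(0).referentCleared); // false
    }
}
```

The model runs the phases per element rather than as three passes over the list, which gives the same result for this illustration.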

PhantomReference

public class PhantomReference<T> extends Reference<T> {

    public T get() {
        return null;
    }

    public PhantomReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }
}

As you can see, the get method of a phantom reference always returns null. Let's look at a demo.

// Requires java.lang.ref.PhantomReference, java.lang.ref.Reference
// and java.lang.ref.ReferenceQueue
public static void demo() throws InterruptedException {
    Object obj = new Object();
    ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
    PhantomReference<Object> phanRef = new PhantomReference<>(obj, refQueue);

    Object objg = phanRef.get();
    // Always null
    System.out.println(objg);
    // Make obj garbage
    obj = null;
    System.gc();
    Thread.sleep(3000);
    // After GC, phanRef is added to refQueue
    Reference<? extends Object> phanRefP = refQueue.remove();
    // Prints true
    System.out.println(phanRefP == phanRef);
}

As the code shows, a phantom reference receives a "notification" once its referent becomes unreachable (in fact, every class that extends Reference has this ability). Note, however, that after the GC completes, phanRef.referent still points to the Object created earlier, which means the Object has not been reclaimed.

The reason was given at the end of the previous section: for final and phantom references, clear_referent is passed as false, so the objects they refer to are not reclaimed by the GC unless something else clears the reference. (This is the behavior of the JDK 8 sources analyzed here; since JDK 9 the collector clears phantom references automatically, as it does for soft and weak references.)

For a phantom reference, once you have obtained the reference object from refQueue.remove(), you can call its clear method to break the link between the reference and its referent, so that the referent can be reclaimed by a later GC.
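A minimal sketch of that remove-then-clear step follows. PhantomDrain and drainAndClear are hypothetical names, and ref.enqueue() in main merely stands in for the GC enqueueing the reference so that the example does not depend on when a collection actually runs.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomDrain {
    /** Polls every reference currently in the queue and clears it; returns the count. */
    public static int drainAndClear(ReferenceQueue<?> queue) {
        int drained = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            // Break the link to the referent so a later GC can reclaim it.
            ref.clear();
            drained++;
        }
        return drained;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(obj, queue);
        ref.enqueue(); // stands in for the GC adding the reference to the queue
        System.out.println(drainAndClear(queue)); // 1
    }
}
```

poll is used instead of remove so that the loop returns once the queue is empty rather than blocking.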

Summary

After reading the analysis, we can answer the questions raised at the beginning of the article:

1. Soft references are usually described as being reclaimed only when memory is insufficient. How exactly is "insufficient memory" defined?

Soft references are indeed reclaimed when memory runs low. What counts as "low" depends on when get was last called on the reference and on how much heap is currently available; the formula was given above.

2. Phantom references are usually described as being as good as nonexistent: they do not affect the life cycle of an object and are mainly used to track when an object is collected by the garbage collector. Is that really so?

Strictly speaking, a phantom reference does affect its referent's life cycle. As long as the phantom reference itself is not reclaimed, the object it refers to will not be reclaimed either unless something is done. So if a PhantomReference object remains reachable after being taken from the ReferenceQueue (for example, because it is referenced by another object reachable from a GC root), you need to call clear to break the link between the PhantomReference and its referent.

3. Where are phantom references used in the JDK?

DirectByteBuffer uses Cleaner (Cleaner.java), a subclass of PhantomReference, to reclaim off-heap memory. A follow-up article will cover off-heap memory in detail.
