Source analysis based on Android 11(R)

Preface

In C++, releasing objects is the programmer's responsibility, whereas in Java it is the responsibility of the GC. If a Java object holds a native object through a pointer, when should the native object be released? The GC alone cannot handle this, because the virtual machine cannot tell whether a long field of the Java object is a pointer, or which native object it points to.

The earlier approach was to implement a finalize method in the Java class, which is called when the Java object is reclaimed. The native object can then be released inside finalize, so that Java resources and native resources are released together during GC. However, the finalize method has many defects and was eventually deprecated in JDK 9. Its replacement is the Cleaner class.

1. Problems to be solved

How can native objects and resources be released automatically when Java objects are reclaimed?

This is the problem that both finalize and Cleaner set out to solve. Application development at the pure Java level usually does not involve Java objects holding pointers to native objects, but for some complex classes it is indispensable. Bitmap, for example, often works this way so that most of its memory consumption lands in the native heap rather than the Java heap.

1.1 Disadvantages of Finalize

finalize is very convenient to use: override one method and release the resources inside it. Native resources can be released in two steps, which is why so many developers like it. But convenience sometimes comes at a price. The performance cost is one thing, and in some cases the resulting memory errors are even less tolerable. Hans Boehm, the head of the Android Runtime team, gave a talk about finalize at Google I/O 2017. If you are interested, the talk is available on YouTube. I will summarize the three disadvantages of finalize here.

  1. If two objects become unreachable at the same time, their finalize methods may run in any order. Therefore, if the finalize method of one object uses a native pointer held by another object, it may access a C++ object that has already been freed, resulting in native heap corruption.
  2. Under the Java language rules, an object's finalize method can be called while another of its methods is still executing. Therefore, other methods may hit use-after-free problems if they are accessing the native pointer the object holds.
  3. If the Java object is small and the native object it holds is large, System.gc() must be called explicitly to trigger GC early. Otherwise, relying only on the growth of the Java heap, it may take a very long time to reach the trigger threshold, while garbage native objects pile up.

Speaking of Hans Boehm, I have a thought I want to share. He was an undergraduate in 1974 (so he is probably around 65 years old) and holds a PhD from Cornell, yet he is still working on the front line of the project and has submitted a great deal of key code to ART. I once emailed him some questions. He was very kind and gave detailed answers, even to questions from a novice like me. By the restless mindset common in China, where engineers fear being laid off at 35, someone who at his age has not become a manager and still writes code on the front line would be considered a failure. But can you say that when you see his profile?

I am an ACM Fellow, and a past Chair of ACM SIGPLAN (2001-2003). Until late 2017 I chaired the ISO C++ Concurrency Study Group (WG21/SG1), where I continue to actively participate.

In technology, many great contributions take time to sink in. Of course, in business, technical depth is often overlooked because it does not pay off early on. But I believe that as capabilities grow, those who dig deep will always be rewarded, because the returns of a business are ultimately capped by technological innovation. Technological innovation requires pragmatism, and restless soil can only breed buzzwords and deception.

That was a bit of a digression. Having covered the disadvantages of finalize, let us turn to the advantages of Cleaner.

1.2 Advantages of Cleaner

/**
 * General-purpose phantom-reference-based cleaners.
 *
 * <p> Cleaners are a lightweight and more robust alternative to finalization.
 * They are lightweight because they are not created by the VM and thus do not
 * require a JNI upcall to be created, and because their cleanup code is
 * invoked directly by the reference-handler thread rather than by the
 * finalizer thread. They are more robust because they use phantom references,
 * the weakest type of reference object, thereby avoiding the nasty ordering
 * problems inherent to finalization.
 *
 * <p> A cleaner tracks a referent object and encapsulates a thunk of arbitrary
 * cleanup code. Some time after the GC detects that a cleaner's referent has
 * become phantom-reachable, the reference-handler thread will run the cleaner.
 * Cleaners may also be invoked directly; they are thread safe and ensure that
 * they run their thunks at most once.
 *
 * <p> Cleaners are not a replacement for finalization. They should be used
 * only when the cleanup code is extremely simple and straightforward.
 * Nontrivial cleaners are inadvisable since they risk blocking the
 * reference-handler thread and delaying further cleanup and finalization.
 *
 *
 * @author Mark Reinhold
 */

As the comments in the source code show, Cleaner is an alternative to finalization: it tracks the life cycle of an object and encapsulates arbitrary cleanup code. Some time after the GC detects that the tracked object has become phantom-reachable, the reference-handler thread runs the encapsulated cleanup code to release the resource.

Because Cleaner inherits from PhantomReference (phantom reference), it gives up many abilities compared with finalize, such as the ability to access the tracked object. Thanks to these limitations, it also avoids many of the defects of finalize. To put it bluntly, many defects of finalize come from its being too "capable".

  1. If the Java object is small and the native object it holds is large, System.gc() must be called explicitly to trigger GC early. Otherwise, relying only on the growth of the Java heap, it may take a very long time to reach the trigger threshold, while garbage native objects pile up.

This shortcoming of finalize, the third one listed in section 1.1, is still not solved by Cleaner. Triggering GC explicitly is flawed because developers do not know how often to do it: calling it too often degrades performance, and calling it too rarely means native resources are not released in time. Therefore, starting with Android N, Android introduced the NativeAllocationRegistry class. On the one hand it simplifies the use of Cleaner, and on the other hand it factors the size of native resources into the GC triggering policy, so that a GC that previously had to be triggered explicitly by the user can be triggered automatically. That topic will be covered in a separate article and is not discussed further here.
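As a minimal sketch of the Cleaner idea described in this section, the example below uses the public java.lang.ref.Cleaner API available since JDK 9. This is an assumption made for portability: it is not the internal sun.misc.Cleaner analyzed in this article, although it rests on the same phantom-reference principle. The NativeBuffer class and its simulated native handle are hypothetical; real code would call into JNI instead of touching a counter.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper around a "native" allocation, simulated with a counter.
final class NativeBuffer implements AutoCloseable {
    // Number of simulated native blocks that are currently allocated.
    static final AtomicInteger LIVE_NATIVE_BLOCKS = new AtomicInteger();

    // One shared Cleaner; it owns a daemon thread that runs cleanup actions.
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the NativeBuffer itself,
    // otherwise the buffer could never become phantom-reachable.
    private static final class FreeAction implements Runnable {
        private final long handle;

        FreeAction(long handle) {
            this.handle = handle;
        }

        @Override
        public void run() {
            // Stand-in for a JNI call that frees the block identified by handle.
            LIVE_NATIVE_BLOCKS.decrementAndGet();
        }
    }

    private final Cleaner.Cleanable cleanable;

    NativeBuffer() {
        long handle = LIVE_NATIVE_BLOCKS.incrementAndGet(); // stand-in for a native allocation
        this.cleanable = CLEANER.register(this, new FreeAction(handle));
    }

    // Explicit release; Cleanable.clean() runs the action at most once.
    @Override
    public void close() {
        cleanable.clean();
    }

    public static void main(String[] args) {
        NativeBuffer buffer = new NativeBuffer();
        System.out.println("live after alloc: " + LIVE_NATIVE_BLOCKS.get());
        buffer.close();
        buffer.close(); // the second call is a no-op
        System.out.println("live after close: " + LIVE_NATIVE_BLOCKS.get());
    }
}
```

If close() is never called, the same FreeAction is expected to run on the Cleaner's thread some time after the buffer becomes unreachable, but when that happens depends on the GC, which is exactly the timing problem discussed above.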

2. Design principle

2.1 When the Referent object is recycled

The referent is the object being referred to, that is, the object the Cleaner needs to track. Cleaner inherits from PhantomReference because it relies on the key property of phantom references: when the tracked object is reclaimed, the reference is added to a ReferenceQueue, after which the native resources can be released automatically. The following figure shows the process of adding a PhantomReference object to the ReferenceQueue.

While a referent is strongly referenced, it is reachable: it is marked by tracing from the GC roots during the mark phase, so it is not collected. It may be reclaimed only when no strong references point to it. But being allowed to be reclaimed and actually being reclaimed are two different things, which is why Java provides three kinds of weak reference types.

  1. SoftReference: soft reference. It has two features. First, the referent can be obtained through get(). Second, a referent that is referenced only by a SoftReference can survive until the heap is nearly exhausted, and is reclaimed only when an OOM would otherwise occur. It is often used to implement caches.
  2. WeakReference: weak reference. When the referent is referenced only by a WeakReference, it is reclaimed at the next GC. So the only difference from SoftReference is when the referent is reclaimed. It is often used when you need to use the referent but do not want your reference to affect its reclamation.
  3. PhantomReference: phantom reference. Unlike SoftReference and WeakReference, it cannot return the referent through get(). Its use case is therefore not to operate on the referent, but to trigger an action when the referent is reclaimed.
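The difference in what get() returns can be checked directly. The sketch below only covers that one aspect, while the referent is still strongly reachable; it does not try to demonstrate when each reference type is cleared, since that depends on GC timing. The class name ReferenceGetDemo is made up for illustration.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceGetDemo {
    // Returns {soft.get() == referent, weak.get() == referent, phantom.get() == null}.
    static boolean[] probe() {
        Object referent = new Object(); // the strong reference keeps it reachable
        SoftReference<Object> soft = new SoftReference<>(referent);
        WeakReference<Object> weak = new WeakReference<>(referent);
        // A PhantomReference can only be constructed together with a queue.
        PhantomReference<Object> phantom =
                new PhantomReference<>(referent, new ReferenceQueue<>());

        boolean phantomHidesReferent = phantom.get() == null;
        boolean softReturnsReferent = soft.get() == referent;
        boolean weakReturnsReferent = weak.get() == referent;
        return new boolean[] {softReturnsReferent, weakReturnsReferent, phantomHidesReferent};
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println("soft.get() returns the referent: " + r[0]); // true
        System.out.println("weak.get() returns the referent: " + r[1]); // true
        System.out.println("phantom.get() returns null:      " + r[2]); // true
    }
}
```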

2.2 How PhantomReference objects are enqueued and processed

Enqueueing a PhantomReference object actually involves multiple threads. Cleaner, as a special PhantomReference, has its own separate enqueueing rules. The two cases are introduced separately below.

2.2.1 How Cleaner objects are enqueued and processed

The ReferenceQueueDaemon thread treats Cleaner as a special case, so developers do not need to create a new thread to poll a ReferenceQueue. Note, however, that all Cleaners are processed on the ReferenceQueueDaemon thread, so the work done in Cleaner.clean must be fast to avoid blocking the cleanup of other Cleaners.

2.2.2 How other PhantomReference objects are enqueued and processed

An ordinary PhantomReference object is eventually added to the ReferenceQueue passed in at construction time. There are two ways to consume such a ReferenceQueue: call ReferenceQueue.poll for non-blocking polling, or call ReferenceQueue.remove for a blocking wait. Generally speaking, consuming a ReferenceQueue requires the developer to start a new thread, so handling too many ReferenceQueues at the same time wastes thread resources.
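The two ways of consuming a ReferenceQueue can be sketched as follows. To keep the example deterministic it calls Reference.enqueue() by hand instead of waiting for a real GC, which is an assumption made purely for demonstration; in production the GC does the enqueueing. The class and thread names are hypothetical.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

final class QueuePollingDemo {
    // Returns true if a dedicated worker thread received the reference through
    // a blocking remove() and a later non-blocking poll() found the queue empty.
    static boolean handOffToWorker() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);

        boolean[] seen = {false};
        Thread worker = new Thread(() -> {
            try {
                // Blocking wait, with a timeout so the demo cannot hang forever.
                Reference<?> r = queue.remove(5_000);
                seen[0] = (r == ref);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reference-queue-worker");
        worker.start();

        // Enqueue by hand; normally the GC does this once the referent is unreachable.
        ref.enqueue();

        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        // Non-blocking check: the only reference was already consumed, so poll() returns null.
        return seen[0] && queue.poll() == null;
    }

    public static void main(String[] args) {
        System.out.println("worker saw reference: " + handOffToWorker());
    }
}
```

The dedicated worker thread is the cost mentioned above: every independently consumed queue tends to need one.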

3. Source code analysis

This article analyzes the source code based on the Android 11(R) version, focusing on the ART virtual machine’s special processing of PhantomReference objects, which will involve some knowledge of GC.

3.1 Relationship between GC running and PhantomReference

For the Concurrent Copying collector, GC can be roughly divided into a Mark stage and a Copy stage. After marking finishes, all marked objects are placed on the mark stack for subsequent processing.

3.1.1 Mark

art/runtime/gc/collector/concurrent_copying.cc

inline void ConcurrentCopying::ProcessMarkStackRef(mirror::Object* to_ref) {
  ...
  if (perform_scan) {
    if (use_generational_cc_ && young_gen_) {
      Scan<true>(to_ref);
    } else {
      Scan<false>(to_ref);
    }
  }

After Mark is complete, the collector walks through all the objects on the mark stack and performs Scan on each of them. In Scan, DelayReferenceReferent is executed for each Reference object. If the referent that the Reference points to is not marked, the Reference object is added to the corresponding native queue.

art/runtime/gc/reference_processor.cc

// Process the "referent" field in a java.lang.ref.Reference. If the referent has not yet been
// marked, put it on the appropriate list in the heap for later processing.
void ReferenceProcessor::DelayReferenceReferent(ObjPtr<mirror::Class> klass,
...
  if (!collector->IsNullOrMarkedHeapReference(referent, /*do_atomic_update=*/true)) {  // <==== if the referent is not marked, it is going to be reclaimed
    ...
    if (klass->IsSoftReferenceClass()) {
      soft_reference_queue_.AtomicEnqueueIfNotEnqueued(self, ref);
    } else if (klass->IsWeakReferenceClass()) {
      weak_reference_queue_.AtomicEnqueueIfNotEnqueued(self, ref);
    } else if (klass->IsFinalizerReferenceClass()) {
      finalizer_reference_queue_.AtomicEnqueueIfNotEnqueued(self, ref);
    } else if (klass->IsPhantomReferenceClass()) {  // <==== add it to the native phantom_reference_queue_
      phantom_reference_queue_.AtomicEnqueueIfNotEnqueued(self, ref);
    } else {
      LOG(FATAL) << "Invalid reference type " << klass->PrettyClass() << " " << std::hex
                 << klass->GetAccessFlags();
    }
  }
}

What happens when PhantomReference is added to phantom_reference_queue_?

3.1.2 Copy stage

art/runtime/gc/collector/concurrent_copying.cc

void ConcurrentCopying::CopyingPhase() {
  ...
    ProcessReferences(self);

During the Copy phase of the GC, the Collector executes the ProcessReferences function.

art/runtime/gc/reference_processor.cc

void ReferenceProcessor::ProcessReferences(bool concurrent,
...
  // Clear all phantom references with white referents.
  phantom_reference_queue_.ClearWhiteReferences(&cleared_references_, collector);

The ProcessReferences function moves the References in phantom_reference_queue_ to cleared_references_. phantom_reference_queue_ contains only PhantomReference objects, while cleared_references_ also contains SoftReference and WeakReference objects.

3.1.3 Post-GC stage

After the GC finishes, CollectClearedReferences is called to generate a task that handles cleared_references_, and then Run is called to execute that task.

art/runtime/gc/heap.cc

  collector->Run(gc_cause, clear_soft_references || runtime->IsZygote());  // <==== where GC is actually performed
  IncrementFreedEver();
  RequestTrim(self);
  // Collect cleared references.
  SelfDeletingTask* clear = reference_processor_->CollectClearedReferences(self);  // <==== generates a task to handle cleared_references_
  // Grow the heap so that we know when to perform the next GC.
  GrowForUtilization(collector, bytes_allocated_before_gc);
  LogGC(gc_cause, collector);
  FinishGC(self, gc_type);  // <==== the end of this round of GC
  // Actually enqueue all cleared references. Do this after the GC has officially finished since
  // otherwise we can deadlock.
  clear->Run(self);  // <==== runs the task just generated to handle cleared_references_

art/runtime/gc/reference_processor.cc

  void Run(Thread* thread) override {
    ScopedObjectAccess soa(thread);
    jvalue args[1];
    args[0].l = cleared_references_;
    InvokeWithJValues(soa, nullptr, WellKnownClasses::java_lang_ref_ReferenceQueue_add, args);  // <==== calls the Java method
    soa.Env()->DeleteGlobalRef(cleared_references_);
  }

Inside Run, cleared_references_ is passed as the argument to the Java method java.lang.ref.ReferenceQueue.add. This brings us from the native world into the Java world.

libcore/ojluni/src/main/java/java/lang/ref/ReferenceQueue.java

    static void add(Reference<?> list) {
        synchronized (ReferenceQueue.class) {
            if (unenqueued == null) {
                unenqueued = list;
            } else {
                // Find the last element in unenqueued.
                Reference<?> last = unenqueued;
                while (last.pendingNext != unenqueued) {
                  last = last.pendingNext;
                }
                // Add our list to the end. Update the pendingNext to point back to enqueued.
                last.pendingNext = list;
                last = list;
                while (last.pendingNext != list) {
                    last = last.pendingNext;
                }
                last.pendingNext = unenqueued;
            }
            ReferenceQueue.class.notifyAll();  // <==== once all elements of cleared_references_ have been added to Java's global list, wake up the ReferenceQueueDaemon thread
        }
    }
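The add method above splices one circular singly linked list onto the end of another. The sketch below isolates that splice so the pointer updates are easier to follow. Node, circular, splice and names are hypothetical names for illustration only; Node models nothing but the pendingNext link of a Reference.

```java
import java.util.ArrayList;
import java.util.List;

final class PendingListDemo {
    // Hypothetical stand-in for Reference: only the pendingNext link is modeled.
    static final class Node {
        final String name;
        Node pendingNext;

        Node(String name) {
            this.name = name;
        }
    }

    // Builds a circular list from at least one name and returns its first node.
    static Node circular(String... names) {
        Node head = null;
        Node tail = null;
        for (String n : names) {
            Node node = new Node(n);
            if (head == null) {
                head = node;
            } else {
                tail.pendingNext = node;
            }
            tail = node;
        }
        tail.pendingNext = head; // close the ring
        return head;
    }

    // Same pointer updates as ReferenceQueue.add: append the ring "list" to the ring "unenqueued".
    static Node splice(Node unenqueued, Node list) {
        if (unenqueued == null) {
            return list;
        }
        Node last = unenqueued;
        while (last.pendingNext != unenqueued) {
            last = last.pendingNext; // find the tail of the first ring
        }
        last.pendingNext = list;
        last = list;
        while (last.pendingNext != list) {
            last = last.pendingNext; // find the tail of the second ring
        }
        last.pendingNext = unenqueued; // close the merged ring
        return unenqueued;
    }

    // Walks the ring once, starting at "start", and collects the names.
    static List<String> names(Node start) {
        List<String> out = new ArrayList<>();
        Node n = start;
        do {
            out.add(n.name);
            n = n.pendingNext;
        } while (n != start);
        return out;
    }

    public static void main(String[] args) {
        Node merged = splice(circular("a", "b"), circular("c", "d", "e"));
        System.out.println(names(merged)); // [a, b, c, d, e]
    }
}
```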

3.2 The transfer station: the ReferenceQueueDaemon thread

The ReferenceQueueDaemon thread stays blocked in wait while there is no work for it.

libcore/libart/src/main/java/java/lang/Daemons.java

        @Override public void runInternal() {
            while (isRunning()) {
                Reference<?> list;
                try {
                    synchronized (ReferenceQueue.class) {
                        while (ReferenceQueue.unenqueued == null) {
                            ReferenceQueue.class.wait();  // <==== suspends the thread by calling wait
                        }
                        list = ReferenceQueue.unenqueued;
                        ReferenceQueue.unenqueued = null;
                    }
                } catch (InterruptedException e) {
                    continue;
                } catch (OutOfMemoryError e) {
                    continue;
                }
                ReferenceQueue.enqueuePending(list);
            }
        }
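The hand-off between ReferenceQueue.add and the daemon loop is a classic guarded wait: the producer publishes work under a lock and calls notifyAll, and the consumer waits in a loop until work is present. A minimal sketch of that pattern follows; HandOffDemo, its LOCK object and the String payload are hypothetical simplifications (the real code synchronizes on ReferenceQueue.class and hands over a Reference list).

```java
final class HandOffDemo {
    private static final Object LOCK = new Object();
    private static String pending; // stand-in for ReferenceQueue.unenqueued

    // Consumer side: block until work is published, then take ownership of it.
    static String awaitWork() throws InterruptedException {
        synchronized (LOCK) {
            while (pending == null) { // loop guards against spurious wakeups
                LOCK.wait();
            }
            String work = pending;
            pending = null;
            return work;
        }
    }

    // Producer side: publish work and wake up any waiting consumer.
    static void publish(String work) {
        synchronized (LOCK) {
            pending = work;
            LOCK.notifyAll();
        }
    }

    // Starts a daemon consumer, publishes one item, and returns what the consumer received.
    static String roundTrip(String work) {
        String[] received = new String[1];
        Thread daemon = new Thread(() -> {
            try {
                received[0] = awaitWork();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "handoff-daemon");
        daemon.setDaemon(true);
        daemon.start();

        publish(work);

        try {
            daemon.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println("daemon received: " + roundTrip("batch-1"));
    }
}
```

Because the consumer checks the condition before waiting, it does not matter whether publish runs before or after the daemon reaches wait.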

When new work arrives, the ReferenceQueueDaemon thread wakes up from ReferenceQueue.class.wait(). For the elements in the global list, Cleaner objects and other PhantomReference objects are handled differently, as described separately below.

3.2.1 How to deal with Cleaner objects

The elements of the global list are distributed by calling enqueuePending. Each Reference object is constructed with a ReferenceQueue as a parameter, and that is the queue the Reference object ends up in after distribution.

libcore/ojluni/src/main/java/java/lang/ref/ReferenceQueue.java

    public static void enqueuePending(Reference<?> list) {
        Reference<?> start = list;
        do {
            ReferenceQueue queue = list.queue;  // <==== retrieves the ReferenceQueue passed in when each Reference object was constructed
            if (queue == null) {
                Reference<?> next = list.pendingNext;

                // Make pendingNext a self-loop to preserve the invariant that
                // once enqueued, pendingNext is non-null -- without leaking
                // the object pendingNext was previously pointing to.
                list.pendingNext = list;
                list = next;
            } else {
                // To improve performance, we try to avoid repeated
                // synchronization on the same queue by batching enqueue of
                // consecutive references in the list that have the same
                // queue.
                synchronized (queue.lock) {
                    do {
                        Reference<?> next = list.pendingNext;

                        // Make pendingNext a self-loop to preserve the
                        // invariant that once enqueued, pendingNext is
                        // non-null -- without leaking the object pendingNext
                        // was previously pointing to.
                        list.pendingNext = list;
                        queue.enqueueLocked(list);  // <==== removes the Reference object from the global list and adds it to the ReferenceQueue it belongs to
                        list = next;
                    } while (list != start && list.queue == queue);
                    queue.lock.notifyAll();
                }
            }
        } while (list != start);
    }

As for the Cleaner object, it is not actually added to the ReferenceQueue passed in when it is constructed, but is handled directly in enqueueLocked.

libcore/ojluni/src/main/java/java/lang/ref/ReferenceQueue.java

66     private boolean enqueueLocked(Reference<? extends T> r) {
67         // Verify the reference has not already been enqueued.
68         if (r.queueNext != null) {
69             return false;
70         }
71 
72         if (r instanceof Cleaner) {
73             // If this reference is a Cleaner, then simply invoke the clean method instead
74             // of enqueueing it in the queue. Cleaners are associated with dummy queues that
75             // are never polled and objects are never enqueued on them.
76             Cleaner cl = (sun.misc.Cleaner) r;
77             cl.clean();  // <==== native resource release is done by calling cl.clean()
78 
79             // Update queueNext to indicate that the reference has been
80             // enqueued, but is now removed from the queue.
81             r.queueNext = sQueueNextUnenqueued;
82             return true;
83         }
84 
85         if (tail == null) {
86             head = r;
87         } else {
88             tail.queueNext = r;
89         }
90         tail = r;
91         tail.queueNext = r;
92         return true;
93     }
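The branch in enqueueLocked boils down to "run cleaners inline, queue everything else". The sketch below models only that decision with hypothetical stand-in classes (FakeReference, FakeCleaner); it does not use the real java.lang.ref types, and it passes null where the real Cleaner passes a dummy queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class EnqueueDispatchDemo {
    // Hypothetical stand-in for java.lang.ref.Reference.
    static class FakeReference {
        final String name;
        final Queue<FakeReference> queue; // queue chosen at construction time

        FakeReference(String name, Queue<FakeReference> queue) {
            this.name = name;
            this.queue = queue;
        }
    }

    // Hypothetical stand-in for sun.misc.Cleaner: carries a thunk and is never queued.
    static final class FakeCleaner extends FakeReference {
        final Runnable thunk;

        FakeCleaner(String name, Runnable thunk) {
            super(name, null); // the real Cleaner passes a dummy queue instead of null
            this.thunk = thunk;
        }
    }

    // Mirrors the decision in enqueueLocked: cleaners run on the calling
    // (dispatcher) thread, other references go to their own queue.
    static void dispatch(FakeReference r) {
        if (r instanceof FakeCleaner) {
            ((FakeCleaner) r).thunk.run();
        } else {
            r.queue.add(r);
        }
    }

    public static void main(String[] args) {
        Queue<FakeReference> userQueue = new ArrayDeque<>();
        List<String> cleaned = new ArrayList<>();

        dispatch(new FakeCleaner("cleaner", () -> cleaned.add("cleaner")));
        dispatch(new FakeReference("phantom", userQueue));

        System.out.println("cleaned inline: " + cleaned);                // [cleaner]
        System.out.println("left for user thread: " + userQueue.size()); // 1
    }
}
```

Running the thunk on the dispatcher thread is what makes a slow clean method block every other reference behind it.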

3.2.2 How to Handle Other PhantomReference Objects

As you can see from lines 85 to 92 of the code above, other PhantomReference objects are eventually added to their corresponding ReferenceQueue, forming a linked list. Once they are added, the corresponding processing thread is woken up by calling queue.lock.notifyAll.

libcore/ojluni/src/main/java/java/lang/ref/ReferenceQueue.java

    public static void enqueuePending(Reference<?> list) {
        ...
                synchronized (queue.lock) {
                    do {
                        ...
                        queue.enqueueLocked(list);  // <==== removes the Reference object from the global list and adds it to the ReferenceQueue it belongs to
                        list = next;
                    } while (list != start && list.queue == queue);
                    queue.lock.notifyAll();
        ...
    }

[Comparison of Cleaner and other PhantomReference objects]

Item                                                           Cleaner   Other PhantomReference
Added to the ReferenceQueue passed in at construction time     ✘         ✔
Final processing happens on the ReferenceQueueDaemon thread    ✔         ✘
Final processing happens on a custom thread                    ✘         ✔

4. Actual cases

Inside NativeAllocationRegistry, Cleaner is used to reclaim native resources. It passes two arguments to Cleaner.create: the Java object to track, and a CleanerThunk that specifies how the release is performed.

libcore/luni/src/main/java/libcore/util/NativeAllocationRegistry.java

        try {
            thunk = new CleanerThunk();
            Cleaner cleaner = Cleaner.create(referent, thunk);
            ...
        thunk.setNativePtr(nativePtr);

Cleaner inherits from PhantomReference, whose constructor takes two parameters. Line 115 shows that the ReferenceQueue finally passed in is dummyQueue. "Dummy" means fake or placeholder, indicating that this dummyQueue plays no actual role. This is consistent with our analysis in 3.2.1 above.

libcore/ojluni/src/main/java/sun/misc/Cleaner.java

114     private Cleaner(Object referent, Runnable thunk) {
115         super(referent, dummyQueue);  // <==== PhantomReference's constructor requires a ReferenceQueue parameter
116         this.thunk = thunk;
117     }

The nativePtr field inside CleanerThunk records the pointer to the native object, and freeFunction is an instance field of the outer class NativeAllocationRegistry that records the function pointer to the native-layer release function. With these two pointers, the native resource can be reclaimed.

libcore/luni/src/main/java/libcore/util/NativeAllocationRegistry.java

    private class CleanerThunk implements Runnable {
        private long nativePtr;

        public CleanerThunk() {
            this.nativePtr = 0;
        }

        public void run() {
            if (nativePtr != 0) {
                applyFreeFunction(freeFunction, nativePtr);  // <==== applyFreeFunction eventually calls freeFunction, passing nativePtr as its argument
                registerNativeFree(size);
            }
        }

        public void setNativePtr(long nativePtr) {
            this.nativePtr = nativePtr;  // <==== nativePtr is the pointer to the native object
        }
    }
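The "thunk plus free function" pattern can be approximated outside Android. The sketch below is an assumption-laden imitation, not the libcore implementation: it uses the public java.lang.ref.Cleaner (JDK 9+) instead of sun.misc.Cleaner, models the native free function as a LongConsumer, and omits the size accounting done by registerNativeFree. MiniRegistry, Thunk and register are hypothetical names.

```java
import java.lang.ref.Cleaner;
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

// Hypothetical, simplified imitation of the registry pattern described above.
final class MiniRegistry {
    private static final Cleaner CLEANER = Cleaner.create();

    // Stand-in for the native free function pointer.
    private final LongConsumer freeFunction;

    MiniRegistry(LongConsumer freeFunction) {
        this.freeFunction = freeFunction;
    }

    // Inner class: it sees the registry's freeFunction but never the referent.
    private final class Thunk implements Runnable {
        private volatile long nativePtr; // 0 means "nothing to free"

        void setNativePtr(long nativePtr) {
            this.nativePtr = nativePtr;
        }

        @Override
        public void run() {
            if (nativePtr != 0) {
                freeFunction.accept(nativePtr);
            }
        }
    }

    // Ties the lifetime of nativePtr to referent and returns a handle for explicit release.
    Runnable register(Object referent, long nativePtr) {
        Thunk thunk = new Thunk();
        Cleaner.Cleanable cleanable = CLEANER.register(referent, thunk);
        // The pointer is set only after registration succeeded, as in the excerpt above.
        thunk.setNativePtr(nativePtr);
        return cleanable::clean;
    }

    public static void main(String[] args) {
        List<Long> freed = new ArrayList<>();
        MiniRegistry registry = new MiniRegistry(ptr -> freed.add(ptr));

        Object owner = new Object();
        Runnable release = registry.register(owner, 42L);
        release.run();
        release.run(); // the cleanup action runs at most once
        System.out.println("freed pointers: " + freed); // [42]
    }
}
```

Keeping the thunk free of any reference to the tracked object matters: if the thunk held the referent, the referent could never become phantom-reachable and the cleanup would never run.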

When ReferenceQueueDaemon processes a Cleaner object, its clean method is called. As you can see, thunk.run is called on line 143, which finally enters the resource release function in the native world.

libcore/ojluni/src/main/java/sun/misc/Cleaner.java

139     public void clean() {
140         if (!remove(this))
141             return;
142         try {
143             thunk.run();  // <==== internally calls the resource release function
144         } catch (final Throwable x) {
145             AccessController.doPrivileged(new PrivilegedAction<Void>() {
146                     public Void run() {
147                         if (System.err != null)
148                             new Error("Cleaner terminated abnormally", x)
149                                 .printStackTrace();
150                         System.exit(1);
151                         return null;
152                     }});
153         }
154     }