Source analysis based on Android 11(R)
Preface
In C++, releasing objects is the programmer's responsibility, whereas in Java it is the responsibility of the GC. If a Java object holds a native object through a pointer, when should the native object be released? The GC alone cannot do this, because the virtual machine cannot tell whether a long field of a Java object is a pointer, or which native object it points to.
The earlier approach was to implement a finalize method in the Java class, which is called when the Java object is reclaimed. Native objects can then be released inside finalize, so that Java resources and native resources are released together during GC. However, finalize has many defects and was eventually deprecated in JDK 9. Its replacement is the Cleaner class.
1. Problems to be solved
How to automatically release native objects and resources when Java objects are reclaimed?
This is the problem that finalize and Cleaner are meant to solve. Application development at the pure Java level usually does not involve Java objects holding pointers to native objects, but for some complex classes it is indispensable. Bitmap, for example, uses this approach to keep most of its memory consumption in the native heap rather than the Java heap.
1.1 Disadvantages of Finalize
finalize is very convenient to use: override one method and release the resources inside it. Native resources can be released in two steps, which is why many developers like it. But convenience sometimes comes at a price. The performance cost is one thing; in some cases the resulting memory errors are even less tolerable. Hans Boehm, the head of the Android Runtime team, gave a talk about finalize at Google I/O 2017. If you are interested, the talk is available on YouTube. I will summarize the three disadvantages here.
- If two objects become unreachable at the same time, their finalize methods may run in any order. Therefore, if the finalize method of one object uses a native pointer held by another object, it may access a C++ object that has already been freed, resulting in native heap corruption.
- According to the rules of the Java language, an object's finalize method can be called while its other methods are still executing. Those other methods may therefore hit use-after-free problems if they are still accessing the native pointer the object holds.
- If the Java object is small and the native object it holds is large, System.gc() has to be called explicitly to trigger GC early. Otherwise, relying only on the growth of the Java heap, it may take a very long time to reach the trigger threshold, while garbage native objects pile up.
Speaking of Hans Boehm, I would like to share a personal impression. He was an undergraduate in 1974 (so he is probably around 65 now) and earned his doctorate at Cornell, yet he still works on the front line of the project and has submitted much of the key code in ART. I once emailed him some questions. He was very kind and gave detailed answers even to beginner questions like mine. By the restless standards common in China, where engineers worry about being pushed out at 35, someone who at his age has not become a manager and still writes code on the front line would be considered a failure. But can you still say that after reading his profile?
I am an ACM Fellow, and a past Chair of ACM SIGPLAN (2001-2003). Until late 2017 I chaired the ISO C++ Concurrency Study Group (WG21/SG1), where I continue to actively participate.
In technology, many great contributions take time to be recognized. In business, of course, technical depth is often overlooked because it does not pay off early on. But I believe that in the long run those who dig deep will be rewarded, because the returns of a business are ultimately capped by technological innovation. Technological innovation requires pragmatism; restless soil breeds only buzzwords and deception.
That was a bit of a digression. Having covered the disadvantages of finalize, let us turn to the advantages of Cleaner.
1.2 Advantages of Cleaner
/**
 * General-purpose phantom-reference-based cleaners.
 *
 * <p> Cleaners are a lightweight and more robust alternative to finalization.
 * They are lightweight because they are not created by the VM and thus do not
 * require a JNI upcall to be created, and because their cleanup code is
 * invoked directly by the reference-handler thread rather than by the
 * finalizer thread. They are more robust because they use phantom references,
 * the weakest type of reference object, thereby avoiding the nasty ordering
 * problems inherent to finalization.
 *
 * <p> A cleaner tracks a referent object and encapsulates a thunk of arbitrary
 * cleanup code. Some time after the GC detects that a cleaner's referent has
 * become phantom-reachable, the reference-handler thread will run the cleaner.
 * Cleaners may also be invoked directly; they are thread safe and ensure that
 * they run their thunks at most once.
 *
 * <p> Cleaners are not a replacement for finalization. They should be used
 * only when the cleanup code is extremely simple and straightforward.
 * Nontrivial cleaners are inadvisable since they risk blocking the
 * reference-handler thread and delaying further cleanup and finalization.
 *
 *
 * @author Mark Reinhold
 */
As the comments in the source code show, Cleaner is an alternative to finalization: it tracks the life cycle of an object and encapsulates arbitrary cleanup code. After the GC reclaims the object, the reference-handler thread runs the encapsulated cleanup code to release the resource.
Because Cleaner inherits from PhantomReference (phantom reference), it has fewer abilities than finalize, such as the ability to access the tracked object. These limitations are exactly what let it avoid many of the defects of finalize. To put it bluntly, many defects of finalize stem from its being too capable.
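The behaviour described above can be tried out on a desktop JVM. The following is a minimal sketch, assuming JDK 9 or later and using the public java.lang.ref.Cleaner as a stand-in for the internal sun.misc.Cleaner that this article analyzes. NativeBuffer, Release, and the two counters are hypothetical names, and the "native heap" is only simulated by a counter.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper around a "native" allocation; the native heap is simulated by counters.
final class NativeBuffer implements AutoCloseable {
    // Number of simulated native allocations that are currently live.
    static final AtomicInteger LIVE = new AtomicInteger();
    // Number of times the cleanup code has actually run.
    static final AtomicInteger FREED = new AtomicInteger();

    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup state is a static nested class: it must not capture the NativeBuffer
    // itself, otherwise the referent could never become phantom-reachable.
    private static final class Release implements Runnable {
        private final long handle;

        Release(long handle) {
            this.handle = handle;
        }

        @Override
        public void run() {
            // A real implementation would call into native code to free `handle` here.
            LIVE.decrementAndGet();
            FREED.incrementAndGet();
        }
    }

    private final Cleaner.Cleanable cleanable;

    NativeBuffer(long handle) {
        LIVE.incrementAndGet();
        this.cleanable = CLEANER.register(this, new Release(handle));
    }

    // Explicit release. The same action would run after GC if close() were never called.
    @Override
    public void close() {
        cleanable.clean();
    }
}

class NativeBufferDemo {
    public static void main(String[] args) {
        NativeBuffer buffer = new NativeBuffer(0x1234L);
        buffer.close();
        buffer.close(); // the cleanup action runs at most once, so this is a no-op
        System.out.println("live=" + NativeBuffer.LIVE.get() + " freed=" + NativeBuffer.FREED.get());
    }
}
```

The cleanup code only sees the handle it was given, never the tracked object, which mirrors the restriction that the article attributes to PhantomReference.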
- If the Java object is small and the native object it holds is large, System.gc() has to be called explicitly to trigger GC early. Otherwise, relying only on the growth of the Java heap, it may take a very long time to reach the trigger threshold, while garbage native objects pile up.
The disadvantage of finalize quoted above is still not solved by Cleaner. Triggering GC actively is flawed because developers do not know how often to do it: too often degrades performance, and too rarely means native resources are not released in time. Therefore, starting with Android N, Android introduced the NativeAllocationRegistry class. On the one hand it simplifies the use of Cleaner; on the other hand it factors the size of native resources into the GC triggering strategy, so that a GC which previously had to be triggered by the user can now be triggered automatically. That topic will be covered in a separate article, so I will not expand on it here.
2. Design principle
2.1 When the Referent object is recycled
The referent is the object being referenced, that is, the object the Cleaner needs to track. The Cleaner class inherits from PhantomReference because it relies on a property of phantom references: when the tracked object is reclaimed, the reference is added to a ReferenceQueue, after which the native resources can be reclaimed automatically. The following figure shows the process of adding a PhantomReference object to the ReferenceQueue.
While a referent is strongly referenced, it is reachable and gets marked from the GC roots during GC, so it is not collected. It may be reclaimed only when no strong references point to it. But being allowed to be reclaimed and actually being reclaimed are two different things, which is why Java provides three kinds of references weaker than a strong reference.
- SoftReference (soft reference) has two features. First, the referent can be obtained through get(). Second, a referent that is reachable only through soft references can survive until the heap is about to be exhausted, and is reclaimed before an OutOfMemoryError is thrown. It is often used to implement caches.
- WeakReference (weak reference). When the referent is reachable only through a WeakReference, it is reclaimed in the next GC. So the only difference from SoftReference is when the referent is reclaimed. It is often used when you need to use the referent but do not want your own reference to affect its collection.
- PhantomReference (phantom reference). Unlike SoftReference and WeakReference, it cannot obtain the referent through get(). Its use case is therefore not to operate on the referent, but to trigger an event when the referent is reclaimed.
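The difference in what get() returns for the three reference types listed above can be observed directly. This is a small sketch using only the standard java.lang.ref classes; the class name ReferenceKindsDemo and the helper observe() are made up for illustration. The referent is kept strongly reachable throughout, so the result does not depend on GC timing.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Returns {softSeesReferent, weakSeesReferent, phantomHidesReferent, queueIsEmpty}.
    static boolean[] observe() {
        Object referent = new Object(); // the strong reference keeps the referent alive
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        SoftReference<Object> soft = new SoftReference<>(referent);
        WeakReference<Object> weak = new WeakReference<>(referent);
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);

        // Nothing has been reclaimed yet, so nothing has been enqueued.
        boolean queueIsEmpty = queue.poll() == null;
        // Soft and weak references hand out the referent while it is alive.
        boolean softSees = soft.get() == referent;
        boolean weakSees = weak.get() == referent;
        // A phantom reference never hands out its referent, alive or not.
        boolean phantomHides = phantom.get() == null;

        return new boolean[] {softSees, weakSees, phantomHides, queueIsEmpty};
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println("soft.get() returns referent:    " + r[0]);
        System.out.println("weak.get() returns referent:    " + r[1]);
        System.out.println("phantom.get() returns null:     " + r[2]);
        System.out.println("queue empty while referent live: " + r[3]);
    }
}
```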
2.2 How PhantomReference objects are enqueued and processed
Enqueueing a PhantomReference object actually involves several threads. Cleaner, being a special PhantomReference, has its own separate enqueueing rules. The two cases are described separately below.
2.2.1 Enqueueing and processing of Cleaner objects
Cleaner objects are treated specially by the ReferenceQueueDaemon thread, so developers do not need to create a new thread to poll a ReferenceQueue. Note, however, that all Cleaner objects are processed on the ReferenceQueueDaemon thread, so the work done in Cleaner.clean must be fast to avoid blocking the cleanup of other Cleaner objects.
2.2.2 Enqueueing and processing of ordinary PhantomReference objects
An ordinary PhantomReference object is eventually added to the ReferenceQueue passed in at construction time. There are two ways to process such a ReferenceQueue: call ReferenceQueue.poll for non-blocking polling, or call ReferenceQueue.remove to block and wait. Generally speaking, processing a ReferenceQueue requires the developer to start a new thread, so processing too many ReferenceQueue instances at the same time wastes thread resources.
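The two ways of consuming a ReferenceQueue mentioned above can be sketched as follows. This is only an illustration: QueuePollingDemo and run() are hypothetical names, and the reference is enqueued by hand through Reference.enqueue() so that the example does not depend on when the GC runs. In real use the GC performs the enqueueing.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.atomic.AtomicReference;

class QueuePollingDemo {
    // Returns true if poll() saw an empty queue first and a thread blocked in
    // remove() later received the enqueued reference.
    static boolean run() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);

        // Non-blocking style: poll() returns null immediately when the queue is empty.
        boolean emptyAtFirst = queue.poll() == null;

        // Blocking style: a dedicated thread parks in remove() until something arrives.
        AtomicReference<Reference<?>> received = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            try {
                received.set(queue.remove());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "queue-worker");
        worker.setDaemon(true);
        worker.start();

        // Stand-in for the GC: enqueue the reference explicitly.
        boolean enqueued = ref.enqueue();
        // Keep the referent reachable up to here so the GC cannot enqueue it first.
        Reference.reachabilityFence(referent);

        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return emptyAtFirst && enqueued && received.get() == ref;
    }

    public static void main(String[] args) {
        System.out.println("worker received the reference: " + run());
    }
}
```

Each queue handled this way costs one thread, which is the overhead the paragraph above warns about.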
3. Source code analysis
This article analyzes the source code based on the Android 11(R) version, focusing on the ART virtual machine’s special processing of PhantomReference objects, which will involve some knowledge of GC.
3.1 Relationship between GC running and PhantomReference
For a Concurrent Copying Collector, GC can be roughly divided into Mark and Copy stages. After the Mark is finished, all marked objects are placed in the Mark Stack for subsequent processing.
3.1.1 Mark
art/runtime/gc/collector/concurrent_copying.cc
inline void ConcurrentCopying::ProcessMarkStackRef(mirror::Object* to_ref) {
  ...
  if (perform_scan) {
    if (use_generational_cc_ && young_gen_) {
      Scan<true>(to_ref);
    } else {
      Scan<false>(to_ref);
    }
  }
After marking is complete, the collector walks through all the objects in the mark stack and performs Scan on each of them. During Scan, DelayReferenceReferent is called for each Reference object. If the referent that the Reference points to has not been marked, the Reference object is added to the corresponding native queue.
art/runtime/gc/reference_processor.cc
// Process the "referent" field in a java.lang.ref.Reference. If the referent has not yet been
// marked, put it on the appropriate list in the heap for later processing.
void ReferenceProcessor::DelayReferenceReferent(ObjPtr<mirror::Class> klass,
...
  if (!collector->IsNullOrMarkedHeapReference(referent, /*do_atomic_update=*/true)) {  // <== the referent is not marked, so it is about to be reclaimed
    ...
    if (klass->IsSoftReferenceClass()) {
      soft_reference_queue_.AtomicEnqueueIfNotEnqueued(self, ref);
    } else if (klass->IsWeakReferenceClass()) {
      weak_reference_queue_.AtomicEnqueueIfNotEnqueued(self, ref);
    } else if (klass->IsFinalizerReferenceClass()) {
      finalizer_reference_queue_.AtomicEnqueueIfNotEnqueued(self, ref);
    } else if (klass->IsPhantomReferenceClass()) {  // <== added to the native phantom_reference_queue_
      phantom_reference_queue_.AtomicEnqueueIfNotEnqueued(self, ref);
    } else {
      LOG(FATAL) << "Invalid reference type " << klass->PrettyClass() << " " << std::hex
                 << klass->GetAccessFlags();
    }
  }
}
What happens when PhantomReference is added to phantom_reference_queue_?
3.1.2 Copy stage
art/runtime/gc/collector/concurrent_copying.cc
void ConcurrentCopying::CopyingPhase() {
  ...
  ProcessReferences(self);
During the Copy phase of the GC, the Collector executes the ProcessReferences function.
art/runtime/gc/reference_processor.cc
void ReferenceProcessor::ProcessReferences(bool concurrent,
...
  // Clear all phantom references with white referents.
  phantom_reference_queue_.ClearWhiteReferences(&cleared_references_, collector);
The ProcessReferences function moves the references in phantom_reference_queue_ to cleared_references_. phantom_reference_queue_ contains only PhantomReference objects, while cleared_references_ also contains SoftReference and WeakReference objects.
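The two native steps described so far (parking a reference during marking, then clearing the references whose referents are still unmarked) can be modelled in a few lines of Java. This is a toy model under loud assumptions: ToyReferenceProcessor, ToyRef, and the `marked` set are invented for illustration, the method names merely echo the ART functions, and everything ART actually has to deal with (read barriers, concurrency, soft and finalizer reference policies) is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.EnumMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: names echo the ART functions, but none of this is ART code.
class ToyReferenceProcessor {
    enum Kind { SOFT, WEAK, PHANTOM }

    // Toy stand-in for a java.lang.ref.Reference object.
    static final class ToyRef {
        final Kind kind;
        Object referent;

        ToyRef(Kind kind, Object referent) {
            this.kind = kind;
            this.referent = referent;
        }
    }

    // Objects the toy "collector" has marked as live, compared by identity.
    private final Set<Object> marked = Collections.newSetFromMap(new IdentityHashMap<>());
    // One delayed-reference queue per reference kind.
    private final Map<Kind, Deque<ToyRef>> delayed = new EnumMap<>(Kind.class);
    // Shared list of cleared references, handed to the Java side after GC.
    final List<ToyRef> clearedReferences = new ArrayList<>();

    void mark(Object o) {
        marked.add(o);
    }

    // Mark phase: park a reference on its per-kind queue if its referent is not marked yet.
    void delayReferenceReferent(ToyRef ref) {
        if (ref.referent != null && !marked.contains(ref.referent)) {
            delayed.computeIfAbsent(ref.kind, k -> new ArrayDeque<>()).add(ref);
        }
    }

    // Later phase: references whose referent is still unmarked ("white") are cleared and
    // moved to the shared list; references whose referent got marked meanwhile are dropped.
    void clearWhiteReferences(Kind kind) {
        Deque<ToyRef> queue = delayed.getOrDefault(kind, new ArrayDeque<>());
        while (!queue.isEmpty()) {
            ToyRef ref = queue.poll();
            if (!marked.contains(ref.referent)) {
                ref.referent = null;
                clearedReferences.add(ref);
            }
        }
    }
}
```

The second check in clearWhiteReferences is there because a referent may be marked through another path after its reference was parked.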
3.1.3 Post-GC stage
After the GC has run, CollectClearedReferences is called to create a task that handles cleared_references_, and that task is then executed through Run.
art/runtime/gc/heap.cc
collector->Run(gc_cause, clear_soft_references || runtime->IsZygote());  // <== the GC is actually performed here
IncrementFreedEver();
RequestTrim(self);
// Collect cleared references.
SelfDeletingTask* clear = reference_processor_->CollectClearedReferences(self);  // <== creates a task that handles cleared_references_
// Grow the heap so that we know when to perform the next GC.
GrowForUtilization(collector, bytes_allocated_before_gc);
LogGC(gc_cause, collector);
FinishGC(self, gc_type);  // <== this round of GC ends here
// Actually enqueue all cleared references. Do this after the GC has officially finished since
// otherwise we can deadlock.
clear->Run(self);  // <== runs the task created above to handle cleared_references_
art/runtime/gc/reference_processor.cc
void Run(Thread* thread) override {
  ScopedObjectAccess soa(thread);
  jvalue args[1];
  args[0].l = cleared_references_;
  InvokeWithJValues(soa, nullptr, WellKnownClasses::java_lang_ref_ReferenceQueue_add, args);  // <== calls the Java method
  soa.Env()->DeleteGlobalRef(cleared_references_);
}
Run passes cleared_references_ as the argument and invokes the java.lang.ref.ReferenceQueue.add method. This takes us from the native world into the Java world.
libcore/ojluni/src/main/java/java/lang/ref/ReferenceQueue.java
static void add(Reference<?> list) {
    synchronized (ReferenceQueue.class) {
        if (unenqueued == null) {
            unenqueued = list;
        } else {
            // Find the last element in unenqueued.
            Reference<?> last = unenqueued;
            while (last.pendingNext != unenqueued) {
                last = last.pendingNext;
            }
            // Add our list to the end. Update the pendingNext to point back to enqueued.
            last.pendingNext = list;
            last = list;
            while (last.pendingNext != list) {
                last = last.pendingNext;
            }
            last.pendingNext = unenqueued;
        }
        ReferenceQueue.class.notifyAll();  // <== once all elements of cleared_references_ have been added to Java's global ReferenceQueue, notifyAll wakes up the ReferenceQueueDaemon thread
    }
}
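The splice performed by ReferenceQueue.add works on circular singly linked lists, which is easy to misread. The sketch below repeats the same pointer manipulation on a toy Node type so that it can be run in isolation; PendingListDemo, Node, cycleOf, splice, and names are all hypothetical helpers, and no locking is modelled.

```java
import java.util.ArrayList;
import java.util.List;

class PendingListDemo {
    // Toy stand-in for Reference: only the pendingNext link matters here.
    static final class Node {
        final String name;
        Node pendingNext;

        Node(String name) {
            this.name = name;
        }
    }

    // Builds a circular list from at least one name and returns its first node.
    static Node cycleOf(String... names) {
        Node first = null;
        Node prev = null;
        for (String n : names) {
            Node node = new Node(n);
            if (first == null) {
                first = node;
            } else {
                prev.pendingNext = node;
            }
            prev = node;
        }
        prev.pendingNext = first; // close the cycle
        return first;
    }

    // Same splice as in ReferenceQueue.add: append cycle `list` to cycle `unenqueued`.
    static Node splice(Node unenqueued, Node list) {
        if (unenqueued == null) {
            return list;
        }
        // Find the last element of the existing cycle.
        Node last = unenqueued;
        while (last.pendingNext != unenqueued) {
            last = last.pendingNext;
        }
        // Hook the new cycle in, then find its last element and close the loop.
        last.pendingNext = list;
        last = list;
        while (last.pendingNext != list) {
            last = last.pendingNext;
        }
        last.pendingNext = unenqueued;
        return unenqueued;
    }

    // Walks exactly one turn of the cycle and collects the node names.
    static List<String> names(Node start) {
        List<String> out = new ArrayList<>();
        Node n = start;
        do {
            out.add(n.name);
            n = n.pendingNext;
        } while (n != start);
        return out;
    }

    public static void main(String[] args) {
        Node merged = splice(cycleOf("a", "b"), cycleOf("c", "d", "e"));
        System.out.println(names(merged)); // prints [a, b, c, d, e]
    }
}
```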
3.2 The transfer station: the ReferenceQueueDaemon thread
The ReferenceQueueDaemon thread stays blocked while there is no work for it.
libcore/libart/src/main/java/java/lang/Daemons.java
@Override public void runInternal() {
    while (isRunning()) {
        Reference<?> list;
        try {
            synchronized (ReferenceQueue.class) {
                while (ReferenceQueue.unenqueued == null) {
                    ReferenceQueue.class.wait();  // <== the thread blocks here by calling wait
                }
                list = ReferenceQueue.unenqueued;
                ReferenceQueue.unenqueued = null;
            }
        } catch (InterruptedException e) {
            continue;
        } catch (OutOfMemoryError e) {
            continue;
        }
        ReferenceQueue.enqueuePending(list);
    }
}
When new work arrives, the ReferenceQueueDaemon thread wakes up from ReferenceQueue.class.wait(). Cleaner objects and other PhantomReference objects in the global ReferenceQueue are handled differently, as described separately below.
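The handoff between the GC side (which calls notifyAll) and the daemon (which waits) is an ordinary monitor-based producer and consumer. Below is a minimal sketch of that pattern alone, with a String standing in for the pending reference list; HandoffDemo, publish, take, and roundTrip are hypothetical names. Unlike ReferenceQueue.add, publish here simply overwrites the pending value instead of splicing lists.

```java
import java.util.concurrent.atomic.AtomicReference;

class HandoffDemo {
    private static final Object LOCK = new Object();
    private static String pending; // guarded by LOCK; null means "no work"

    // Producer side (the GC side in the article): publish work and wake the waiter.
    static void publish(String work) {
        synchronized (LOCK) {
            pending = work;
            LOCK.notifyAll();
        }
    }

    // Consumer side (the daemon): block until work is available, then take it.
    static String take() throws InterruptedException {
        synchronized (LOCK) {
            while (pending == null) { // the loop also guards against spurious wakeups
                LOCK.wait();
            }
            String work = pending;
            pending = null;
            return work;
        }
    }

    // Starts a toy daemon, publishes one piece of work, and returns what the daemon saw.
    static String roundTrip(String work) {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread daemon = new Thread(() -> {
            try {
                seen.set(take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "toy-queue-daemon");
        daemon.setDaemon(true);
        daemon.start();
        publish(work);
        try {
            daemon.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("daemon received: " + roundTrip("cleared references"));
    }
}
```

Because the consumer re-checks the condition under the lock, it does not matter whether publish runs before or after the daemon starts waiting.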
3.2.1 How to deal with Cleaner objects
The global ReferenceQueue distributes internal elements by calling enqueuePending. Each Reference object is constructed with a ReferenceQueue as a parameter, which is the queue in which the Reference object ends up after distribution.
libcore/ojluni/src/main/java/java/lang/ref/ReferenceQueue.java
public static void enqueuePending(Reference<?> list) {
    Reference<?> start = list;
    do {
        ReferenceQueue queue = list.queue;  // <== the ReferenceQueue object passed in when each Reference object was constructed
        if (queue == null) {
            Reference<?> next = list.pendingNext;

            // Make pendingNext a self-loop to preserve the invariant that
            // once enqueued, pendingNext is non-null -- without leaking
            // the object pendingNext was previously pointing to.
            list.pendingNext = list;
            list = next;
        } else {
            // To improve performance, we try to avoid repeated
            // synchronization on the same queue by batching enqueue of
            // consecutive references in the list that have the same
            // queue.
            synchronized (queue.lock) {
                do {
                    Reference<?> next = list.pendingNext;

                    // Make pendingNext a self-loop to preserve the
                    // invariant that once enqueued, pendingNext is
                    // non-null -- without leaking the object pendingNext
                    // was previously pointing to.
                    list.pendingNext = list;
                    queue.enqueueLocked(list);  // <== removes the Reference object from the global ReferenceQueue and adds it to the ReferenceQueue it belongs to
                    list = next;
                } while (list != start && list.queue == queue);
                queue.lock.notifyAll();
            }
        }
    } while (list != start);
}
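The batching that enqueuePending performs (one lock acquisition per run of consecutive references sharing a queue) can be made visible with a counter. The sketch below copies the loop structure onto toy types; BatchedEnqueueDemo, ToyQueue, ToyRef, cycleOf, and lockAcquisitions are hypothetical names, and the Cleaner special case is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

class BatchedEnqueueDemo {
    static final class ToyQueue {
        final Object lock = new Object();
        final List<String> items = new ArrayList<>();
        int lockAcquisitions; // how many times the distributor took this queue's lock
    }

    static final class ToyRef {
        final String name;
        final ToyQueue queue; // may be null, like a Reference created without a queue
        ToyRef pendingNext;

        ToyRef(String name, ToyQueue queue) {
            this.name = name;
            this.queue = queue;
        }
    }

    // Links the given references into a circular pending list and returns its head.
    static ToyRef cycleOf(ToyRef... refs) {
        for (int i = 0; i < refs.length; i++) {
            refs[i].pendingNext = refs[(i + 1) % refs.length];
        }
        return refs[0];
    }

    // Distributes a circular pending list, taking each queue's lock once per run of
    // consecutive references that share that queue.
    static void enqueuePending(ToyRef list) {
        ToyRef start = list;
        do {
            ToyQueue queue = list.queue;
            if (queue == null) {
                ToyRef next = list.pendingNext;
                list.pendingNext = list; // self-loop marks the reference as processed
                list = next;
            } else {
                synchronized (queue.lock) {
                    queue.lockAcquisitions++;
                    do {
                        ToyRef next = list.pendingNext;
                        list.pendingNext = list;
                        queue.items.add(list.name);
                        list = next;
                    } while (list != start && list.queue == queue);
                    queue.lock.notifyAll();
                }
            }
        } while (list != start);
    }

    public static void main(String[] args) {
        ToyQueue a = new ToyQueue();
        ToyQueue b = new ToyQueue();
        enqueuePending(cycleOf(
                new ToyRef("a1", a), new ToyRef("a2", a), new ToyRef("b1", b),
                new ToyRef("a3", a), new ToyRef("n1", null)));
        System.out.println("A=" + a.items + " locks=" + a.lockAcquisitions);
        System.out.println("B=" + b.items + " locks=" + b.lockAcquisitions);
    }
}
```

With the list a1, a2, b1, a3, n1, queue A is locked twice (once for a1 and a2 together, once for a3) rather than three times.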
As for the Cleaner object, it is not actually added to the ReferenceQueue passed in when it is constructed, but is handled directly in enqueueLocked.
libcore/ojluni/src/main/java/java/lang/ref/ReferenceQueue.java
66   private boolean enqueueLocked(Reference<? extends T> r) {
67       // Verify the reference has not already been enqueued.
68       if (r.queueNext != null) {
69           return false;
70       }
71
72       if (r instanceof Cleaner) {
73           // If this reference is a Cleaner, then simply invoke the clean method instead
74           // of enqueueing it in the queue. Cleaners are associated with dummy queues that
75           // are never polled and objects are never enqueued on them.
76           Cleaner cl = (sun.misc.Cleaner) r;
77           cl.clean();  // <== native resources are released by calling cl.clean()
78
79           // Update queueNext to indicate that the reference has been
80           // enqueued, but is now removed from the queue.
81           r.queueNext = sQueueNextUnenqueued;
82           return true;
83       }
84
85       if (tail == null) {
86           head = r;
87       } else {
88           tail.queueNext = r;
89       }
90       tail = r;
91       tail.queueNext = r;
92       return true;
93   }
3.2.2 How to Handle Other PhantomReference Objects
As lines 85 to 92 of the code above show, other PhantomReference objects are eventually added to their corresponding ReferenceQueue, forming a linked list. Once they have been added, the corresponding processing thread is woken up by calling queue.lock.notifyAll.
libcore/ojluni/src/main/java/java/lang/ref/ReferenceQueue.java
public static void enqueuePending(Reference<?> list) {
    ...
            synchronized (queue.lock) {
                do {
                    ...
                    queue.enqueueLocked(list);  // <== removes the Reference object from the global ReferenceQueue and adds it to the ReferenceQueue it belongs to
                    list = next;
                } while (list != start && list.queue == queue);
                queue.lock.notifyAll();
            }
    ...
}
[Comparison of Cleaner and other PhantomReference objects]
| Behavior | Cleaner | Other PhantomReference |
|---|---|---|
| Added to the ReferenceQueue passed in at construction time | ❌ | ✔️ |
| Final processing happens on the ReferenceQueueDaemon thread | ✔️ | ❌ |
| Final processing happens on a custom thread | ❌ | ✔️ |
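The contrast in the table comes down to a single branch. The following toy sketch isolates that branch: a cleaner-like reference has its thunk run inline by whoever dispatches it, while any other reference is only appended to a queue for some other thread to pick up. DispatchDemo, ToyRef, ToyCleaner, and userQueue are hypothetical names and do not correspond to real libcore types.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class DispatchDemo {
    // Toy stand-in for an ordinary reference that is handed to a user-visible queue.
    static class ToyRef {
        final String name;

        ToyRef(String name) {
            this.name = name;
        }
    }

    // Toy stand-in for a Cleaner: the reference carries its own cleanup code.
    static final class ToyCleaner extends ToyRef {
        final Runnable thunk;

        ToyCleaner(String name, Runnable thunk) {
            super(name);
            this.thunk = thunk;
        }
    }

    // Queue that a user thread would poll; cleaners never end up here.
    final Deque<ToyRef> userQueue = new ArrayDeque<>();

    // Mirrors the branch in enqueueLocked: cleaners run inline, everything else is queued.
    void enqueueLocked(ToyRef r) {
        if (r instanceof ToyCleaner) {
            ((ToyCleaner) r).thunk.run();
        } else {
            userQueue.add(r);
        }
    }

    public static void main(String[] args) {
        DispatchDemo demo = new DispatchDemo();
        List<String> log = new ArrayList<>();
        demo.enqueueLocked(new ToyCleaner("cleaner",
                () -> log.add("cleaned on " + Thread.currentThread().getName())));
        demo.enqueueLocked(new ToyRef("phantom"));
        System.out.println(log + ", queued=" + demo.userQueue.size());
    }
}
```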
4. A real-world case
Internally, NativeAllocationRegistry uses Cleaner to reclaim native resources. It passes two arguments to Cleaner.create: the Java object to track, and a CleanerThunk that specifies how to reclaim the resource.
libcore/luni/src/main/java/libcore/util/NativeAllocationRegistry.java
try {
    thunk = new CleanerThunk();
    Cleaner cleaner = Cleaner.create(referent, thunk);
    ...
thunk.setNativePtr(nativePtr);
Cleaner inherits from PhantomReference. As line 115 shows, the ReferenceQueue it finally passes to the superclass constructor is dummyQueue. As the name suggests, dummyQueue is only a placeholder and plays no real role, which matches the analysis in 3.2.1 above.
libcore/ojluni/src/main/java/sun/misc/Cleaner.java
114   private Cleaner(Object referent, Runnable thunk) {
115       super(referent, dummyQueue);  // <== PhantomReference's constructor requires a ReferenceQueue argument
116       this.thunk = thunk;
117   }
The nativePtr field inside CleanerThunk records the pointer to the native object, and freeFunction is an instance field of the outer class NativeAllocationRegistry that records the function pointer of the native-layer release function. With these two pointers, the native resource can be reclaimed.
libcore/luni/src/main/java/libcore/util/NativeAllocationRegistry.java
private class CleanerThunk implements Runnable {
    private long nativePtr;

    public CleanerThunk() {
        this.nativePtr = 0;
    }

    public void run() {
        if (nativePtr != 0) {
            applyFreeFunction(freeFunction, nativePtr);  // <== applyFreeFunction eventually calls freeFunction, passing nativePtr as its argument
            registerNativeFree(size);
        }
    }

    public void setNativePtr(long nativePtr) {
        this.nativePtr = nativePtr;  // <== nativePtr is the pointer to the native object
    }
}
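The registry pattern shown above (a shared free function, a per-allocation thunk that is armed only after registration, and size bookkeeping) can be approximated on a desktop JVM. This is a hedged sketch, not the Android class: ToyNativeRegistry, Thunk, and registeredBytes are hypothetical names, java.lang.ref.Cleaner (JDK 9+) replaces sun.misc.Cleaner, a LongConsumer replaces the native function pointer, and a plain counter replaces the reporting of native sizes to the runtime.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

// Hypothetical, simplified model of the registry pattern; not the Android class.
final class ToyNativeRegistry {
    private static final Cleaner CLEANER = Cleaner.create();

    private final LongConsumer freeFunction; // stands in for the native free function pointer
    private final long size;                 // estimated native bytes per allocation
    final AtomicLong registeredBytes = new AtomicLong();

    ToyNativeRegistry(LongConsumer freeFunction, long size) {
        this.freeFunction = freeFunction;
        this.size = size;
    }

    // Inner (non-static) class on purpose: like CleanerThunk, it reads freeFunction and
    // size from the enclosing registry. It must never reference the tracked Java object.
    private final class Thunk implements Runnable {
        private volatile long nativePtr; // 0 means "not armed yet"

        @Override
        public void run() {
            if (nativePtr != 0) {
                freeFunction.accept(nativePtr);
                registeredBytes.addAndGet(-size);
            }
        }
    }

    // Ties the lifetime of nativePtr to referent and returns a handle for early release.
    Runnable registerNativeAllocation(Object referent, long nativePtr) {
        Thunk thunk = new Thunk();
        Cleaner.Cleanable cleanable = CLEANER.register(referent, thunk);
        registeredBytes.addAndGet(size);
        // Arm the thunk only after registration has succeeded.
        thunk.nativePtr = nativePtr;
        return cleanable::clean;
    }
}

class ToyNativeRegistryDemo {
    public static void main(String[] args) {
        ToyNativeRegistry registry = new ToyNativeRegistry(
                ptr -> System.out.println("free(0x" + Long.toHexString(ptr) + ")"), 64);
        Object owner = new Object();
        Runnable release = registry.registerNativeAllocation(owner, 0xCAFEL);
        release.run(); // explicit early release; otherwise this would run after GC
        System.out.println("registered bytes: " + registry.registeredBytes.get());
    }
}
```

One registry instance can serve every object of a given native type, since only the pointer differs between allocations.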
When the ReferenceQueueDaemon thread processes a Cleaner object, its clean method is called. As you can see, thunk.run is called on line 143, which finally enters the resource release function in the native world.
libcore/ojluni/src/main/java/sun/misc/Cleaner.java
139   public void clean() {
140       if (!remove(this))
141           return;
142       try {
143           thunk.run();  // <== internally calls the resource release function
144       } catch (final Throwable x) {
145           AccessController.doPrivileged(new PrivilegedAction<Void>() {
146                   public Void run() {
147                       if (System.err != null)
148                           new Error("Cleaner terminated abnormally", x)
149                               .printStackTrace();
150                       System.exit(1);
151                       return null;
152                   }});
153       }
154   }