On the JVM, JOL (Java Object Layout) can be used to inspect the memory layout of an object. ART, however, is not an implementation of the JVM, so JOL does not work on ART.

This article discusses the memory layout of objects on ART, building on the article "Android Object Memory Footprint". I will then use what is covered here to create a tool that shows the memory layout of objects on the ART platform.

Overview

Memory layout refers to how the internal attributes of an object, that is, its fields, are arranged in memory. In the current ART code (Android11-Release), an object in memory is divided into two areas: the object header and the instance data. The instance data area may contain padding or gaps (the reason for using both terms is explained below).

Next, this article describes how the object header and the instance data are arranged, and shows how to obtain an object's layout information at the application layer. Before we get started, here are some basic concepts:

  • Field offset: each field has its own starting position within the object. A field's value can be read directly from the bytes that start at its offset and span the size of its type.
  • Padding or gap: neither is always present. A gap may appear when ART lays out fields, while padding may appear when ART allocates space (depending on the allocator or the object type).
  • Instance fields: non-static fields, that is, the fields owned by an instance.

In addition, this article only discusses ART's behavior under the default configuration. Readers interested in how other configurations differ can read the source code themselves.

Layout

Object Header

The object header is the space occupied by the fields declared in Object itself:

// art/runtime/mirror/object.h

// C++ mirror of java.lang.Object
class MANAGED LOCKABLE Object {
 private:
  // The Class representing the type of the object.
  HeapReference<Class> klass_;
  // Monitor and hash code information.
  uint32_t monitor_;
};

The object header in ART contains much the same data as in HotSpot: klass_ is the type pointer to the object's Class, and monitor_ holds the LockWord, which is similar to HotSpot's mark word.

HeapReference in the code above is effectively a uint32_t, so each of the two header fields occupies 4 bytes and the object header size is 8 bytes.

In instance data, the offset of the first field is the size of the object header.

Instance Data

The instance data consists of the instance fields. Following the naming in the ART code, the alignment fill produced when fields are linked during the class linking stage is called a gap. To keep field alignment and object alignment apart, this article uses gap for the fill produced by field alignment and padding for the fill produced when object space is allocated.

Padding

Under the default allocator, alignment padding (to 8 bytes) occurs only when space for a String is allocated. In a non-default configuration, if the allocator is kAllocatorTypeBumpPointer, kAllocatorTypeTLAB, kAllocatorTypeRegion or kAllocatorTypeRegionTLAB, the allocated object space is aligned to 8 bytes. Since this article only discusses the default configuration, we won't go into depth on padding.

Gap

ART first sorts the fields in the order reference -> long -> double -> int -> float -> char -> short -> boolean -> byte. Each field is then placed in turn: ART checks whether the current offset is a multiple of the size of the field's type and, if it is not, aligns the offset up to the next multiple, which leaves a gap. To avoid wasting too much space, a later field whose size is less than or equal to an existing gap is placed into that gap.

Gaps can be a bit abstract, so here are two examples.

Unused gap

The first case is one where a gap is produced but never used:

class Foo {
    private val obj: Any? = null
    private val aLong: Long = 3L
}

Foo ends up with a 4-byte gap after being processed according to the rules above:

  1. Sort the fields. The result is obj -> aLong
  2. Use the size of the object header, 8 by default, as the starting offset
  3. Take obj first. It is a reference type, which occupies 4 bytes (explained later); 8 is a multiple of 4, so no gap is required
  4. Place obj at offset 8
  5. Update the current offset: offset += 4, giving 12
  6. Take aLong next. The current offset is 12, which is not a multiple of 8, so a 4-byte gap is added
  7. Update the offset to account for the gap. The current offset is now 16
  8. Place aLong at offset 16

The final layout of Foo looks like this:

com.chaos.aol.sample.Foo object internals:
 OFFSET  SIZE               TYPE DESCRIPTION              VALUE
      0     4                    (object header)          13 07 cc c8 (00010011 00000111 11001100 11001000) (319278280)
      4     4                    (object header)          00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4   java.lang.Object Foo.obj                  null
     12     4                    (alignment/padding gap)
     16     8               long Foo.aLong                3
Instance size: 24 bytes (JVMTI: 24 bytes)
Space losses: 4 bytes internal + 0 bytes external = 4 bytes total

Using the gap

The second case is one where a gap is produced and then used:

class Bar {
    private val obj: Any? = null
    private val aInt: Int = 1
    private val aLong: Long = 3L
}

Bar ends up with no gap after being processed according to the rules above:

  1. Sort the fields. The result is obj -> aLong -> aInt
  2. (Same process as for Foo, omitted here)
  3. An int occupies 4 bytes, which equals the 4-byte gap produced when aLong was aligned, so aInt meets the condition for filling the gap and is placed at offset 12

The final layout of Bar is as follows:

com.chaos.aol.sample.Bar object internals:
 OFFSET  SIZE               TYPE DESCRIPTION              VALUE
      0     4                    (object header)          13 32 e9 88 (00010011 00110010 11101001 10001000) (322103688)
      4     4                    (object header)          00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4   java.lang.Object Bar.obj                  null
     12     4                int Bar.aInt                 1
     16     8               long Bar.aLong                3
Instance size: 24 bytes (JVMTI: 24 bytes)
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total

Compared with Foo, the range [12, 16) is now occupied by aInt.

Note: if a field is smaller than the gap, the gap shrinks and the remainder waits for the next field. For example, if a class has four fields of types Object, long, char and boolean, the gap shrinks to 2 bytes after the char is placed and to 1 byte after the boolean is placed.
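The rules and examples in this section can be modeled with a short Kotlin sketch. It is a simplified illustration of the behavior described in this article, not ART's actual field-linking code. The names FieldSpec, PlacedField and layoutFields are hypothetical, "ref" stands for any reference type, and the sketch ignores superclass fields and assumes the header size is a multiple of 4.

```kotlin
// Simplified model of the field layout rules described in this article.
// Illustrative only: this is NOT ART's real implementation.

data class FieldSpec(val name: String, val type: String)
data class PlacedField(val name: String, val offset: Int, val size: Int)

// Sort order and sizes as stated in the article; "ref" is any reference type.
private val typeOrder =
    listOf("ref", "long", "double", "int", "float", "char", "short", "boolean", "byte")
private val typeSize = mapOf(
    "ref" to 4, "long" to 8, "double" to 8, "int" to 4, "float" to 4,
    "char" to 2, "short" to 2, "boolean" to 1, "byte" to 1
)

fun layoutFields(fields: List<FieldSpec>, headerSize: Int = 8): List<PlacedField> {
    require(headerSize % 4 == 0) { "this sketch assumes a 4-byte aligned header" }
    val gaps = mutableListOf<IntRange>()      // unused byte ranges left by alignment
    val placed = mutableListOf<PlacedField>()
    var offset = headerSize
    for (field in fields.sortedBy { typeOrder.indexOf(it.type) }) {
        val size = typeSize.getValue(field.type)
        // Prefer an existing gap that is large enough for this field.
        val gapIndex = gaps.indexOfFirst { it.last - it.first + 1 >= size }
        if (gapIndex >= 0) {
            val gap = gaps.removeAt(gapIndex)
            placed.add(PlacedField(field.name, gap.first, size))
            // Keep whatever is left of the gap for later, smaller fields.
            if (gap.first + size <= gap.last) gaps.add((gap.first + size)..gap.last)
        } else {
            // Align the running offset up to a multiple of the field size.
            val aligned = (offset + size - 1) / size * size
            if (aligned > offset) gaps.add(offset until aligned)
            placed.add(PlacedField(field.name, aligned, size))
            offset = aligned + size
        }
    }
    return placed.sortedBy { it.offset }
}
```

Feeding it the fields of Bar (a reference, an int and a long) places obj at offset 8, aInt at 12 and aLong at 16, matching the layout shown above. For Foo, the range [12, 16) stays unused.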

Getting layout information

As shown above, to describe the overall layout we need to know the size of the object header, the offset of each field, the size each field occupies, and the size of any gap. In addition, the total size of the object must be compared with the end offset of the last field to see whether there is padding.

To obtain the layout information, we need to calculate the following information:

  1. Object header size
  2. The offset of each field
  3. The size in bytes of each field's type
  4. The gap generated when the field is aligned
  5. Total object size

Object header size

Because the native ::art::mirror::Object corresponds to java.lang.Object, we can use reflection to read the objectSize field of java.lang.Object.class.

Field Offset

Reproducing the sorting and alignment of fields ourselves would be quite troublesome. Moreover, even though the fields of the native layer have counterparts in the Java layer, some of them may not be reachable through reflection, so their offsets cannot be calculated by hand. Unsafe's objectFieldOffset method cannot enumerate every field that an object actually has, but for a given Field it returns the exact offset.

Field size

For this part we can look directly at the ART source code:

// art/runtime/art_field-inl.h

inline size_t ArtField::FieldSize() REQUIRES_SHARED(Locks::mutator_lock_) {
  return Primitive::ComponentSize(GetTypeAsPrimitiveType());
}

// art/libdexfile/dex/primitive.h

static constexpr size_t kObjectReferenceSize = 4;

class Primitive {
  static constexpr size_t ComponentSize(Type type) {
    switch (type) {
      case kPrimVoid:    return 0;
      case kPrimBoolean:
      case kPrimByte:    return 1;
      case kPrimChar:
      case kPrimShort:   return 2;
      case kPrimInt:
      case kPrimFloat:   return 4;
      case kPrimLong:
      case kPrimDouble:  return 8;
      case kPrimNot:     return kObjectReferenceSize;
    }
    LOG(FATAL) << "Invalid type " << static_cast<int>(type);
    UNREACHABLE();
  }
};

From the code above, the size of each type can be summarized in the following table:

Type              Size (bytes)
void              0
boolean           1
byte              1
char              2
short             2
int               4
float             4
long              8
double            8
object_reference  4
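At the application layer, the same mapping can be expressed as a small Kotlin helper. This is a sketch that mirrors the table above. The function name fieldSize is hypothetical, and the 4-byte reference size is taken from kObjectReferenceSize in the ART code quoted above.

```kotlin
/**
 * Size in bytes that a field of the given type occupies inside an object,
 * following the table above. Any non-primitive type is a 4-byte reference.
 */
fun fieldSize(type: Class<*>): Int =
    if (!type.isPrimitive) {
        4 // kObjectReferenceSize: every object reference, including boxed types
    } else {
        when (type.name) {
            "void" -> 0
            "boolean", "byte" -> 1
            "char", "short" -> 2
            "int", "float" -> 4
            "long", "double" -> 8
            else -> error("Unexpected primitive type: ${type.name}")
        }
    }
```

The helper can be called with the type of a reflected field, for example fieldSize(field.type) for a java.lang.reflect.Field.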

The Gap

Since there is a clear order in the layout of fields, we can calculate the gap size between two fields by the following formula:

// Gap between two adjacent fields
gap = second.offset - first.offset - first.size

The gap or padding after the last field must also be considered. Substituting the total size of the object for second.offset gives the size of the gap or padding at the end of the object.
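As a sketch of how this calculation might look in Kotlin, the function below walks the fields in offset order and reports every unused range, including the one after the last field. FieldSlot and findUnusedRanges are hypothetical names, and the offsets and sizes are assumed to have been collected beforehand as described in the previous sections.

```kotlin
data class FieldSlot(val name: String, val offset: Int, val size: Int)

/**
 * Returns (startOffset, length) for every unused byte range between the end of
 * the object header and the end of the object.
 */
fun findUnusedRanges(
    fields: List<FieldSlot>,
    headerSize: Int,
    objectSize: Int
): List<Pair<Int, Int>> {
    val unused = mutableListOf<Pair<Int, Int>>()
    var cursor = headerSize // end offset of the previous field (or of the header)
    for (field in fields.sortedBy { it.offset }) {
        // gap = second.offset - first.offset - first.size
        if (field.offset > cursor) unused.add(cursor to (field.offset - cursor))
        cursor = field.offset + field.size
    }
    // Treat the total object size as the "offset" after the last field.
    if (objectSize > cursor) unused.add(cursor to (objectSize - cursor))
    return unused
}
```

With the Foo layout shown earlier (obj at 8, aLong at 16, header size 8, object size 24), this reports a single 4-byte range starting at offset 12.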

Total object size

According to the article “Android Object Memory Footprint”, we need to calculate in four cases:

  1. Array object
  2. String object
  3. Class object
  4. Ordinary objects

Array object

We need the size of the array header and the size of the data; together they make up the total size of the array object.

Two Unsafe methods are needed here:

  • int arrayBaseOffset(Class): returns the offset at which the array's data begins
  • int arrayIndexScale(Class): returns the size occupied by a single element of the array

The array header size equals the starting offset of the data, and the data size is the product of the size of a single element and the length of the array. The size of an array object can therefore be calculated with the following formula:

size = arrayBaseOffset(clazz) + arrayIndexScale(clazz) * length
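A minimal Kotlin sketch of this formula follows. The function name arrayObjectSize is hypothetical. On a device, the first two arguments would come from Unsafe's arrayBaseOffset and arrayIndexScale; here they are plain parameters so that the arithmetic stands on its own.

```kotlin
/**
 * Total size of an array object: the header (equal to the data start offset)
 * plus the element data.
 */
fun arrayObjectSize(baseOffset: Int, indexScale: Int, length: Int): Int {
    require(length >= 0) { "length must be non-negative" }
    return baseOffset + indexScale * length
}
```

For instance, assuming purely for illustration a base offset of 12 and an element size of 4, an array of length 3 occupies 12 + 4 * 3 = 24 bytes.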

String object

Follow the formula from "Android Object Memory Footprint":

size = RoundUp(16 + (IsCompressed() ? 1 : 2) * length,  8);

IsCompressed() is true if all characters in the String are ASCII and false otherwise. RoundUp(x, 8) aligns x up to a multiple of 8 bytes.
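The formula can be worked through with a small Kotlin sketch. The names roundUp and stringObjectSize are hypothetical, and whether a string is compressed is passed in as a parameter instead of being detected.

```kotlin
/** Rounds x up to the next multiple of n; n must be a power of two. */
fun roundUp(x: Int, n: Int): Int = (x + n - 1) and (n - 1).inv()

/**
 * Size of a String object per the formula above: a 16-byte fixed part plus
 * 1 byte per character when compressed, or 2 bytes otherwise, aligned to 8.
 */
fun stringObjectSize(length: Int, compressed: Boolean): Int {
    val bytesPerChar = if (compressed) 1 else 2
    return roundUp(16 + bytesPerChar * length, 8)
}
```

A compressed string of length 3 needs 16 + 3 = 19 bytes, which rounds up to 24. An uncompressed string of length 5 needs 16 + 10 = 26 bytes, which rounds up to 32. The difference between the raw and rounded values is the padding discussed earlier.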

Class object

The size of a Class object is stored in its classSize field.

Unsafe has a getInt(Object obj, long offset) method, so if we can work out the offset of classSize, we can read its value with getInt.

To do this, let's look at the definition of Class in ART:

// art/runtime/mirror/class.h

// C++ mirror of java.lang.Class
class MANAGED Class final : public Object {
 private:
  HeapReference<ClassLoader> class_loader_;
  HeapReference<Class> component_type_;
  HeapReference<DexCache> dex_cache_;
  HeapReference<ClassExt> ext_data_;
  HeapReference<IfTable> iftable_;
  HeapReference<String> name_;
  HeapReference<Class> super_class_;
  HeapReference<PointerArray> vtable_;
  uint64_t ifields_;
  uint64_t methods_;
  uint64_t sfields_;
  uint32_t access_flags_;
  uint32_t class_flags_;
  uint32_t class_size_;
  pid_t clinit_thread_id_;// It is int32_t
  int32_t dex_class_def_idx_;
  int32_t dex_type_idx_;
  uint32_t num_reference_instance_fields_;
  uint32_t num_reference_static_fields_;
  uint32_t object_size_;
  uint32_t object_size_alloc_fast_path_;
  uint32_t primitive_type_;
  uint32_t reference_instance_offsets_;
  uint32_t status_;
  uint16_t copied_methods_offset_;
  uint16_t virtual_methods_offset_;
};

Note that class_size_ is not the first uint32_t field. This means class_size_ can never be placed into a gap generated earlier, even if the fields in front of it required alignment.

Suppose the HeapReference section required alignment: the resulting gap G1 must be smaller than 4 bytes. Suppose the uint64_t section also required alignment: the resulting gap G2 must be smaller than 8 bytes. A uint32_t occupies 4 bytes, so only the first uint32_t (that is, access_flags_) could be placed into G2, and the subsequent uint32_t fields are laid out in the order in which they are defined. So class_size_ can never be placed into G1 or G2.

If we obtain the offset of class_flags_ or clinit_thread_id_, we can calculate the offset of class_size_. In the Java layer, clinit_thread_id_ corresponds to clinitThreadId, whose offset can be obtained as follows:

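// `Unsafe` is assumed here to be a helper object that wraps a sun.misc.Unsafe instance.
// Look up the Java-level counterpart of clinit_thread_id_.
val clinitThreadIdField =
        Class::class.java.getDeclaredField("clinitThreadId").apply { isAccessible = true }
// class_size_ sits immediately before clinit_thread_id_, so step back by its 4 bytes.
val classSizeFieldOffset = Unsafe.objectFieldOffset(clinitThreadIdField) - 4 /* uint32_t */

fun sizeOfClassObject(clazz: Class<*>): Int = Unsafe.getInt(clazz, classSizeFieldOffset)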

Ordinary objects

The objectSize field of the object's Class holds the size of the object, and it can be read directly through reflection.

Conclusion

In conclusion, the object layout on the ART platform is divided into two areas: the object header and the instance data, and the instance data area may contain gaps. In addition, for Strings or for objects allocated by certain allocators, there may be a padding area at the end of the object.

To obtain the memory layout of an object, the key data are the size of the object header, the offset of each field, the size of each field, and the total size of the object. The API eventually calls Object::SizeOf, which returns the exact size of each kind of object.

In the end, knowing the memory layout of an object doesn't bring much practical benefit beyond showing that the actual size of an object is not necessarily equal to the sum of the fields in its definition.

I studied this topic mainly because I wanted to build a tool for the ART platform similar to JOL. The first version has been completed and is named AOL (Android Object Layout). The memory layout information for Foo and Bar in the sample code above was obtained directly from AOL.

References

  • Java primitive array layout in memory
  • Analyze the memory layout of the object in depth
  • How much memory a Java object takes up
  • JOL