Preface

Memory alignment is a term we hear all the time, but what it actually means often stays a mystery. Today we explore memory alignment and clear that mystery up. As the saying goes, practice is the sole criterion for testing truth, so we start by printing three values: the memory size of the object's type, the actual memory size of the object, and the memory size allocated by the system. The code and its output are as follows:

#import <objc/runtime.h>   // class_getInstanceSize
#import <malloc/malloc.h>  // malloc_size

LWPerson * person = [LWPerson alloc];
person.name = @"person";
person.age = 18;
LWPerson * newPerson;   // declared only, never allocated
NSLog(@" memory sizeof object type --%lu",sizeof(person));
NSLog(@" actual memory size of object --%lu",class_getInstanceSize([person class]));
NSLog(@" system allocated memory size --%lu",malloc_size((__bridge const void *)(person)));
NSLog(@"==================");
NSLog(@" memory sizeof object type --%lu",sizeof(newPerson));
NSLog(@" actual memory size of object --%lu",class_getInstanceSize([newPerson class]));
NSLog(@" system allocated memory size --%lu",malloc_size((__bridge const void *)(newPerson)));
2021-06-08 11:16:28.097465+0800 alignStyle[73542:9731629]  memory sizeof object type --8
2021-06-08 11:16:28.097520+0800 alignStyle[73542:9731629]  actual memory size of object --24
2021-06-08 11:16:28.097562+0800 alignStyle[73542:9731629]  system allocated memory size --32
2021-06-08 11:16:28.097583+0800 alignStyle[73542:9731629] ==================
2021-06-08 11:16:28.097607+0800 alignStyle[73542:9731629]  memory sizeof object type --8
2021-06-08 11:16:28.097629+0800 alignStyle[73542:9731629]  actual memory size of object --0
2021-06-08 11:16:28.097649+0800 alignStyle[73542:9731629]  system allocated memory size --0

Results analysis:

  • sizeof: the memory size of the object's type. sizeof is an operator, not a function. It computes the size of the data type passed in, which is determined at compile time. So sizeof(person) and sizeof(newPerson) are both 8 bytes, because both are pointers to a structure.
  • class_getInstanceSize: the actual memory size of the object, which is determined by the size of the class's member variables. Strictly speaking it is not the raw size of the members, because the result is 8-byte aligned. The core algorithm is #define WORD_MASK 7UL together with (x + WORD_MASK) & ~WORD_MASK. So the memory size of person is 24 rather than 20. newPerson is only declared and no memory is allocated for it, so it is nil, [newPerson class] is nil as well, and the size is 0.
  • malloc_size: the memory size allocated by the system, which is 16-byte aligned. That is, the allocated size is a multiple of 16, and the system automatically pads whatever is short of a multiple of 16. Note that the system's 16-byte alignment is applied on top of the actual memory size (the size after 8-byte alignment), which is why 24 becomes 32.

Questions: what do class_getInstanceSize and malloc_size do under the hood? How do we know that class_getInstanceSize is 8-byte aligned and that malloc_size is 16-byte aligned? The exploration of both functions is placed at the end of the article.

Memory alignment

We found that the actual memory size of an object is 8-byte aligned, so how exactly is it aligned? This brings us to the focus of this article: memory alignment.

Bytes occupied by each type

Here is a summary of the byte sizes of some basic data types on different systems, for easy reference.

Why memory alignment is needed

  • Memory is addressed in bytes, but the CPU accesses data in blocks rather than byte by byte. Frequent access to unaligned data greatly reduces CPU performance. When the data is aligned, the number of CPU accesses goes down. This is a space-for-time trade-off whose purpose is to reduce CPU overhead.
  • Because the CPU accesses memory in blocks, an unaligned value may start in one block and end in another. The pieces then have to be combined through extra operations, which reduces efficiency. Byte alignment improves the CPU's access speed.

Memory alignment principles

  1. Data member alignment rule: for the data members of a struct (or union), the first member is placed at offset 0, and every following member starts at an offset that is an integer multiple of that member's size, or of the size of its sub-members if the member has sub-members (such as an array or a structure). For example, an int is 4 bytes on a 32-bit machine, so it is stored starting at a multiple of 4.

  2. Structs as members: if a structure contains struct members, each struct member is stored starting at an integer multiple of the size of the largest element inside that struct member.

  3. Finishing off: the total size of a structure, that is, the result of sizeof, must be an integer multiple of the size of its largest internal member. Any shortfall must be padded.

The memory alignment principles are abstract, so let's work through examples to clear up the confusion.

Internal alignment of structures (no nesting)

An object is essentially a structure: the underlying implementation of an object is a struct. The memory alignment of an object can therefore be viewed as the internal alignment of a structure, except that the system additionally optimizes the memory layout of instantiated objects. The next examples explore the internal alignment of structures.

struct LWStruct1{
    double  a; // 8
    int     b; // 4
    short   c; // 2
    char    d; // 1
}LWStruct1;


struct LWStruct2{
    double  a; // 8
    char    d; // 1
    int     b; // 4
    short   c; // 2

}LWStruct2;
int main(int argc, char * argv[]) {
    @autoreleasepool {
        NSLog(@"-----%lu-----%lu",sizeof(LWStruct1),sizeof(LWStruct2));
    }
    return 0;
}
2021-06-08 15:02:54.392903+0800 alignStyle[74021:9798038] -----16-----24

The result shows that LWStruct1 and LWStruct2 contain the same variables and differ only in the order of those variables, yet their memory sizes differ. Why? This is the result of the internal alignment of the structure.

The following walks through how the memory size of LWStruct1 is computed from the memory alignment principles. In the notation min(m, n), m is the candidate starting offset and n is the size of the member. The member can be stored at m only when m is an integer multiple of n; otherwise m moves back until it is.

  • Variable a: occupies 8 bytes, the offset starts from 0, min(0, 8), so bytes 0 ~ 7 store a
  • Variable b: occupies 4 bytes, the offset starts from 8, min(8, 4), so bytes 8 ~ 11 store b
  • Variable c: occupies 2 bytes, the offset starts from 12, min(12, 2), so bytes 12 ~ 13 store c
  • Variable d: occupies 1 byte, the offset starts from 14, min(14, 1), so byte 14 stores d

The members of LWStruct1 therefore occupy 15 bytes. The largest variable in LWStruct1 is a, which takes 8 bytes, so the memory size of LWStruct1 must be an integer multiple of 8. 15 is not a multiple of 8, so it is rounded up and padded to 16 bytes. The final memory size of LWStruct1 is 16 bytes.

The analytical diagram of LWStruct1 is shown below

Detailed calculation of the memory size of LWStruct2

  • Variable a: occupies 8 bytes, the offset starts from 0, min(0, 8), so bytes 0 ~ 7 store a
  • Variable d: occupies 1 byte, the offset starts from 8, min(8, 1), so byte 8 stores d
  • Variable b: occupies 4 bytes, the offset starts from 9, min(9, 4). 9 % 4 != 0, so keep moving back until an offset divisible by 4 is found, which is 12, so bytes 12 ~ 15 store b
  • Variable c: occupies 2 bytes, the offset starts from 16, min(16, 2), so bytes 16 ~ 17 store c

The members of LWStruct2 therefore span 18 bytes, including the 3 padding bytes at offsets 9 ~ 11. The largest variable in LWStruct2 is a, which takes 8 bytes, so the memory size of LWStruct2 must be an integer multiple of 8. 18 is not a multiple of 8, so it is rounded up and padded to 24 bytes. The final memory size of LWStruct2 is 24 bytes.

The analytical figure of LWStruct2 is as follows

A nested structure within a structure

 struct LWStruct1{
    double  a; // 8
    int     b; // 4
    short   c; // 2
    char    d; // 1
}LWStruct1;


struct LWStruct2{
    double  a; // 8
    char    d; // 1
    int     b; // 4
    short   c; // 2
 
}LWStruct2;

struct LWStruct3{
    long    a; // 8
    int     b; // 4
    short   c; // 2
    char    d; // 1
    struct LWStruct2 lwStr;
}LWStruct3;

int main(int argc, char * argv[]) {
    @autoreleasepool {
      NSLog(@"-----%lu-----%lu----%lu",sizeof(LWStruct1),sizeof(LWStruct2),sizeof(LWStruct3));
    }
    return 0;
}
 
2021-06-08 16:28:40.819854+0800 alignStyle[74082:9819949] -----16-----24----40

Detailed calculation of the memory size of LWStruct3

  • Variable a: occupies 8 bytes, the offset starts from 0, min(0, 8), so bytes 0 ~ 7 store a
  • Variable b: occupies 4 bytes, the offset starts from 8, min(8, 4), so bytes 8 ~ 11 store b
  • Variable c: occupies 2 bytes, the offset starts from 12, min(12, 2), so bytes 12 ~ 13 store c
  • Variable d: occupies 1 byte, the offset starts from 14, min(14, 1), so byte 14 stores d
  • Variable lwStr: lwStr is a structure variable. By the principle that structure members are stored starting from an integer multiple of the size of their largest internal element, and since the largest variable in LWStruct2 takes 8 bytes, the offset moves from 15 to 16. The members of LWStruct2 span 18 bytes, but as a member it occupies its full sizeof, which is 24 bytes including the tail padding. min(16, 24), so bytes 16 ~ 39 store lwStr

The result is 40 bytes in total. The largest elements in LWStruct3 are a and the largest element inside lwStr, both 8 bytes, so the memory size of LWStruct3 must be an integer multiple of 8. 40 already is a multiple of 8, so no further padding is needed. The final memory size of LWStruct3 is 40 bytes.

The analytical figure of LWStruct3 is as follows

Memory optimization

The memory size of a structure can vary with the order in which its variables are declared. Does this also apply to the memory alignment of an instantiated object? Let's look into it.

LWPerson * person = [LWPerson alloc];
person.a = 100.0;
person.b = 'a';
person.c = 10;
person.d = 2;
NSLog(@"----%lu",class_getInstanceSize([person class]));
2021-06-08 17:11:27.743037+0800 alignStyle[74104:9830701] ----24

The custom variables in LWPerson are declared in exactly the same order as in LWStruct2, and the object additionally carries an isa pointer that takes 8 bytes. So the custom variables in LWPerson take only 16 bytes, while LWStruct2 takes 24. That is odd, because the order and the types are all the same. Why is that? This is the memory optimization discussed next (a smart system would not waste that much memory, so of course it optimizes). Let's see how the optimization works.

char b, int c, and short d share a single 8-byte memory slot, so the object's properties or variables are stored in a different order from the structure. The system has optimized the memory layout.

Conclusion

Memory alignment follows a set of rules whose purpose is to improve CPU access efficiency and keep access safe. Byte alignment may waste some memory, but memory optimization keeps that waste as small as possible, so access stays fast while little memory is wasted.

Supplement

Exploring class_getInstanceSize

size_t class_getInstanceSize(Class cls)
{
    if (!cls) return 0;
    return cls->alignedInstanceSize();
}

Step into alignedInstanceSize

 uint32_t alignedInstanceSize() const {
        return word_align(unalignedInstanceSize());
    }

Step into word_align

#ifdef __LP64__
#   define WORD_SHIFT 3UL
#   define WORD_MASK 7UL
#   define WORD_BITS 64
#else
#   define WORD_SHIFT 2UL
#   define WORD_MASK 3UL
#   define WORD_BITS 32
#endif

static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}

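Conclusion: word_align is the familiar algorithm quoted in the results analysis. With WORD_MASK equal to 7 on 64-bit systems, it rounds the size up to a multiple of 8, so nothing more needs to be said.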

Exploring malloc_size

We would like to step into malloc_size and see how it works, but jumping to its definition only leads to what the picture below shows.

The implementation of malloc_size is not provided there. It lives in libsystem_malloc, so we explore the allocation flow in the malloc source instead, starting from calloc.

void *p = calloc(1, 40);

Step into calloc

calloc(size_t num_items, size_t size)
{
	return _malloc_zone_calloc(default_zone, num_items, size, MZ_POSIX);
}

Step into _malloc_zone_calloc

_malloc_zone_calloc(malloc_zone_t *zone, size_t num_items, size_t size,
		malloc_zone_options_t mzo)
{
	MALLOC_TRACE(TRACE_calloc | DBG_FUNC_START, (uintptr_t)zone, num_items, size, 0);

	void *ptr;
	if (malloc_check_start) {
		internal_check();
	}
	ptr = zone->calloc(zone, num_items, size);

	if (os_unlikely(malloc_logger)) {
		malloc_logger(MALLOC_LOG_TYPE_ALLOCATE | MALLOC_LOG_TYPE_HAS_ZONE | MALLOC_LOG_TYPE_CLEARED,
				(uintptr_t)zone, (uintptr_t)(num_items * size), 0, (uintptr_t)ptr, 0);
	}

	MALLOC_TRACE(TRACE_calloc | DBG_FUNC_END, (uintptr_t)zone, num_items, size, (uintptr_t)ptr);
	if (os_unlikely(ptr == NULL)) {
		malloc_set_errno_fast(mzo, ENOMEM);
	}
	return ptr;
}

To step into zone->calloc I follow the call in the assembly, so just look at the assembly.

Search globally for default_zone_calloc

default_zone_calloc(malloc_zone_t *zone, size_t num_items, size_t size)
{
	zone = runtime_default_zone();
	
	return zone->calloc(zone, num_items, size);
}

This ends in another zone->calloc call, so follow it in the assembly in the same way.

Search globally for nano_calloc

nano_calloc(nanozone_t *nanozone, size_t num_items, size_t size)
{
	size_t total_bytes;

	if (calloc_get_size(num_items, size, 0, &total_bytes)) {
		return NULL;
	}

	if (total_bytes <= NANO_MAX_SIZE) {
		void *p = _nano_malloc_check_clear(nanozone, total_bytes, 1);
		if (p) {
			return p;
		} else {
			/* FALLTHROUGH to helper zone */
		}
	}
	malloc_zone_t *zone = (malloc_zone_t *)(nanozone->helper_zone);
	return zone->calloc(zone, 1, total_bytes);
}

Step into _nano_malloc_check_clear and follow only the size_t values, because we are only interested in the size of the allocated memory.

Step into segregated_size_to_fit

#define SHIFT_NANO_QUANTUM		4
#define NANO_REGIME_QUANTA_SIZE	(1 << SHIFT_NANO_QUANTUM)	// 16

static MALLOC_INLINE size_t
segregated_size_to_fit(nanozone_t *nanozone, size_t size, size_t *pKey)
{
	size_t k, slot_bytes;

	if (0 == size) {
		size = NANO_REGIME_QUANTA_SIZE; // Historical behavior
	}
	// k = (size + 16 - 1) >> 4
	k = (size + NANO_REGIME_QUANTA_SIZE - 1) >> SHIFT_NANO_QUANTUM; // round up and shift for number of quanta
	// slot_bytes = k << 4
	slot_bytes = k << SHIFT_NANO_QUANTUM; // multiply by power of two quanta size
	*pKey = k - 1; // Zero-based!

	return slot_bytes;
}

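Summary: shifting right by 4 (k = (size + 15) >> 4) and then left by 4 (k << 4) rounds the size up to a multiple of 16, which is the 16-byte alignment. Flowchart: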