Preface
Memory alignment is a term we hear all the time, but what it actually means often stays vague. This article explores memory alignment and clears up the mystery. Practice is the best test of a claim, so we start by printing three values: the size of the object's type, the actual memory size of the object, and the memory size allocated by the system. The code and its output are as follows:
#import <objc/runtime.h>   // class_getInstanceSize
#import <malloc/malloc.h>  // malloc_size
LWPerson *person = [LWPerson alloc];
person.name = @"person";
person.age = 18;
LWPerson *newPerson; // declared only, never allocated
NSLog(@" memory sizeof object type --%lu", sizeof(person));
NSLog(@" actual memory size of object --%lu", class_getInstanceSize([person class]));
NSLog(@" system allocated memory size --%lu", malloc_size((__bridge const void *)(person)));
NSLog(@"==================");
NSLog(@" memory sizeof object type --%lu", sizeof(newPerson));
NSLog(@" actual memory size of object --%lu", class_getInstanceSize([newPerson class]));
NSLog(@" system allocated memory size --%lu", malloc_size((__bridge const void *)(newPerson)));
2021-06-08 11:16:28.097465+0800 alignStyle[73542:9731629]  memory sizeof object type --8
2021-06-08 11:16:28.097520+0800 alignStyle[73542:9731629]  actual memory size of object --24
2021-06-08 11:16:28.097562+0800 alignStyle[73542:9731629]  system allocated memory size --32
2021-06-08 11:16:28.097583+0800 alignStyle[73542:9731629] ==================
2021-06-08 11:16:28.097607+0800 alignStyle[73542:9731629]  memory sizeof object type --8
2021-06-08 11:16:28.097629+0800 alignStyle[73542:9731629]  actual memory size of object --0
2021-06-08 11:16:28.097649+0800 alignStyle[73542:9731629]  system allocated memory size --0
Results analysis:
- sizeof: the memory size of the object's type. sizeof is an operator, not a function. It computes the size of the type of its operand, and the result is fixed at compile time. sizeof(person) and sizeof(newPerson) are therefore both 8 bytes, because both are pointers to a structure.
- class_getInstanceSize: the actual memory size of the object, determined by the sizes of the class's member variables. Strictly speaking it is not the raw sum of those sizes, because the result is rounded up to 8-byte alignment. The core algorithm is #define WORD_MASK 7UL together with (x + WORD_MASK) & ~WORD_MASK. That is why person reports 24 rather than 20. newPerson is only a declared variable with no memory allocated, so its size is 0.
- malloc_size: the memory size allocated by the system, which uses 16-byte alignment. The allocation size is always a multiple of 16, and the system pads any shortfall automatically. Note that the 16-byte alignment is applied to the actual memory size, that is, after the 8-byte alignment.
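The two rounding steps in this analysis can be sketched in plain C. This is a minimal sketch, not the runtime's own code: `ALIGN_UP`, `instance_size` and `allocated_size` are hypothetical names chosen here for illustration, and the 20-byte input is the unaligned size from the example above.

```c
#include <stddef.h>

/* Round x up to the next multiple of align; align must be a power of two.
 * With align == 8 this is the same (x + 7) & ~7 formula quoted above. */
#define ALIGN_UP(x, align) \
    (((size_t)(x) + ((size_t)(align) - 1)) & ~((size_t)(align) - 1))

/* Hypothetical helper: size reported for the object after 8-byte alignment. */
static inline size_t instance_size(size_t member_bytes)
{
    return ALIGN_UP(member_bytes, 8);
}

/* Hypothetical helper: size handed out by the allocator, i.e. the
 * 8-byte-aligned size rounded up again to a multiple of 16. */
static inline size_t allocated_size(size_t member_bytes)
{
    return ALIGN_UP(instance_size(member_bytes), 16);
}
```

With 20 bytes of members, `instance_size(20)` gives 24 and `allocated_size(20)` gives 32, which matches the values printed for person.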
Questions: what do class_getInstanceSize and malloc_size do under the hood? How do we know that class_getInstanceSize aligns to 8 bytes and malloc_size aligns to 16 bytes? The exploration of both functions is placed at the end of the article.
Memory alignment
We found that the actual memory size of an object is 8-byte aligned. How exactly is that alignment performed? This brings us to the main topic: memory alignment.
Bytes occupied by each type
The byte sizes of some basic data types on different systems are collected here for easy reference.
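Because the exact sizes depend on the compiler and the target, one way to check them on a given machine is simply to print them. A minimal sketch; `print_type_sizes` is a hypothetical helper name, and only `sizeof(char) == 1` is guaranteed by the C standard.

```c
#include <stdio.h>

/* Prints the size, in bytes, that the current compiler and target assign
 * to each basic type. Only sizeof(char) == 1 is fixed by the C standard. */
static void print_type_sizes(void)
{
    printf("char        %zu\n", sizeof(char));
    printf("short       %zu\n", sizeof(short));
    printf("int         %zu\n", sizeof(int));
    printf("long        %zu\n", sizeof(long));
    printf("long long   %zu\n", sizeof(long long));
    printf("float       %zu\n", sizeof(float));
    printf("double      %zu\n", sizeof(double));
    printf("void *      %zu\n", sizeof(void *));
}
```

On 64-bit Apple platforms `long` and pointers are typically 8 bytes, while on 32-bit targets they are typically 4, which is why the rest of the article assumes a 64-bit build.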
Why memory alignment is needed
- Memory is addressed in bytes, but the CPU accesses data in blocks rather than one byte at a time. Frequently accessing unaligned data greatly reduces CPU performance. Aligned data reduces the number of accesses the CPU needs; it trades space for time in order to lower CPU overhead.
- Because the CPU accesses memory in blocks, an unaligned value may start in one block and end in another. The pieces then have to be combined through extra operations, which lowers efficiency. Byte alignment improves the CPU's access speed.
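To make the block argument concrete, the sketch below counts how many fixed-size blocks a single access touches. It is a simplified model that assumes memory is fetched in aligned blocks of `block` bytes; `blocks_touched` is a hypothetical helper, and real hardware adds caches and other details that this ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Number of aligned blocks of `block` bytes covered by an access of
 * `size` bytes starting at byte `offset` (simplified model). */
static size_t blocks_touched(size_t offset, size_t size, size_t block)
{
    assert(size > 0 && block > 0);
    size_t first = offset / block;              /* block holding the first byte */
    size_t last = (offset + size - 1) / block;  /* block holding the last byte  */
    return last - first + 1;
}
```

With 8-byte blocks, an 8-byte value at offset 8 lies in one block, while the same value at offset 5 spans two, so in this model the aligned placement needs half as many fetches.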
Memory alignment principles
- Data member alignment: the first data member of a struct (or union) is placed at offset 0. Each subsequent member starts at an offset that is an integer multiple of that member's size or, if the member has sub-members (an array, a struct, and so on), of the size of its sub-members. For example, an int is 4 bytes on a 32-bit machine, so it is stored starting from a multiple of 4.
- Structs as members: if a struct contains struct members, each struct member starts at an offset that is an integer multiple of the size of the largest element inside that struct member.
- Finishing off: the total size of a struct, that is, the result of sizeof, must be an integer multiple of the size of its largest internal member. Any shortfall is padded.
These rules are abstract, so the examples below work through them.
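Before the worked examples, the data member rule and the finishing-off rule can be sketched as a small routine. This is a simplified sketch, assuming every member is a scalar whose alignment equals its size (true for char, short, int and double on common 64-bit targets). It does not handle nested structs or arrays, and `struct_size` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Computes the total size of a struct from the sizes of its scalar
 * members, listed in declaration order. */
static size_t struct_size(const size_t *member_sizes, size_t count)
{
    size_t offset = 0;   /* next free byte */
    size_t largest = 1;  /* size of the largest member seen so far */

    for (size_t i = 0; i < count; i++) {
        size_t n = member_sizes[i];
        assert(n > 0);
        /* Data member rule: start at the next multiple of the member size. */
        offset = (offset + n - 1) / n * n;
        offset += n;
        if (n > largest) {
            largest = n;
        }
    }
    /* Finishing off: pad the total to a multiple of the largest member. */
    return (offset + largest - 1) / largest * largest;
}
```

For member sizes {8, 4, 2, 1} this returns 16, and for the same members ordered {8, 1, 4, 2} it returns 24, so the order alone changes the result.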
Internal alignment of structures (no nesting)
An object is essentially a structure: the underlying implementation of an object is a struct. Memory alignment can therefore be understood as alignment inside a structure, except that the system additionally optimizes the memory layout of instantiated objects. The next example explores the internal alignment of structures.
struct LWStruct1 {
    double a; // 8
    int b;    // 4
    short c;  // 2
    char d;   // 1
} LWStruct1;
struct LWStruct2 {
    double a; // 8
    char d;   // 1
    int b;    // 4
    short c;  // 2
} LWStruct2;
int main(int argc, char * argv[]) {
    @autoreleasepool {
        NSLog(@"-----%lu-----%lu", sizeof(LWStruct1), sizeof(LWStruct2));
    }
    return 0;
}
2021-06-08 15:02:54.392903+0800 alignStyle[74021:9798038] -----16-----24
The output shows that LWStruct1 and LWStruct2 contain the same variables and differ only in their order, yet their sizes differ. Why? This is the effect of internal alignment in a structure.
The following works through the size of LWStruct1 step by step using the alignment rules. In the notation min(m, n), m is the current starting position and n is the size of the member; the member is stored at the first offset at or after m that is divisible by n.
- Variable a: occupies 8 bytes, offset starts at 0, min(0, 8), so a is stored in 0 ~ 7.
- Variable b: occupies 4 bytes, offset starts at 8, min(8, 4), so b is stored in 8 ~ 11.
- Variable c: occupies 2 bytes, offset starts at 12, min(12, 2), so c is stored in 12 ~ 13.
- Variable d: occupies 1 byte, offset starts at 14, min(14, 1), so d is stored in 14.
The members of LWStruct1 therefore occupy 15 bytes. The largest member is a, at 8 bytes, so the size of LWStruct1 must be an integer multiple of 8. 15 is not, so it is rounded up and padded to 16 bytes. The final size of LWStruct1 is 16 bytes.
The analytical diagram of LWStruct1 is shown below
Detailed size calculation for LWStruct2
- Variable a: occupies 8 bytes, offset starts at 0, min(0, 8), so a is stored in 0 ~ 7.
- Variable d: occupies 1 byte, offset starts at 8, min(8, 1), so d is stored in 8.
- Variable b: occupies 4 bytes, offset starts at 9, min(9, 4). 9 % 4 != 0, so move forward until an offset divisible by 4 is found, which is 12, so b is stored in 12 ~ 15.
- Variable c: occupies 2 bytes, offset starts at 16, min(16, 2), so c is stored in 16 ~ 17.
The members of LWStruct2 therefore occupy 18 bytes. The largest member is a, at 8 bytes, so the size of LWStruct2 must be an integer multiple of 8. 18 is not, so it is rounded up and padded to 24 bytes. The final size of LWStruct2 is 24 bytes.
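The hand-computed offsets can be checked by the compiler itself. A minimal sketch, assuming a 64-bit target where double is 8 bytes and 8-byte aligned; the struct tags mirror the ones above without declaring the variables, and the file only compiles if every assertion holds.

```c
#include <stddef.h>

struct LWStruct1 { double a; int b; short c; char d; };
struct LWStruct2 { double a; char d; int b; short c; };

/* LWStruct1: members are packed back to back, 1 byte of tail padding. */
_Static_assert(offsetof(struct LWStruct1, b) == 8,  "b is stored from 8");
_Static_assert(offsetof(struct LWStruct1, c) == 12, "c is stored from 12");
_Static_assert(offsetof(struct LWStruct1, d) == 14, "d is stored at 14");
_Static_assert(sizeof(struct LWStruct1) == 16,      "15 rounded up to 16");

/* LWStruct2: 3 bytes of padding before b, 6 bytes of tail padding. */
_Static_assert(offsetof(struct LWStruct2, d) == 8,  "d is stored at 8");
_Static_assert(offsetof(struct LWStruct2, b) == 12, "b skips 9..11");
_Static_assert(offsetof(struct LWStruct2, c) == 16, "c is stored from 16");
_Static_assert(sizeof(struct LWStruct2) == 24,      "18 rounded up to 24");
```

Using `offsetof` rather than printing addresses keeps the check independent of any particular instance of the struct.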
The analytical figure of LWStruct2 is as follows
A nested structure within a structure
struct LWStruct1 {
    double a; // 8
    int b;    // 4
    short c;  // 2
    char d;   // 1
} LWStruct1;
struct LWStruct2 {
    double a; // 8
    char d;   // 1
    int b;    // 4
    short c;  // 2
} LWStruct2;
struct LWStruct3 {
    long a;   // 8
    int b;    // 4
    short c;  // 2
    char d;   // 1
    struct LWStruct2 lwStr;
} LWStruct3;
int main(int argc, char * argv[]) {
    @autoreleasepool {
        NSLog(@"-----%lu-----%lu----%lu", sizeof(LWStruct1), sizeof(LWStruct2), sizeof(LWStruct3));
    }
    return 0;
}
2021-06-08 16:28:40.819854+0800 alignStyle[74082:9819949] -----16-----24----40
Detailed size calculation for LWStruct3
- Variable a: occupies 8 bytes, offset starts at 0, min(0, 8), so a is stored in 0 ~ 7.
- Variable b: occupies 4 bytes, offset starts at 8, min(8, 4), so b is stored in 8 ~ 11.
- Variable c: occupies 2 bytes, offset starts at 12, min(12, 2), so c is stored in 12 ~ 13.
- Variable d: occupies 1 byte, offset starts at 14, min(14, 1), so d is stored in 14.
- Variable lwStr: lwStr is a struct variable. By the alignment rules, a struct member is stored from an integer multiple of the size of its largest internal element. The largest member of LWStruct2 is 8 bytes and the next free offset is 15, min(15, 8). 15 % 8 != 0, so move forward to 16. As a member, lwStr occupies the full sizeof(LWStruct2), which is 24 bytes including its tail padding (its members alone use 18). lwStr is therefore stored in 16 ~ 39.
The members of LWStruct3 therefore occupy 40 bytes. The largest elements are a and the largest element inside lwStr, both 8 bytes, so the size of LWStruct3 must be an integer multiple of 8. 40 already is, so no further padding is needed. The final size of LWStruct3 is 40 bytes.
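The nested layout can be confirmed with compile-time checks. A minimal sketch, assuming a 64-bit target where long and double are 8 bytes; the point it illustrates is that a nested struct contributes its full sizeof, tail padding included, to the enclosing struct.

```c
#include <stddef.h>

struct LWStruct2 { double a; char d; int b; short c; };
struct LWStruct3 { long a; int b; short c; char d; struct LWStruct2 lwStr; };

/* The nested struct is 24 bytes as a member, not the 18 its fields use. */
_Static_assert(sizeof(struct LWStruct2) == 24,           "LWStruct2 is 24 bytes");
/* d ends at byte 14; lwStr starts at the next multiple of 8. */
_Static_assert(offsetof(struct LWStruct3, d) == 14,      "d is stored at 14");
_Static_assert(offsetof(struct LWStruct3, lwStr) == 16,  "lwStr starts at 16");
/* 16 + 24 = 40, already a multiple of 8, so no extra tail padding. */
_Static_assert(sizeof(struct LWStruct3) == 40,           "LWStruct3 is 40 bytes");
```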
The analytical figure of LWStruct3 is as follows
Memory optimization
The size of a structure can vary with the order of its members. Does the same apply to the memory layout of an instantiated object? Let's find out.
LWPerson *person = [LWPerson alloc];
person.a = 100.0;
person.b = 'a';
person.c = 10;
person.d = 2;
NSLog(@"----%lu", class_getInstanceSize([person class]));
2021-06-08 17:11:27.743037+0800 alignStyle[74104:9830701] ----24
The custom variables in LWPerson are declared in exactly the same order as in LWStruct2, and the object additionally carries an isa pointer that takes 8 bytes. With an instance size of 24, the custom variables take only 16 bytes, whereas LWStruct2 needs 24 bytes for the same members in the same order. Why? This is where memory optimization comes in: the system would otherwise waste a lot of memory on padding, so it optimizes the layout. Let's see how.
char b, int c and short d share a single 8-byte slot. The object's properties or variables are therefore stored in a different order than the members of the structure: the system has optimized the memory layout.
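The effect of sharing one 8-byte slot can be reproduced with two hand-written C structs. This is only an analogue: `DeclaredOrder` and `PackedOrder` are hypothetical names, the `isa` field stands in for the object's isa pointer, and the member order in `PackedOrder` is an assumption made for illustration, not a layout read from the runtime. A 64-bit target is assumed.

```c
#include <stddef.h>

/* isa followed by the members in declaration order:
 * 3 bytes of padding after b and 6 bytes of tail padding after d. */
struct DeclaredOrder { void *isa; double a; char b; int c; short d; };

/* Same members, largest first, so c, d and b fit in one 8-byte slot. */
struct PackedOrder { void *isa; double a; int c; short d; char b; };

_Static_assert(sizeof(struct DeclaredOrder) == 32, "declaration order wastes a slot");
_Static_assert(offsetof(struct PackedOrder, c) == 16, "c opens the shared slot");
_Static_assert(offsetof(struct PackedOrder, b) == 22, "b closes the shared slot");
_Static_assert(sizeof(struct PackedOrder) == 24, "reordered layout matches the 24 above");
```

Sorting members from largest to smallest is a common way to minimize padding by hand in plain C, where the compiler is not allowed to reorder struct members.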
Conclusion
Memory alignment follows a set of rules designed to make CPU access efficient and safe. Alignment can waste memory, but memory optimization keeps that waste as small as possible, preserving access speed while reducing wasted memory.
Supplement
Exploring class_getInstanceSize
size_t class_getInstanceSize(Class cls)
{
    if (!cls) return 0;
    return cls->alignedInstanceSize();
}
Step into alignedInstanceSize
uint32_t alignedInstanceSize() const {
    return word_align(unalignedInstanceSize());
}
Step into word_align
#ifdef __LP64__
#   define WORD_SHIFT 3UL
#   define WORD_MASK 7UL
#   define WORD_BITS 64
#else
#   define WORD_SHIFT 2UL
#   define WORD_MASK 3UL
#   define WORD_BITS 32
#endif
static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}
Conclusion: word_align is the familiar 8-byte alignment formula from the results analysis, so it needs no further explanation. The if (!cls) return 0 check also explains why newPerson reported 0.
Exploring malloc_size
We would like to step into malloc_size to see how it works, but clicking through only leads to what is shown in the picture below.
The implementation of malloc_size is not provided there; it lives in libsystem_malloc. We therefore turn to the libsystem_malloc source and trace calloc, the function that actually allocates the memory:
void *p = calloc(1, 40);
Step into calloc
void *
calloc(size_t num_items, size_t size)
{
    return _malloc_zone_calloc(default_zone, num_items, size, MZ_POSIX);
}
Step into _malloc_zone_calloc
void *
_malloc_zone_calloc(malloc_zone_t *zone, size_t num_items, size_t size,
        malloc_zone_options_t mzo)
{
    MALLOC_TRACE(TRACE_calloc | DBG_FUNC_START, (uintptr_t)zone, num_items, size, 0);
    void *ptr;
    if (malloc_check_start) {
        internal_check();
    }
    ptr = zone->calloc(zone, num_items, size);
    if (os_unlikely(malloc_logger)) {
        malloc_logger(MALLOC_LOG_TYPE_ALLOCATE | MALLOC_LOG_TYPE_HAS_ZONE | MALLOC_LOG_TYPE_CLEARED,
                (uintptr_t)zone, (uintptr_t)(num_items * size), 0, (uintptr_t)ptr, 0);
    }
    MALLOC_TRACE(TRACE_calloc | DBG_FUNC_END, (uintptr_t)zone, num_items, size, (uintptr_t)ptr);
    if (os_unlikely(ptr == NULL)) {
        malloc_set_errno_fast(mzo, ENOMEM);
    }
    return ptr;
}
Step into zone->calloc. Jumping to the definition does not help here, so I step through the assembly instead and read the target from there.
Search globally for default_zone_calloc
void *
default_zone_calloc(malloc_zone_t *zone, size_t num_items, size_t size)
{
    zone = runtime_default_zone();
    return zone->calloc(zone, num_items, size);
}
Stepping into this zone->calloc in the same way leads to nano_calloc.
Search globally for nano_calloc
void *
nano_calloc(nanozone_t *nanozone, size_t num_items, size_t size)
{
    size_t total_bytes;
    if (calloc_get_size(num_items, size, 0, &total_bytes)) {
        return NULL;
    }
    if (total_bytes <= NANO_MAX_SIZE) {
        void *p = _nano_malloc_check_clear(nanozone, total_bytes, 1);
        if (p) {
            return p;
        } else {
            /* FALLTHROUGH to helper zone */
        }
    }
    malloc_zone_t *zone = (malloc_zone_t *)(nanozone->helper_zone);
    return zone->calloc(zone, 1, total_bytes);
}
Step into _nano_malloc_check_clear and follow the size_t argument, because we only care about the size of the allocation.
Step into segregated_size_to_fit
#define SHIFT_NANO_QUANTUM 4
#define NANO_REGIME_QUANTA_SIZE (1 << SHIFT_NANO_QUANTUM) // 16
static MALLOC_INLINE size_t
segregated_size_to_fit(nanozone_t *nanozone, size_t size, size_t *pKey)
{
    size_t k, slot_bytes;
    if (0 == size) {
        size = NANO_REGIME_QUANTA_SIZE; // Historical behavior
    }
    // k = (size + 16 - 1) >> 4
    k = (size + NANO_REGIME_QUANTA_SIZE - 1) >> SHIFT_NANO_QUANTUM; // round up and shift for number of quanta
    // slot_bytes = k << 4
    slot_bytes = k << SHIFT_NANO_QUANTUM; // multiply by power of two quanta size
    *pKey = k - 1; // Zero-based!
    return slot_bytes;
}
Summary: shifting right by 4 and then left by 4, that is, k = (size + 15) >> 4 followed by k << 4, rounds the size up to a multiple of 16. This is the 16-byte alignment. The flowchart is as follows.
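The shift arithmetic can be tried outside the allocator with a standalone re-implementation. This is a sketch of the same rounding only, not the library's function: `slot_bytes_for`, `QUANTUM_SHIFT` and `QUANTUM_SIZE` are hypothetical names, and the zone argument is dropped because it plays no part in the size calculation.

```c
#include <assert.h>
#include <stddef.h>

#define QUANTUM_SHIFT 4                             /* 2^4 = 16-byte quantum */
#define QUANTUM_SIZE ((size_t)1 << QUANTUM_SHIFT)

/* Rounds a request up to a whole number of 16-byte quanta and stores the
 * zero-based slot index in *key. */
static size_t slot_bytes_for(size_t size, size_t *key)
{
    assert(key != NULL);
    if (size == 0) {
        size = QUANTUM_SIZE;  /* a zero-byte request still gets one quantum */
    }
    size_t k = (size + QUANTUM_SIZE - 1) >> QUANTUM_SHIFT;  /* quanta, rounded up */
    *key = k - 1;                                           /* zero-based index */
    return k << QUANTUM_SHIFT;                              /* back to bytes */
}
```

For the `calloc(1, 40)` call traced above, a 40-byte request yields 48 bytes, and the 24-byte person object from the start of the article yields 32, matching the value printed by malloc_size.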