Each compiler on a given platform has its own default “alignment coefficient” (also known as the alignment modulus). Programmers can change it with the preprocessor directive #pragma pack(n), where n = 1, 2, 4, 8, or 16 is the alignment coefficient you want to specify.

Memory alignment rules

  1. Data member alignment: the first data member of a struct (or union) is placed at offset 0. Each subsequent member is stored at an offset that is an integer multiple of min(size of the member, alignment coefficient of the current platform). If the member is itself an aggregate, such as an array or a struct, the size of its largest element is used instead of the member's total size.
  2. Structs as members: if a struct contains struct members, each struct member starts at an offset that is an integer multiple of min(size of the largest element inside that sub-struct, alignment coefficient of the current platform).
  3. Finishing off: the total size of the struct, i.e. the result of sizeof, must be an integer multiple of min(size of its largest member or sub-member, alignment coefficient of the current platform).

#pragma pack(8)
struct ZHYStruct1 {
    double b;
    int c;                  // 4 is a multiple of 4
                            // 1 is a multiple of 1
} MyStruct1;

struct ZHYStruct2 {
    char a;                 // 1   0
    short b;                // 2   2-3
    int c;                  // 4   4-7
    double d;               // 8   8-15
} MyStruct2;

struct ZHYStruct3 {
    char a;                 // 1   0
    char b;
    struct ZHYStruct1 c;    // 16 is a multiple of 8
                            // 8 is a multiple of 8
    // long long f;
} MyStruct3;

struct ZHYStruct4 {
    char a;                 // 1   0
    double d;               // 8   8-15
    short b;                // 2   16-17
    int c;                  // 4   20-23
} MyStruct4;

#pragma pack(1)
struct ZHYStruct5 {
    char a;                 // 1   0
    double d;               // 8   1-8
    short b;                // 2   9-10
    int c;                  // 4   11-14
} MyStruct5;

#pragma pack(4)
struct ZHYStruct6 {
    char a;                 // 1   0
    double d;               // 8   4-11
    short b;                // 2   12-13
    int c;                  // 4   16-19
} MyStruct6;

The sizeof results are:

MyStruct1---16
MyStruct2---16
MyStruct3---48
MyStruct4---24
MyStruct5---15
MyStruct6---20
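The offsets and sizes of the last three structs can be checked mechanically. The following is a minimal sketch in C, assuming a compiler that supports #pragma pack (GCC, Clang, and MSVC do) and a platform where double is 8 bytes with 8-byte natural alignment. The names Packed8, Packed1, Packed4, and report_layouts are made up for this illustration; the three structs have the same members as ZHYStruct4, ZHYStruct5, and ZHYStruct6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same members, three different alignment coefficients. */
#pragma pack(push, 8)
struct Packed8 { char a; double d; short b; int c; };
#pragma pack(pop)

#pragma pack(push, 1)
struct Packed1 { char a; double d; short b; int c; };
#pragma pack(pop)

#pragma pack(push, 4)
struct Packed4 { char a; double d; short b; int c; };
#pragma pack(pop)

/* Print each member's offset and the total size of one struct type. */
#define REPORT(T) \
    printf("%-8s a=%zu d=%zu b=%zu c=%zu size=%zu\n", #T, \
           offsetof(struct T, a), offsetof(struct T, d), \
           offsetof(struct T, b), offsetof(struct T, c), sizeof(struct T))

void report_layouts(void) {
    /* With pack(1) every member follows the previous one with no padding. */
    assert(offsetof(struct Packed1, d) == 1);

    REPORT(Packed8);
    REPORT(Packed1);
    REPORT(Packed4);
}
```

Calling report_layouts() from main prints one line per struct. Using push and pop rather than a bare #pragma pack(n) keeps each setting from leaking into later declarations.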

Memory alignment causes

After memory alignment, you can see that the size has increased. So why align memory? The CPU does not access memory one byte at a time but in blocks of 2, 4, 8, 16, or 32 bytes. Each memory access incurs a fixed overhead, so reducing the number of accesses improves the performance of the program. These access units, or block sizes, are called the memory access granularity. Without memory alignment, the CPU has to do considerably more work for the same access.

To illustrate the principle behind memory alignment, consider the difference between reading data from an aligned address and from an unaligned one. The example is simple: with an access granularity of 4 bytes, 4 bytes are read into a register from address 0, and then 4 bytes are read into a register from address 1.

Reading from address 0 is an aligned read, and it completes in a single access. Reading from address 1 is an unaligned read, and it takes two accesses to complete.

After the two reads, the block covering addresses 0 to 3 is shifted to discard its first byte, and the block covering addresses 4 to 7 is shifted to discard its last 3 bytes. Finally, the two pieces are merged into one register.

Doing this much extra work for a single unaligned access is costly and can noticeably degrade performance.
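The aligned and unaligned reads described above can be simulated in a few lines of C. This is only a sketch of the idea, not a model of any real CPU: memory is a byte array, fetch_block is a hypothetical helper that stands in for one 4-byte aligned access, and bytes are assembled in little-endian order. The caller must make sure the array extends to the end of the second block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULARITY 4u

/* Fetch one aligned 4-byte block and count the access.
   Bytes are assembled in little-endian order. */
static uint32_t fetch_block(const uint8_t *mem, size_t block_addr, int *accesses) {
    assert(block_addr % GRANULARITY == 0);
    (*accesses)++;
    return (uint32_t)mem[block_addr]
         | (uint32_t)mem[block_addr + 1] << 8
         | (uint32_t)mem[block_addr + 2] << 16
         | (uint32_t)mem[block_addr + 3] << 24;
}

/* Read 4 bytes starting at addr using only aligned block fetches. */
uint32_t read_u32(const uint8_t *mem, size_t addr, int *accesses) {
    size_t skip = addr % GRANULARITY;   /* bytes to discard from the first block */
    size_t base = addr - skip;          /* aligned address of the first block */

    uint32_t low = fetch_block(mem, base, accesses);
    if (skip == 0) {
        return low;                     /* aligned: one access is enough */
    }

    /* Unaligned: fetch the next block, shift both pieces, and merge them. */
    uint32_t high = fetch_block(mem, base + GRANULARITY, accesses);
    return (low >> (8 * skip)) | (high << (8 * (GRANULARITY - skip)));
}
```

With an 8-byte array holding the values 0 through 7, reading at address 0 costs one access and reading at address 1 costs two, plus two shifts and an OR.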

iOS optimization for memory alignment

First declare a class as follows:

@interface ZYPerson : NSObject

@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) NSString *nickName;
@property (nonatomic, copy) NSString *hobby;

@property (nonatomic, assign) long height;

@property (nonatomic) char c1;
@property (nonatomic) char c2;
@property (nonatomic) char c3;

@property (nonatomic, assign) int age1;
@property (nonatomic, assign) int age2;
@property (nonatomic, assign) short age3;
@end

Instantiate in main:

ZYPerson *person = [ZYPerson alloc];
person.name      = @"Cooci";
person.nickName  = @"KC";
person.age1      = 18;
person.age2      = 19;
person.age3      = 20;
person.c1        = 'a';
person.c2        = 'b';
person.c3        = 'c';
person.height    = 100;
person.hobby     = @"hobby";
        
NSLog(@"%lu-%lu-%lu", sizeof(person), class_getInstanceSize(ZYPerson.class), malloc_size((__bridge const void*)person));

The printed result is as follows:

8-56-64

  1. sizeof is a C/C++ operator that returns the number of bytes a type or object occupies. It is evaluated at compile time (except for C99 variable-length arrays), so it can be used in constant expressions.

    In this case, because person is a pointer, sizeof(person) returns 8.

  2. class_getInstanceSize

/**
 * Returns the size of instances of a class.
 *
 * @param cls A class object.
 *
 * @return The size in bytes of instances of the class \e cls, or \c 0 if \e cls is \c Nil.
 */
size_t class_getInstanceSize(Class cls)
{
    if (!cls) return 0;
    return cls->alignedInstanceSize();
}

// May be unaligned depending on class's ivars.
uint32_t unalignedInstanceSize() const {
    ASSERT(isRealized());
    return data()->ro()->instanceSize;
}

// Class's ivar size rounded up to a pointer-size boundary.
uint32_t alignedInstanceSize() const {
    return word_align(unalignedInstanceSize());
}

static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}

// Under __LP64__:
#define WORD_MASK 7UL

This function returns the instance size rounded up to a multiple of 8 bytes; in this case it returns 56. Why 56? Printing the memory of the person instance in LLDB gives the following:

(lldb) x/8gx person
0x60000132c640: 0x000000010cdd49a8 0x0000001400636261
0x60000132c650: 0x0000001300000012 0x000000010cdd2038
0x60000132c660: 0x000000010cdd2058 0x000000010cdd2078
0x60000132c670: 0x0000000000000064 0x0000000000000000

Continue by printing the individual values:

(lldb) po 0x000000010cdd49a8 & 0x00007ffffffffff8ull
ZYPerson -- proves that 0x000000010cdd49a8 is the isa of the person instance
(lldb) po 0x00000014
20 -- corresponds to age3
(lldb) po (char)0x63
'c' -- corresponds to c3
(lldb) po (char)0x62
'b' -- corresponds to c2
(lldb) po (char)0x61
'a' -- corresponds to c1
(lldb) po 0x00000013
19 -- corresponds to age2
(lldb) po 0x00000012
18 -- corresponds to age1
(lldb) po 0x000000010cdd2038
Cooci -- corresponds to name
(lldb) po 0x000000010cdd2058
KC -- corresponds to nickName
(lldb) po 0x000000010cdd2078
hobby -- corresponds to hobby
(lldb) po 0x0000000000000064
100 -- corresponds to height

You can see that the person instance does take up 56 bytes in memory and that the values match our property assignments. Note that although the instance is ultimately an objc_object structure, the order in which the properties are laid out in memory does not necessarily match the order in which we declared them. This is because Apple optimizes the layout by adjusting the position of each property in the structure.

struct PersonStruct {
    void *isa;
    char c1;
    char c2;
    char c3;
    short age3;     // declared as short; padded so that age1 starts on a 4-byte boundary
    int age1;
    int age2;
    NSString *name;
    NSString *nickName;
    NSString *hobby;
    long height;
} personStruct;

struct PersonStruct* personStructPtr = (__bridge struct PersonStruct*)person;

After Apple's optimization, the properties of the person instance are laid out in the order shown above.
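The reordered layout can be cross-checked against the LLDB dump in plain C. The sketch below copies the seven 8-byte words from the dump into a struct that mirrors the layout and reads the fields back. PersonLayout and decode_dump are names invented for this illustration, object pointers are stored as uint64_t so the struct does not depend on the host pointer size, and a little-endian host is assumed (as on the device or simulator that produced the dump).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the instance layout described in the text. */
struct PersonLayout {
    uint64_t isa;       /* offset 0  */
    char     c1;        /* offset 8  */
    char     c2;        /* offset 9  */
    char     c3;        /* offset 10 */
    int16_t  age3;      /* offset 12, after 1 byte of padding  */
    int32_t  age1;      /* offset 16, after 2 bytes of padding */
    int32_t  age2;      /* offset 20 */
    uint64_t name;      /* offset 24 */
    uint64_t nickName;  /* offset 32 */
    uint64_t hobby;     /* offset 40 */
    int64_t  height;    /* offset 48 */
};

/* The first seven 8-byte words from the x/8gx output. */
static const uint64_t dump[7] = {
    0x000000010cdd49a8, 0x0000001400636261,
    0x0000001300000012, 0x000000010cdd2038,
    0x000000010cdd2058, 0x000000010cdd2078,
    0x0000000000000064,
};

/* Reinterpret the dumped words as a PersonLayout (little-endian host assumed). */
struct PersonLayout decode_dump(void) {
    struct PersonLayout p;
    assert(sizeof p == sizeof dump);   /* 56 bytes on both sides */
    memcpy(&p, dump, sizeof p);
    return p;
}
```

The fields of the returned struct hold the values assigned in main: the three chars, the three ages, and the height, with the string properties appearing as the raw pointer values seen in the dump.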

  3. malloc_size: this function returns the amount of memory actually allocated for the instance.

In the analysis of the alloc flow, two places determine the size of the memory actually allocated.

  • cls->instanceSize(extraBytes)–> align16
static inline size_t align16(size_t x) {
    return (x + size_t(15)) & ~size_t(15);
}

The size computed here is passed as an argument to calloc.

  • segregated_size_to_fit
#define SHIFT_NANO_QUANTUM 4
#define NANO_REGIME_QUANTA_SIZE (1 << SHIFT_NANO_QUANTUM) // 16

static MALLOC_INLINE size_t
segregated_size_to_fit(nanozone_t *nanozone, size_t size, size_t *pKey)
{
    size_t k, slot_bytes;

    if (0 == size) {
        size = NANO_REGIME_QUANTA_SIZE; // Historical behavior
    }
    k = (size + NANO_REGIME_QUANTA_SIZE - 1) >> SHIFT_NANO_QUANTUM; // round up and shift for number of quanta
    slot_bytes = k << SHIFT_NANO_QUANTUM; // multiply by power of two quanta size
    *pKey = k - 1; // Zero-based!

    return slot_bytes;
}

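The first changes the size argument passed to calloc; the second takes effect inside calloc's internal call flow. Both round up to a multiple of 16, which is why the 56-byte instance in this example occupies 64 bytes.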
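The rounding arithmetic can be tried out in isolation. The sketch below reimplements the three bit tricks in plain C; word_align8, nano_slot_bytes, and allocated_size are names chosen for this illustration, and allocated_size is a simplified chain of the two steps rather than the runtime's actual call path.

```c
#include <assert.h>
#include <stddef.h>

/* Round up to a multiple of 8, as word_align does with WORD_MASK 7. */
size_t word_align8(size_t x) {
    return (x + 7) & ~(size_t)7;
}

/* Round up to a multiple of 16, as align16 does. */
size_t align16(size_t x) {
    return (x + 15) & ~(size_t)15;
}

/* Slot size for a request, following the arithmetic of
   segregated_size_to_fit with a 16-byte quantum. */
size_t nano_slot_bytes(size_t size) {
    const size_t shift = 4;
    const size_t quantum = (size_t)1 << shift;   /* 16 */
    if (size == 0) {
        size = quantum;
    }
    size_t k = (size + quantum - 1) >> shift;    /* number of quanta, rounded up */
    return k << shift;                           /* back to bytes */
}

/* Simplified chain: instance size -> size requested from calloc -> slot size. */
size_t allocated_size(size_t instance_size) {
    size_t requested = align16(instance_size);
    size_t slot = nano_slot_bytes(requested);
    assert(slot >= instance_size);
    return slot;
}
```

For the values in this article, word_align8(56) stays at 56 and allocated_size(56) gives 64, matching the 8-56-64 output.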

Reference: iOS memory byte alignment