Preface
This chapter covers Chunk and Subpage memory allocation:
- Memory specification standardization: How to standardize the irregular memory size requested by users into Netty’s own memory specifications (Tiny, Small, Page, Huge).
- How Chunk allocates page-level memory (8K-16MB).
- How Chunk and Subpage together allocate Tiny and Small memory (16B-4KB).
- Memory release.
1. Memory specification standardization
Memory standardization means converting the irregular memory size requested by the user into one of Netty's own memory block specifications, such as 16B, 32B, 1KB, 4KB, or 8KB.
The entry point is PoolArena#normalizeCapacity. Here we can ignore the directMemoryCacheAlignment alignment parameter and assume it is 0.
int normalizeCapacity(int reqCapacity) {
    // Greater than or equal to 16MB (chunkSize): returned as-is, or aligned if alignment is configured
    if (reqCapacity >= chunkSize) {
        return directMemoryCacheAlignment == 0 ? reqCapacity : alignCapacity(reqCapacity);
    }
    // Greater than or equal to 512B: round up to the next power of 2
    // (returned unchanged if it is already a power of 2)
    if (!isTiny(reqCapacity)) {
        int normalizedCapacity = reqCapacity;
        normalizedCapacity --;
        normalizedCapacity |= normalizedCapacity >>> 1;
        normalizedCapacity |= normalizedCapacity >>> 2;
        normalizedCapacity |= normalizedCapacity >>> 4;
        normalizedCapacity |= normalizedCapacity >>> 8;
        normalizedCapacity |= normalizedCapacity >>> 16;
        normalizedCapacity ++;
        if (normalizedCapacity < 0) {
            normalizedCapacity >>>= 1;
        }
        return normalizedCapacity;
    }
    // Smaller than 512 bytes: apply memory alignment if configured
    if (directMemoryCacheAlignment > 0) {
        return alignCapacity(reqCapacity);
    }
    // Already a multiple of 16: return directly
    // The Netty source comments this as "Quantum-spaced"
    if ((reqCapacity & 15) == 0) {
        return reqCapacity;
    }
    // Round up to a multiple of 16
    return (reqCapacity & ~15) + 16;
}
The logic differs depending on the size of the memory requested by the user:
- Greater than or equal to 16MB: returned directly (aligned only if directMemoryCacheAlignment is set). This corresponds to Huge memory blocks.
- Greater than or equal to 512 bytes: rounded up to a power of 2. This corresponds to Small and Page memory blocks.
- Smaller than 512 bytes: rounded up to a multiple of 16, corresponding to all Tiny size memory blocks and the 512-byte Small block. As for why a multiple of 16, the source comment only says "Quantum-spaced", which is not explained further.
Let's look at the bitwise operations involved.
isTiny: determines whether a number is less than 512 (a power of 2). The hexadecimal value 0xFFFFFE00 is -512: it has every bit from bit 9 upward set, so the AND is 0 exactly when the value fits in the low 9 bits, i.e. is less than 512.
static boolean isTiny(int normCapacity) {
    return (normCapacity & 0xFFFFFE00) == 0;
}
Rounding up to a power of 2 (a value that is already a power of 2 is returned unchanged):
int normalizedCapacity = reqCapacity;
// Decrementing first handles the case where reqCapacity is already a power of 2:
// it temporarily becomes 2^n - 1
normalizedCapacity --;
// Propagate the highest 1 bit into every lower bit (only the highest 1 matters)
// After these steps the value looks like 000...0111...1, where the leading 1 was the original highest bit
normalizedCapacity |= normalizedCapacity >>> 1;
normalizedCapacity |= normalizedCapacity >>> 2;
normalizedCapacity |= normalizedCapacity >>> 4;
normalizedCapacity |= normalizedCapacity >>> 8;
normalizedCapacity |= normalizedCapacity >>> 16;
// Add 1 to obtain a power of 2
normalizedCapacity ++;
// Handle overflow (reqCapacity near Integer.MAX_VALUE), where the increment yields 1 followed by 31 zeros
if (normalizedCapacity < 0) {
    // Shift down to 2^30, i.e. 01 followed by 30 zeros
    normalizedCapacity >>>= 1;
}
Checking for and rounding to a multiple of 16.
Whether a value is a multiple of 16: `(reqCapacity & 15) == 0`. Rounding up to a multiple of 16: `(reqCapacity & ~15) + 16` clears the low 4 bits (rounding down to a multiple of 16) and then adds 16. But you must first check whether the value is already a multiple of 16 and return it directly in that case, otherwise the result would be wrong.
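As a standalone illustration (my own helper class, not Netty code), the two rounding tricks can be extracted and tested in isolation:

```java
// Standalone sketch of the two rounding tricks used by normalizeCapacity:
// round up to the next power of two, and round up to a multiple of 16.
public class NormalizeSketch {
    // Round up to the next power of two; returns the value unchanged if it
    // already is one (the initial decrement handles that case)
    static int nextPowerOfTwo(int v) {
        v--;
        v |= v >>> 1;
        v |= v >>> 2;
        v |= v >>> 4;
        v |= v >>> 8;
        v |= v >>> 16;
        v++;
        if (v < 0) {        // overflow guard for inputs near Integer.MAX_VALUE
            v >>>= 1;
        }
        return v;
    }

    // Round up to the next multiple of 16; exact multiples are returned as-is
    static int roundToMultipleOf16(int v) {
        if ((v & 15) == 0) {   // already a multiple of 16
            return v;
        }
        return (v & ~15) + 16; // clear low 4 bits (round down), then add 16
    }
}
```

Note how 497 rounds up to 512, which is why values just below 512 end up in the 512-byte Small specification.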
2. Chunk allocates 8K-16MB memory
8K-16MB memory allocation does not require Subpage participation, only Chunk is required.
The whole process is divided into two steps:
- Allocation: assign a memoryMap node to the user, that is, determine the location (offset and length) within the 16MB Chunk occupied by this allocation.
- Initialization: initialize PooledByteBuf with that location information within the 16MB memory block.
The process entry is PoolChunk#allocate.
boolean allocate(PooledByteBuf<T> buf, int reqCapacity, int normCapacity, PoolThreadCache threadCache) {
    final long handle;
    // Greater than or equal to one page (8192)
    if ((normCapacity & subpageOverflowMask) != 0) { // >= pageSize
        handle = allocateRun(normCapacity);
    }
    // Less than 8192
    else {
        handle = allocateSubpage(normCapacity);
    }
    if (handle < 0) {
        return false;
    }
    // If there is a ByteBuffer in the cache, reuse it; only exists for direct memory.
    // The goal is simply to create fewer objects and reduce GC; earlier versions did not do this
    ByteBuffer nioBuffer = cachedNioBuffers != null ? cachedNioBuffers.pollLast() : null;
    // Initialize PooledByteBuf
    initBuf(buf, nioBuffer, handle, reqCapacity, threadCache);
    return true;
}
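The `(normCapacity & subpageOverflowMask) != 0` test deserves a note: subpageOverflowMask is ~(pageSize - 1), i.e. ~8191 for an 8K page, so for a normalized capacity the AND is non-zero exactly when the capacity is at least 8192. A tiny standalone demo (class and method names are mine):

```java
public class SubpageMaskDemo {
    // subpageOverflowMask as computed for an 8K page size: ~(8192 - 1)
    static final int SUBPAGE_OVERFLOW_MASK = ~(8192 - 1);

    // true when the normalized capacity needs page-level (>= 8K) allocation
    static boolean needsRun(int normCapacity) {
        return (normCapacity & SUBPAGE_OVERFLOW_MASK) != 0;
    }
}
```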
The source code below is fairly abstract; a worked example (with diagrams) follows.
1. Allocate memory addresses
private long allocateRun(int normCapacity) {
    // Calculate the node depth required for the requested capacity
    int d = maxOrder - (log2(normCapacity) - pageShifts);
    // Find a free node at that depth, mark it unusable, and update all its parent nodes
    // Returns the allocated node id (the memoryMap subscript)
    int id = allocateNode(d);
    if (id < 0) {
        return id;
    }
    // Remaining allocatable bytes -= space allocated this time
    // runLength(id) = 1 << (log2(16MB) - depthMap[id])
    freeBytes -= runLength(id);
    return id;
}
The key is the allocateNode method, which is also required by the logic that allocates the Tiny and Small specifications (by implication, allocating Tiny and Small also occupies a page, a leaf node of the Chunk tree).
private int allocateNode(int d) {
    // Start from the root
    int id = 1;
    // If (id & initial) == 0, the layer of id is above the target layer d
    int initial = - (1 << d);
    byte val = value(id);
    // The value of the root node (id=1) is greater than the requested depth:
    // the Chunk tree has insufficient memory for this allocation
    if (val > d) {
        return -1;
    }
    // val < d: the memoryMap value of the current node is not yet d, search the descendants until val == d
    // (id & initial) == 0: the id is not yet at layer d
    // In either case, continue the loop
    while (val < d || (id & initial) == 0) {
        // Go to the left child of the current node
        id <<= 1;
        // memoryMap value of the left child at the next layer
        val = value(id);
        // If that value is greater than d, the left child cannot satisfy the allocation:
        // switch to its sibling, the right child at the next layer
        if (val > d) {
            id ^= 1;
            val = value(id);
        }
    }
    // The id has been allocated; mark it unusable
    setValue(id, unusable);
    // Update the memoryMap values of all parent nodes of id
    updateParentsAlloc(id);
    return id;
}
The logic of this algorithm depends entirely on how the values of the whole tree are updated once a node is allocated, so it is hard to follow without first reading the updateParentsAlloc method. That method iterates bottom-up, updating all ancestor nodes of the allocated id with parent = min(left child, right child). Note that the descendants of the allocated node are not updated: the allocateNode search is top-down, so updating the children would serve no purpose.
private void updateParentsAlloc(int id) {
    while (id > 1) {
        // Parent node
        int parentId = id >>> 1;
        // Current node
        byte val1 = value(id);
        // Sibling node
        byte val2 = value(id ^ 1);
        // The smaller of the current and sibling values
        byte val = val1 < val2 ? val1 : val2;
        // Set the parent to the smaller child value
        setValue(parentId, val);
        // Continue iterating: move id up to the parent
        id = parentId;
    }
}
After looking at the updateParentsAlloc method, go back to the while loop in allocateNode.
The d passed in represents both the target depth and the memory size managed by nodes at that depth: d=11 means 8K, d=10 means 16K. Rewriting the bit operations in a more conventional style:
// val < d: the memoryMap value of the current node is not yet d, search descendants until val == d
// depth[id] < d: the id is not yet at layer d
// In either case, continue the loop
while (val < d || depth[id] < d) {
    // Go to the left child of the current node
    id = id * 2;
    // memoryMap value of the left child at the next layer
    val = value(id);
    // If that value is greater than d, the left child cannot satisfy the allocation:
    // switch to its sibling, the right child at the next layer
    if (val > d) {
        id = id + 1;
        val = value(id);
    }
}
Now let's walk through an example with diagrams.
Example: allocate 8KB, 8KB, then 4MB, in that order. We use already-normalized sizes directly to keep things simple.
@Test
public void test0() {
    PooledByteBufAllocator allocator = (PooledByteBufAllocator) ByteBufAllocator.DEFAULT;
    // 8KB
    ByteBuf byteBuf = allocator.newDirectBuffer(8192, Integer.MAX_VALUE);
    // 8KB
    byteBuf = allocator.newDirectBuffer(8192, Integer.MAX_VALUE);
    // 4MB
    byteBuf = allocator.newDirectBuffer(4 * 1024 * 1024, Integer.MAX_VALUE);
}
Step one applies for 8K and computes the node depth d = maxOrder - (log2(normCapacity) - pageShifts) = 11 - (log2(8192) - 13) = 11. Entering allocateNode, the while loop runs without ever taking the inner if branch and finally allocates node id=2048. Node 2048 is marked unusable (12) and all its ancestors are updated.
Step two applies for 8K again; the node depth is again d=11. In allocateNode, focus on the last iteration of the while loop, where the current node is id=1024. Its value val=11 already equals the requested depth, but (id & initial) == 0 still holds, so the loop continues. The left child 2048 has value 12, greater than the requested depth, so the if branch switches to the right child and the final id=2049. Node 2049 is marked unusable (12) and all its ancestors are updated; note that 1024 is also updated to 12, i.e. unavailable.
Step three applies for 4MB and computes the node depth d = 11 - (log2(4194304) - 13) = 2. Focus on the last iteration of the while loop: the current node id=2 has memoryMap value 2, but (id & initial) == 0, so the loop continues. Inside the loop, the left child id=4 has memoryMap value 3, greater than the requested depth 2, so the if branch jumps to the right child and the final id=5. Node id=5 is marked unusable and its ancestors are updated iteratively. Note that the descendants of node id=5 are not updated.
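The three steps above can be reproduced with a self-contained simulation of the memoryMap buddy tree (16MB chunk, 8K pages, maxOrder=11). The class below is illustrative only; the method names mirror the Netty source but the class itself is mine:

```java
// Simulation of the memoryMap buddy tree for a 16MB chunk with 8K pages
public class ChunkTreeSim {
    static final int MAX_ORDER = 11, PAGE_SHIFTS = 13, UNUSABLE = 12;
    final byte[] memoryMap = new byte[4096]; // ids 1..4095, index 0 unused

    ChunkTreeSim() {
        // each node starts with value == its depth
        for (int d = 0; d <= MAX_ORDER; d++) {
            for (int id = 1 << d; id < 1 << (d + 1); id++) {
                memoryMap[id] = (byte) d;
            }
        }
    }

    int allocateRun(int normCapacity) {
        int d = MAX_ORDER - (log2(normCapacity) - PAGE_SHIFTS);
        return allocateNode(d);
    }

    int allocateNode(int d) {
        int id = 1;
        int initial = -(1 << d);
        if (memoryMap[id] > d) return -1;        // tree cannot satisfy depth d
        while (memoryMap[id] < d || (id & initial) == 0) {
            id <<= 1;                            // descend to the left child
            if (memoryMap[id] > d) id ^= 1;      // switch to the right sibling
        }
        memoryMap[id] = UNUSABLE;                // mark the node unusable
        updateParentsAlloc(id);
        return id;
    }

    void updateParentsAlloc(int id) {
        while (id > 1) {
            int parent = id >>> 1;
            byte v1 = memoryMap[id], v2 = memoryMap[id ^ 1];
            memoryMap[parent] = v1 < v2 ? v1 : v2; // parent = min(children)
            id = parent;
        }
    }

    static int log2(int v) { return 31 - Integer.numberOfLeadingZeros(v); }
}
```

Running the 8KB, 8KB, 4MB sequence against this simulation yields node ids 2048, 2049, and 5, matching the walkthrough.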
2. Initialize Buffer
Given the memoryMap id allocated from the Chunk in the previous step, we can compute the interval of the 16MB block that the Buffer actually uses. For example: applying for 8K and being allocated id=2048 gives the interval [0, 8192); applying for 8K again with id=2049 gives [8192, 16384); applying for 4MB with id=5 gives [4194304, 8388608).
The next step injects this key information into the PooledByteBuf (the "initialize Buffer" step) via the PoolChunk#initBuf method.
void initBuf(PooledByteBuf<T> buf, ByteBuffer nioBuffer, long handle, int reqCapacity,
             PoolThreadCache threadCache) {
    // The lower 32 bits of handle are the memoryMap subscript
    int memoryMapIdx = memoryMapIdx(handle);
    // The upper 32 bits of handle are the bitmap subscript for a subpage; 0 for a page
    int bitmapIdx = bitmapIdx(handle);
    if (bitmapIdx == 0) {
        // Page allocations of 8K or more go here
        buf.init(this, nioBuffer, handle, runOffset(memoryMapIdx) + offset,
                 reqCapacity, runLength(memoryMapIdx), threadCache);
    } else {
        // Subpage allocations go here
        initBufWithSubpage(buf, nioBuffer, handle, bitmapIdx, reqCapacity, threadCache);
    }
}
Let's first explain the method parameters.
- buf: the PooledByteBuf instance, still an empty shell with no memory offset information, unusable for now.
- nioBuffer: for direct memory this may be a non-null ByteBuffer instance. As in many earlier scenarios, it is only a cached instance that exists to reduce object creation and GC. It backs the ByteBuffer API operations; by resetting a few of its indexes the same instance can be reused.
- handle: this long packs two pieces of information. The lower 32 bits are the memoryMap subscript, i.e. the node id above. The upper 32 bits are 0 for allocations of 8K or more, and the subpage's bitmap subscript for allocations below 8K. For now, focus on the lower 32 bits, the memoryMap subscript.
- reqCapacity: the original (non-normalized) memory size requested by the user.
- threadCache: ignore it for now.
Focus on the case where bitmapIdx=0, that is, the 8K-16MB memory size is allocated.
runOffset(memoryMapIdx) + offset: this formula computes the start of the interval within the 16MB block that the PooledByteBuf actually gets, i.e. its offset. The bit manipulation details are not important here.
private int runOffset(int id) {
    int shift = id ^ 1 << depth(id);
    return shift * runLength(id);
}

private int runLength(int id) {
    return 1 << log2ChunkSize - depth(id);
}
runLength(memoryMapIdx) then computes the length of the interval, as shown in the code above.
Once the above key information is injected, the PooledByteBuf is ready for use. Look at PooledByteBuf's init method.
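Under stated assumptions (log2ChunkSize = 24 for a 16MB chunk), the intervals quoted above can be checked with a small standalone sketch of runOffset/runLength (class name mine):

```java
public class RunOffsetDemo {
    static final int LOG2_CHUNK_SIZE = 24; // log2(16MB)

    // depth of a node id in a complete binary tree rooted at id=1
    static int depth(int id) { return 31 - Integer.numberOfLeadingZeros(id); }

    // length in bytes of the memory run a node manages
    static int runLength(int id) { return 1 << (LOG2_CHUNK_SIZE - depth(id)); }

    // byte offset of that run inside the 16MB chunk:
    // id ^ (1 << depth) is the node's 0-based index within its layer
    static int runOffset(int id) {
        int shift = id ^ (1 << depth(id));
        return shift * runLength(id);
    }
}
```

For ids 2048, 2049, and 5 this reproduces the intervals [0, 8192), [8192, 16384), and [4194304, 8388608) quoted above.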
void init(PoolChunk<T> chunk, ByteBuffer nioBuffer,
          long handle, int offset, int length, int maxLength, PoolThreadCache cache) {
    init0(chunk, nioBuffer, handle, offset, length, maxLength, cache);
}

private void init0(PoolChunk<T> chunk, ByteBuffer nioBuffer,
                   long handle, int offset, int length, int maxLength, PoolThreadCache cache) {
    this.chunk = chunk;
    // ByteBuffer for direct memory, byte array for heap memory
    memory = chunk.memory;
    tmpNioBuf = nioBuffer;
    allocator = chunk.arena.parent;
    this.cache = cache;
    // For 8K+ specifications only the lower 32 bits (memoryMapIdx) are meaningful
    this.handle = handle;
    // runOffset(memoryMapIdx) + offset: the offset within the 16MB memory block
    this.offset = offset;
    // The capacity the user applied for
    this.length = length;
    // runLength(memoryMapIdx): the normalized capacity the Chunk allocated to this PooledByteBuf
    this.maxLength = maxLength;
}
As a simple example, suppose the client calls byteBuf.getByte(1) to read the byte at index 1. PooledDirectByteBuf converts index 1 to offset+1 and returns the byte at position offset+1 of the 16MB block (the memory variable, a ByteBuffer in the direct-memory case).
@Override
protected byte _getByte(int index) {
    // Read the byte at the adjusted offset from the ByteBuffer
    return memory.get(idx(index));
}

// Actual index relative to the 16MB memory block = allocation offset + caller's index
protected final int idx(int index) {
    return offset + index;
}
3. Subpage allocates 16B-4KB memory
To allocate 16B to 4KB, an 8KB memory block is first allocated from the Chunk (a memoryMap leaf node with id 2048-4095), and that block is then managed by a Subpage. The leaf allocation itself is the same allocateNode path used for allocating 8KB.
The first step of Subpage memory allocation is to obtain the memoryMap leaf subscript and the Subpage bitmap index; the two are returned together in a single long called handle.
1. Allocate memory addresses
The entry point is again PoolChunk's allocate method, shown in section 2 and not repeated here. Go straight to PoolChunk#allocateSubpage.
private long allocateSubpage(int normCapacity) {
    // Find the PoolSubpage head node in the arena
    PoolSubpage<T> head = arena.findSubpagePoolHead(normCapacity);
    // For memory smaller than 8K, apply for 8K from the Chunk, then let the Subpage divide it
    // The requested depth must therefore be the leaf depth (d = 11)
    int d = maxOrder;
    synchronized (head) {
        // Allocate a leaf: memoryMap[id] = 12 and all its parents are updated
        int id = allocateNode(d);
        if (id < 0) {
            return id;
        }
        final PoolSubpage<T>[] subpages = this.subpages;
        final int pageSize = this.pageSize;
        // Update the amount of memory available for allocation
        freeBytes -= pageSize;
        // Map memoryMap ids 2048-4095 to subscripts 0-2047 of the PoolChunk subpages array
        int subpageIdx = subpageIdx(id);
        PoolSubpage<T> subpage = subpages[subpageIdx];
        // Create and initialize the Subpage
        if (subpage == null) {
            subpage = new PoolSubpage<T>(head, this, id, runOffset(id), pageSize, normCapacity);
            subpages[subpageIdx] = subpage;
        } else {
            subpage.init(head, normCapacity);
        }
        // Assign a Subpage bitmap index and return the handle
        return subpage.allocate();
    }
}
First look at the PoolSubpage constructor, which also calls the Subpage init method. runOffset, as described in section 2, is the node's actual offset within the 16MB block, and pageSize is the length of the interval assigned to the Subpage, so the Subpage manages [runOffset(id), runOffset(id) + 8192).
PoolSubpage(PoolSubpage<T> head, PoolChunk<T> chunk, int memoryMapIdx, int runOffset, int pageSize, int elemSize) {
    // The Chunk this Subpage belongs to
    this.chunk = chunk;
    // The node id assigned to this Subpage in the Chunk tree, i.e. the memoryMap subscript
    this.memoryMapIdx = memoryMapIdx;
    // The Subpage's actual offset within the 16MB block
    this.runOffset = runOffset;
    // The page size, 8KB
    this.pageSize = pageSize;
    // The bitmap array has only 8 elements, computed as 8192 >>> 10
    bitmap = new long[pageSize >>> 10];
    // Initialization
    init(head, elemSize);
}
The code above is otherwise easy to understand, but why does the bitmap get only 8 elements? The minimum block size is 16B, and 8K = 512 × 16B, so at most 512 bits are ever needed, which is 8 longs (512 / 64 = 8).
The init method sets the memory block size managed by the Subpage (elemSize), the maximum number of blocks (maxNumElems), the next available bitmap index (nextAvail), and the actual bitmap length (bitmapLength). Finally, it attaches the current PoolSubpage to the head node of the Arena's pool for this specification (not detailed here; it is a simple linked-list insertion).
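The bitmapLength arithmetic can be checked in isolation. The sketch below (my own helper, mirroring the arithmetic in PoolSubpage#init shown next) computes how many 64-bit words are actually used for a given element size:

```java
public class BitmapSizing {
    static final int PAGE_SIZE = 8192;

    // number of 64-bit words actually used for a given element size
    static int bitmapLength(int elemSize) {
        int maxNumElems = PAGE_SIZE / elemSize;
        int len = maxNumElems >>> 6;        // one long covers 64 blocks
        if ((maxNumElems & 63) != 0) {
            len++;                          // partial word for the remainder
        }
        return len;
    }
}
```

The 16B worst case uses all 8 allocated longs; larger element sizes such as 496B need only one.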
void init(PoolSubpage<T> head, int elemSize) {
    doNotDestroy = true;
    // The normalized memory block size
    this.elemSize = elemSize;
    if (elemSize != 0) {
        // Maximum number of blocks = number of remaining blocks = 8K / block size
        maxNumElems = numAvail = pageSize / elemSize;
        // Initialize the next available bitmap index
        nextAvail = 0;
        // Actual bitmap array length = maximum number of blocks / 2^6
        bitmapLength = maxNumElems >>> 6;
        // If the maximum number of blocks is not a multiple of 64, one extra long is needed
        if ((maxNumElems & 63) != 0) {
            bitmapLength ++;
        }
        // Initialize the bitmap elements to 0, i.e. all maxNumElems bits start at 0
        for (int i = 0; i < bitmapLength; i ++) {
            bitmap[i] = 0;
        }
    }
    // Attach the current Subpage to the head node of the Arena's pool for this specification
    addToPool(head);
}

// Insert the current Subpage into the list right after the head node (the head node is an empty sentinel)
private void addToPool(PoolSubpage<T> head) {
    assert prev == null && next == null;
    prev = head;
    next = head.next;
    next.prev = this;
    head.next = this;
}
The last step of address allocation assigns a Subpage bitmap index, updates the bitmap, and constructs the handle to return. See the allocate method.
long allocate() {
    // Get the next available bitmap index; 0 when the Subpage has just been created
    final int bitmapIdx = getNextAvail();
    // The subscript within the bitmap array
    int q = bitmapIdx >>> 6;
    // The bit position within the bitmap element at subscript q
    int r = bitmapIdx & 63;
    assert (bitmap[q] >>> r & 1) == 0;
    // Flip the bitmap bit from 0 to 1
    bitmap[q] |= 1L << r; // mark bit r as 1
    // One fewer block available for allocation
    if (-- numAvail == 0) {
        // If every block has been allocated, remove the Subpage from the Arena's pool for this size
        removeFromPool();
    }
    // Construct the handle
    return toHandle(bitmapIdx);
}
First, focus on the getNextAvail method, which finds the bitmap index for this memory allocation request. There are two kinds of logic depending on whether the nextAvail value is greater than or equal to zero.
- Greater than or equal to 0: return nextAvail directly and set it to -1. This happens in two cases. First, when the Subpage has just been created, nextAvail is initialized to 0, so the first block is handed out directly. Second, when a memory block is released back to the Subpage, nextAvail is set to that block's bitmap index, so the freed index can be handed out directly on the next allocation.
- Less than 0 (i.e. -1): the Subpage has no quickly assignable index, so it must traverse the bitmap array, and then the bits of each element, to find a 0 bit to hand out. That code is not posted here; think of it as a double loop searching for a 0 bit.
private int getNextAvail() {
    int nextAvail = this.nextAvail;
    // If the next available index is >= 0, take it and reset nextAvail to -1
    if (nextAvail >= 0) {
        this.nextAvail = -1;
        return nextAvail;
    }
    // Otherwise traverse the bitmap to find a free bit to allocate
    return findNextAvail();
}
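For reference, the double loop might look like the following standalone sketch. It is simplified from PoolSubpage#findNextAvail (no maxNumElems bound check, class name mine), so it is illustrative only:

```java
public class FindNextAvailSketch {
    // scan the bitmap words; within the first word that still has a 0 bit,
    // scan its 64 bits; returns the index of the first 0 bit, or -1 if all are 1
    static int findNextAvail(long[] bitmap, int bitmapLength) {
        for (int i = 0; i < bitmapLength; i++) {
            long bits = bitmap[i];
            if (~bits != 0) {                      // this word has a free bit
                for (int j = 0; j < 64; j++) {
                    if ((bits >>> j & 1) == 0) {
                        return (i << 6) + j;       // word index * 64 + bit index
                    }
                }
            }
        }
        return -1;
    }
}
```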
After the bitmapIdx is obtained, the bitmap bit is set to 1 and the number of available blocks is decreased by one. If it reaches 0, the Subpage is removed from the Arena's pool for this specification. Finally, the handle value is computed.
private long toHandle(int bitmapIdx) {
    // 0x4000000000000000L    : bit 62 set, marking this as a Subpage handle
    // (long) bitmapIdx << 32 : bitmap index in the upper 32 bits (0 for the first block)
    // memoryMapIdx           : node id in the lower 32 bits (2048 in the example)
    return 0x4000000000000000L | (long) bitmapIdx << 32 | memoryMapIdx;
}
Since a Java method cannot return multiple values, and the author did not want to allocate a wrapper object (fewer new objects, even though an object pool or cached instance could have been used), the alternative is to pack the offset information allocated by the Chunk and the Subpage into one long: the upper 32 bits carry the bitmap index allocated by the Subpage, and the lower 32 bits carry the node id (2048-4095) allocated by the Chunk.
Why set the high marker bit (0x4000000000000000L, bit 62) of the 64-bit value to 1?
First, bitmapIdx and memoryMapIdx each fit in 31 bits, so setting a high marker bit does not affect the accuracy of the data.
Second, both allocateRun (8K or more) and allocateSubpage (less than 8K) can only return a single handle value to the caller, and there is a case where the caller could not tell whether the handle refers to small memory allocated by a Subpage or page-level memory allocated directly by the Chunk:
A) 8K is allocated and memoryMap node 2048 is assigned, so memoryMapIdx=2048. 8K memory needs no Subpage, so bitmapIdx=0 and, without the marker, the handle would simply be 2048.
B) 32B is allocated and memoryMap node 2048 is assigned, so memoryMapIdx=2048. Since this 32B block is the first one the Subpage allocates, bitmapIdx=0 and, without the marker, the handle would also be 2048.
In scenarios A and B the returned handle would be identical (2048), making it impossible to distinguish a Page-level allocation from a Subpage-level one; the marker bit resolves the ambiguity.
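To make the ambiguity concrete, here is a small standalone sketch (class and helper names are mine) that encodes and decodes handles the same way toHandle does:

```java
public class HandleDemo {
    // page-level allocation: the handle is just the node id
    static long runHandle(int memoryMapIdx) { return memoryMapIdx; }

    // subpage allocation: marker bit + bitmap index (upper 32) + node id (lower 32)
    static long subpageHandle(int bitmapIdx, int memoryMapIdx) {
        return 0x4000000000000000L | (long) bitmapIdx << 32 | memoryMapIdx;
    }

    // lower 32 bits: the memoryMap subscript
    static int memoryMapIdx(long handle) { return (int) handle; }

    // upper 32 bits: note this still carries the marker bit, which is why
    // the code later masks with `bitmapIdx & 0x3FFFFFFF`
    static int bitmapIdx(long handle) { return (int) (handle >>> 32); }
}
```

With the marker, the Subpage handle for (bitmapIdx=0, id=2048) is no longer equal to the plain run handle 2048, and decoding recovers both fields.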
Example: Allocate 32B, 32B, 496B memory.
@Test
public void test00() {
    PooledByteBufAllocator allocator = (PooledByteBufAllocator) ByteBufAllocator.DEFAULT;
    ByteBuf byteBuf = allocator.newDirectBuffer(32, Integer.MAX_VALUE);
    ByteBuf byteBuf2 = allocator.newDirectBuffer(32, Integer.MAX_VALUE);
    ByteBuf byteBufx = allocator.newDirectBuffer(496, Integer.MAX_VALUE);
}
Step one: allocate 32B.
The Chunk part allocates the node with id=2048, then creates a PoolSubpage that is added to the Arena's pool for this specification; nodes in the pool form a doubly linked list, each new node inserted right after the head. The Subpage part assigns the bitmap index bitmapIdx=0 to the PooledByteBuf and sets the corresponding bitmap bit to 1.
Step two: allocate another 32B.
This path is outside the source code shown in this article: the user finds the 32B Subpage head node in the Arena's Tiny pool and goes directly to the Subpage that was just added to the list above. (Thread caching is not considered here; it will be discussed later. The point is to show the role of adding Subpages to the Arena's size pools.)
After quickly finding the Subpage, a bitmap index is assigned again, this time bitmapIdx=1 (the first bitmap element, second bit), and the corresponding bitmap bit is set to 1.
Step three: allocate 496B.
Since 496B has never been allocated before, 8KB must first be allocated from the Chunk to create a new Subpage; the node id is 2049. The new Subpage then joins the doubly linked list of the 496B specification pool.
This time the Subpage needs only one element of the bitmap array: 8192 / 496 = 16 blocks (maxNumElems=16), and 16 >>> 6 = 0 plus one partial word gives bitmapLength=1. The allocated bitmap index is bitmapIdx=0 (the first bitmap element, first bit), and the corresponding bit is set to 1. You can also see that Netty is not free of memory fragmentation: 496 does not divide 8192 evenly (16 × 496 = 7936), so 256 bytes of the page go unused.
2. Initialize Buffer
Unlike the Page-level case, the Subpage branch of initBuf computes the offset and length differently.
void initBuf(PooledByteBuf<T> buf, ByteBuffer nioBuffer, long handle, int reqCapacity,
             PoolThreadCache threadCache) {
    // The lower 32 bits of handle are the memoryMap subscript
    int memoryMapIdx = memoryMapIdx(handle);
    // The upper 32 bits of handle are the bitmap subscript for a subpage; 0 for a page
    int bitmapIdx = bitmapIdx(handle);
    if (bitmapIdx == 0) {
        // ... omitted
    } else {
        // Subpage allocations go here
        initBufWithSubpage(buf, nioBuffer, handle, bitmapIdx, reqCapacity, threadCache);
    }
}

private void initBufWithSubpage(PooledByteBuf<T> buf, ByteBuffer nioBuffer,
        long handle, int bitmapIdx, int reqCapacity, PoolThreadCache threadCache) {
    // Compute the Chunk tree subscript from the handle
    int memoryMapIdx = memoryMapIdx(handle);
    // Map the Chunk tree subscript to the Subpage element of the subpages array
    PoolSubpage<T> subpage = subpages[subpageIdx(memoryMapIdx)];
    // Initialize PooledByteBuf
    buf.init(
        this, nioBuffer, handle,
        runOffset(memoryMapIdx) + (bitmapIdx & 0x3FFFFFFF) * subpage.elemSize + offset,
        reqCapacity, subpage.elemSize, threadCache);
}
runOffset(memoryMapIdx) + (bitmapIdx & 0x3FFFFFFF) * subpage.elemSize + offset: this adds the offset represented by the bitmap index on top of the Page-level offset. In addition, the length of the allocated memory is the Subpage's element size (elemSize).
4. Returning memory
PoolChunk's free method is the entry point for returning memory to the Chunk or Subpage; it is reached via Arena -> PoolChunkList -> PoolChunk.
void free(long handle, ByteBuffer nioBuffer) {
    // The lower 32 bits of handle are the memoryMap subscript
    int memoryMapIdx = memoryMapIdx(handle);
    // The upper 32 bits of handle are the subpage's bitmap index
    int bitmapIdx = bitmapIdx(handle);
    // A non-zero bitmap index means this memory was allocated by a Subpage and is returned to it
    if (bitmapIdx != 0) {
        PoolSubpage<T> subpage = subpages[subpageIdx(memoryMapIdx)];
        PoolSubpage<T> head = arena.findSubpagePoolHead(subpage.elemSize);
        synchronized (head) {
            // Returns true: the subpage is still in use
            // Returns false: the subpage has been removed from the Arena's subpage pool and will
            // no longer be used, so the current memoryMapIdx must be returned to the Chunk
            if (subpage.free(head, bitmapIdx & 0x3FFFFFFF)) {
                return;
            }
        }
    }
    // Reached when the bitmap index is 0, or when the return to the subpage "failed"
    // (the subpage is no longer in use and is being reclaimed)
    // Increase the available allocatable bytes
    freeBytes += runLength(memoryMapIdx);
    // Restore memoryMap[memoryMapIdx] to its original value, depth[memoryMapIdx]
    setValue(memoryMapIdx, depth(memoryMapIdx));
    // Update the ancestors of memoryMap[memoryMapIdx] bottom-up
    updateParentsFree(memoryMapIdx);
    // Cache the ByteBuffer for future reuse (by resetting its indexes),
    // reducing the GC caused by frequently creating new objects
    if (nioBuffer != null && cachedNioBuffers != null
            && cachedNioBuffers.size() < PooledByteBufAllocator.DEFAULT_MAX_CACHED_BYTEBUFFERS_PER_CHUNK) {
        cachedNioBuffers.offer(nioBuffer);
    }
}
The free method of Subpage returns a memory block to the Subpage. Note the boolean return value: true means the block was successfully returned to the Subpage, false means the Subpage is no longer in use and has been reclaimed, so the caller must return the 8K memory block to the Chunk.
The Subpage is considered no longer in use, and reclaimed, only when both of the following hold:
- numAvail == maxNumElems: every block has been returned, i.e. no block is in external use. If they are unequal, some blocks are still in use and the Subpage must stay.
- prev != next: PoolSubpages of each specification form a linked list in the Arena's Subpage pool. If the current Subpage's prev equals next, the list contains only the empty head node and the current Subpage, and the Subpage is not removed. Conversely, if prev is not equal to next, the current Subpage is not the only node in the list and can be removed once numAvail == maxNumElems.
boolean free(PoolSubpage<T> head, int bitmapIdx) {
    if (elemSize == 0) {
        return true;
    }
    // A bitmap bit of 0 means the memory block can be allocated again
    int q = bitmapIdx >>> 6;
    int r = bitmapIdx & 63;
    bitmap[q] ^= 1L << r;
    // Store this bitmapIdx as the next available index; the next getNextAvail returns it directly
    setNextAvail(bitmapIdx);
    // If numAvail was 0 before this return, the Subpage had been removed from the pool:
    // add it back and return true
    if (numAvail ++ == 0) {
        addToPool(head);
        return true;
    }
    // If the number of available blocks is still below the total, some blocks are in use:
    // return true to indicate the subpage is still needed
    if (numAvail != maxNumElems) {
        return true;
    } else {
        // All blocks are available again, but if the current subpage is the only node
        // in the subpage pool, it is not removed from the pool
        if (prev == next) {
            return true;
        }
        // The current subpage is not the only node and no block is in use:
        // remove it from the pool and return false to indicate it is no longer used
        doNotDestroy = false;
        removeFromPool();
        return false;
    }
}
Summary
- Arena standardizes the size of memory requested by the user before entering the Chunk allocation. PoolArena#normalizeCapacity is responsible for memory size standardization.
- Greater than or equal to 16MB: returned directly (Huge memory blocks).
- Greater than or equal to 512 bytes: rounded up to a power of 2, corresponding to Small and Page memory blocks.
- Smaller than 512 bytes: rounded up to a multiple of 16, corresponding to all Tiny size memory blocks and the 512-byte Small block. (Why a multiple of 16? The source only says "Quantum-spaced".)
- Based on the normalized size, Chunk memory allocation follows two strategies: Page level for 8K and above, Subpage level for below 8K. Both share the same two phases: allocate the memory offset and length within the 16MB block, then initialize the PooledByteBuf.
- Page level: The Chunk tree finds the required node ID from top to bottom and initializes PooledByteBuf.
- Subpage level: a free leaf node id is found in the Chunk tree, yielding an 8K range, and a Subpage is created. The Subpage divides the 8KB range into equal segments whose length is the memory specification the Subpage manages; which segments are in use is tracked in a bitmap. Finally, the PooledByteBuf is initialized.