This series is mainly my notes from studying WASM, so it may be brief. It consists of:

  1. WebAssembly(1) Compilation
  2. WebAssembly(2) Basic API
  3. WebAssembly(3) Instructions
  4. WebAssembly(4) Validation
  5. WebAssembly(5) Memory
  6. WebAssembly(6) Binary Format
  7. WebAssembly(7) Future
  8. WebAssembly(8) Wasm in Rust (TODO)

Memory addressing

Conversions

  1. Real mode: physical address = segment × 16 + offset, with no further translation
  2. Protected mode: segmentation + paging (optional)
  3. Logical address: in C, the value you read with the address-of operator (&), or the address returned by malloc or new, is a logical address. It is relative to the base of the current process's data segment, not an absolute physical address. Only in Intel real mode is no translation table involved: there is no descriptor-based segmentation or paging, so the CPU forms the physical address directly as segment × 16 + offset. Application programmers only deal with logical addresses; segmentation and paging are transparent to them and concern only system programmers. Applications can manipulate memory directly, but only within the segments the operating system has assigned to them. A logical address consists of a segment identifier plus an offset within that segment, written [segment identifier:offset].
  4. Linear address: the intermediate layer in the translation from logical to physical address. Program code produces a logical address (an offset within a segment); adding the base address of the corresponding segment yields the linear address. If paging is enabled, the linear address is translated again to produce a physical address; if not, the linear address is the physical address. The Intel 80386 has a 4 GiB linear address space (2^32 bytes, matching its 32 address lines).
  5. Physical address: the address placed on the CPU's external address bus to select a location in physical memory; it is the final result of address translation. If paging is enabled, linear addresses are converted to physical addresses using entries in the page directory and page tables. If paging is not enabled, the linear address is used directly as the physical address, as in real mode.

Memory segmentation

Modern memory addressing mechanisms introduced the concept of "segmentation": code at different privilege levels and data of different kinds are stored in different segments, and addresses are expressed as offsets within a segment. In other words, programs do not see linear addresses but segmented addresses of the form segmentSelector:offset. The space formed by these segments and offsets is the logical memory space, and each selector:offset pair is a logical address.

A segment identifier is a 16-bit field, also known as a segment selector. The processor provides segment registers to hold selectors. There are six segment registers:

  1. cs: the code segment register, pointing to the segment containing program instructions;
  2. ss: the stack segment register, pointing to the segment containing the current program's stack;
  3. ds: the data segment register, pointing to a segment containing static or global data;
  4. es, fs, gs: the other three registers, called extra segment registers, are general purpose and can point to any data segment.

Segment registers hold selectors, not segment base addresses or segment descriptors. The segment details have to be fetched from the descriptor tables via the selector. This indirection keeps the lookup fast and also enforces permissions: only a program with a sufficient privilege level can obtain the segment's base address and length.

The 16-bit selector consists of:

  1. Bits 0-1: requested privilege level; 0 is the highest (kernel mode) and 3 the lowest (accessible to everyone). A low-privilege request cannot access higher-privilege memory
  2. Bit 2: descriptor table type; 0 selects the Global Descriptor Table (GDT), located through the gdtr register; 1 selects the Local Descriptor Table (LDT), located through the ldtr register. A descriptor table is similar to an array, with each entry taking up 8 bytes
  3. Bits 3-15: index of the segment descriptor within the selected descriptor table
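
To make the bit layout above concrete, here is a minimal Python sketch that splits a selector into its three fields. The names `Selector`, `decode_selector` and `descriptor_offset` are my own, and 0x73 is just an example value, not something taken from a real system.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int   # bits 3-15: entry number in the descriptor table
    table: str   # bit 2: "GDT" (0) or "LDT" (1)
    rpl: int     # bits 0-1: requested privilege level, 0 (highest) to 3


def decode_selector(value: int) -> Selector:
    """Split a 16-bit segment selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return Selector(
        index=value >> 3,
        table="LDT" if value & 0b100 else "GDT",
        rpl=value & 0b11,
    )


def descriptor_offset(sel: Selector) -> int:
    """Byte offset of the descriptor inside its table (8 bytes per entry)."""
    return sel.index * 8


sel = decode_selector(0x73)        # example value only
print(sel)                         # Selector(index=14, table='GDT', rpl=3)
print(descriptor_offset(sel))      # 112
```

Because the low three bits are flags, masking them off (`value & ~0b111`) gives the same byte offset as `index * 8`, which is presumably why the entries are 8 bytes wide.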

The structure of a segment descriptor is complex, but its most important fields are the segment base (32 bits) and the segment limit (20 bits).

After obtaining the base from the segment descriptor and adding the logical address's offset to it, we get the linear address: linear address = base + offset.
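
As a rough illustration of this step, the sketch below adds a segment base to an offset after checking the limit. `SegmentDescriptor` and `to_linear` are hypothetical names, and I assume the limit is already expressed in bytes; a real descriptor has many more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentDescriptor:
    base: int    # 32-bit segment base address
    limit: int   # highest valid offset, in bytes (byte granularity assumed)


def to_linear(desc: SegmentDescriptor, offset: int) -> int:
    """Translate an offset within a segment into a 32-bit linear address."""
    if offset < 0 or offset > desc.limit:
        raise ValueError("offset exceeds the segment limit")
    return (desc.base + offset) & 0xFFFFFFFF


data_segment = SegmentDescriptor(base=0x0010_0000, limit=0xFFFF)
print(hex(to_linear(data_segment, 0x1234)))   # 0x101234
```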

Paging

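Paging is enabled by setting the PG bit (bit 31) of control register CR0. The G bit in the segment descriptor is unrelated to paging: it selects whether the segment limit is counted in bytes (G = 0) or in 4 KiB units (G = 1).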

Paging is the logical partitioning of a contiguous memory space into fixed-size chunks. In linear memory such a chunk is called a page; in physical memory it is called a page frame. The paging mechanism divides linear memory into pages and physical memory into frames, and establishes a mapping from pages to frames. This mapping is many-to-one. A linear address is translated in two steps, each based on a translation table: the first table is the page directory and the second is the page table. The purpose of this two-level scheme is to reduce the amount of RAM needed for each process's page tables, much like the table of contents of a book, which makes lookup convenient and quick. The specific conversion is shown in the figure below:

Addressing process (386):

  1. Read the selector from a segment register;
  2. Verify the access level (protected mode): pass;
  3. Using the table type and index in the selector, read the descriptor from the descriptor table;
  4. Check the descriptor's access level bits (protected mode): pass;
  5. Check that the offset does not exceed the segment limit;
  6. Check the P bit to make sure the target is present in physical memory;
  7. Add the segment base address and the offset to form the linear address;
  8. Check for a cache hit: miss;
  9. Using the high 10 bits of the linear address, read the entry in the first-level page table;
  10. Check the access level in the first-level entry: pass;
  11. Obtain the second-level page table location: 20 bits from the entry + 12 zero bits;
  12. Access the second-level page table at that location;
  13. Using the middle 10 bits of the linear address, read the entry in the second-level page table;
  14. Check the access level in the second-level entry: pass;
  15. Obtain the 20-bit frame base address;
  16. Concatenate it with the low 12 bits of the linear address to form the 32-bit physical address;
  17. Fetch the data.
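
The paging half of the process above (steps 9 to 16) can be sketched in Python as follows. This is a toy model under my own assumptions: the tables are plain dictionaries, only the present bit is checked, and the privilege checks and cache lookup are left out. The names `split_linear` and `translate` are hypothetical.

```python
from __future__ import annotations

PRESENT = 0x1  # bit 0 of an entry: the table or page is in physical memory


def split_linear(addr: int) -> tuple[int, int, int]:
    """Return (directory index, table index, page offset) of a 32-bit address."""
    return (addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF


def translate(page_directory: dict[int, int],
              page_tables: dict[int, dict[int, int]],
              linear: int) -> int:
    """Walk the two-level tables and return the physical address.

    page_directory maps a directory index to an entry; page_tables maps a
    page-table base address to that table's {index: entry} contents.
    """
    dir_index, table_index, offset = split_linear(linear)

    pde = page_directory.get(dir_index, 0)
    if not pde & PRESENT:
        raise LookupError("page fault: page table not present")
    table_base = pde & 0xFFFFF000      # high 20 bits, low 12 bits zeroed

    pte = page_tables[table_base].get(table_index, 0)
    if not pte & PRESENT:
        raise LookupError("page fault: page not present")
    frame_base = pte & 0xFFFFF000      # 20-bit frame number, shifted left by 12

    return frame_base | offset         # concatenate with the low 12 bits


directory = {1: 0x0000_2000 | PRESENT}
tables = {0x0000_2000: {3: 0x0008_9000 | PRESENT}}
print(hex(translate(directory, tables, 0x0040_3004)))   # 0x89004
```

Here the linear address 0x00403004 splits into directory index 1, table index 3 and offset 4, so the walk lands in frame 0x89000.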

Benefits of a virtual address space (VAS):

  1. Hiding the lower layers: programming against virtual addresses is more convenient
  2. Permission control: prevents illegal memory accesses
  3. Efficiency: the virtual address space can be larger than physical memory, and memory can be allocated flexibly (caching, LRU). Reference: kernel segmentation course (PDF)

Memory segmentation overlaps in function with paging, so many newer architectures and operating systems, such as x86-64 and Linux, tend to use flat segmentation.

Linux uses four segments:

All user processes use the same user code segment descriptor and user data segment descriptor, __USER_CS and __USER_DS, so every process in user mode has the same values in its CS and DS registers. Whenever a process enters the kernel, or an interrupt or exception is handled, the kernel code and data segment descriptors __KERNEL_CS and __KERNEL_DS are used instead. Note that the kernel data segment also serves as the kernel stack segment.

A logical address consists of a segment selector (16 bits) plus an offset within the segment (32 bits). In our programs, the address we provide is really just the offset, and the system automatically combines it with the segment selector in CS. In other words, the logical addresses we use only carry the offset part and leave the selector implicit. The base of all four segment descriptors is 0x00000000, so translating a logical address to a linear address through segmentation changes nothing: logical address = linear address (both are simply the offset).
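
A small sketch of why a zero base makes segmentation a no-op. The base of 0 comes from the text above; the limit field of 0xFFFFF with the granularity bit set is my assumption about how such a flat segment would be described, and the function names are hypothetical.

```python
PAGE_GRANULE = 4096  # with G = 1 the limit field counts 4 KiB units


def effective_limit(limit_field: int, g_bit: int) -> int:
    """Highest valid offset for a 20-bit limit field and granularity bit G."""
    if g_bit:
        return (limit_field + 1) * PAGE_GRANULE - 1
    return limit_field


def flat_linear(offset: int, base: int = 0x00000000) -> int:
    """With a zero base, the linear address is just the offset."""
    return (base + offset) & 0xFFFFFFFF


print(hex(effective_limit(0xFFFFF, g_bit=1)))   # 0xffffffff
print(hex(flat_linear(0x0804_8000)))            # 0x8048000
```

Under these assumptions the segment spans the whole 4 GiB space and every offset maps to itself, so only paging does real translation work.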

Virtual memory management for WASM

In addition to memory segmentation management, applications also have the concept of segments, which describe the organization of data by a program. A typical Linux program has the following sections:

The WASM memory layout can be simplified as:

The stack grows from high addresses toward low ones, and the heap grows from low addresses toward high ones. Because the stack is placed first, you need to give it a maximum size at compile time.


Stack space can be set as follows:

# Set the maximum stack size to 8 MiB
clang \
--target=wasm32 \
-O3 \
-flto \
-nostdlib \
-Wl,--no-entry \
-Wl,--export-all \
-Wl,--lto-O3 \
-Wl,-z,stack-size=$((8 * 1024 * 1024)) \
-o add.wasm \
add.c
// LLVM:
// <https://github.com/llvm-mirror/lld/blob/master/wasm/Driver.cpp#L355>
Config->InitialMemory = args::getInteger(Args, OPT_initial_memory, 0);
Config->GlobalBase = args::getInteger(Args, OPT_global_base, 1024);
Config->MaxMemory = args::getInteger(Args, OPT_max_memory, 0);
Config->ZStackSize =
      args::getZOptionValue(Args, OPT_z, "stack-size", WasmPageSize);
// ...

Reference: dassur.ma/things/c-to…
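
To get a feel for the numbers, here is a rough Python sketch of how such a layout could be computed. It assumes the order static data, then stack, then heap, a 16-byte stack alignment, and the global base of 1024 seen in the snippet above. The function names and the 100-byte data size are made up for illustration, and the real linker may place things differently.

```python
from __future__ import annotations

WASM_PAGE_SIZE = 64 * 1024   # one WebAssembly page is 64 KiB


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def layout(data_size: int, stack_size: int, global_base: int = 1024) -> dict[str, int]:
    """Assumed layout: [global_base .. data][stack, grows down][heap, grows up]."""
    data_end = global_base + data_size
    stack_low = align_up(data_end, 16)    # lowest address the stack may reach
    stack_top = stack_low + stack_size    # initial stack pointer
    heap_base = stack_top                 # the heap starts where the stack ends
    return {
        "data_end": data_end,
        "stack_low": stack_low,
        "stack_top": stack_top,
        "heap_base": heap_base,
        "initial_pages": align_up(heap_base, WASM_PAGE_SIZE) // WASM_PAGE_SIZE,
    }


print(layout(data_size=100, stack_size=8 * 1024 * 1024))
```

With an 8 MiB stack the stack alone takes 128 pages, so the module needs at least 129 pages before the heap has any room to grow.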

Memory alignment

A memory access is aligned if its effective address is a multiple of the alignment attribute of that access; otherwise it is misaligned. Aligned and misaligned accesses have the same behavior, but aligned accesses are generally faster for the CPU.

The native alignment is 32 bits on wasm32 and 64 bits on wasm64.

Effective Address

That is, the address actually being accessed: the address operand plus the instruction's offset.
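
A minimal sketch of both definitions, with hypothetical function names. I assume here that the alignment attribute is given as a power-of-two exponent (1 means 2 bytes, 2 means 4 bytes), which is how the binary format encodes it; if you read it as a byte count, the check changes accordingly.

```python
def effective_address(address_operand: int, offset: int) -> int:
    """Address actually accessed: the dynamic operand plus the static offset."""
    return address_operand + offset


def is_aligned(address: int, align_exponent: int) -> bool:
    """True if address is a multiple of 2 ** align_exponent bytes."""
    return address % (1 << align_exponent) == 0


ea = effective_address(address_operand=3, offset=3)
print(ea)                    # 6
print(is_aligned(ea, 1))     # True: 6 is a multiple of 2
print(is_aligned(ea, 2))     # False: 6 is not a multiple of 4
```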


i32.const 3       ;; address_operand = 3
i64.const 1234    ;; value
i64.store16 1 3   ;; alignment=1, offset=3, effective_address = 3 + 3 = 6

The above is aligned, but if:

i32.const 3       ;; address_operand = 3
i64.const 1234    ;; value
i64.store16 2 3   ;; alignment=2

Then there will be a mismatch: