The Master said, "Holding high office without generosity, performing ceremony without reverence, attending mourning without grief: how could I bear to look upon such things?" The Analects of Confucius: Ba Yi
This post is part of the hundred-blog series: v14.xx HongMeng kernel source code analysis (memory assembly) | What is the foundation of the virtual memory implementation
Memory management:
- v11.xx HongMeng kernel source code analysis (memory allocation) | What memory allocation methods are there
- v12.xx HongMeng kernel source code analysis (memory management) | What does the virtual memory panorama look like
- v14.xx HongMeng kernel source code analysis (memory assembly) | What is the foundation of the virtual memory implementation
- v15.xx HongMeng kernel source code analysis (memory mapping) | Where is the "virtual" in virtual memory
- v16.xx HongMeng kernel source code analysis (memory rules) | What does memory management actually manage
- v17.xx HongMeng kernel source code analysis (physical memory) | How physical memory is managed
ARM CP15 coprocessor
The ARM processor uses the registers of coprocessor 15 (CP15) to control the cache, TCM, and memory management. CP15 registers are accessed with the MRC and MCR instructions. CP15 contains 16 32-bit registers, numbered c0 to c15. This post focuses on three of them: c7, c2 and c13.
Start by disassembling a piece of assembly code
Let's look at a bit of assembly. You cannot read kernel source code without knowing some assembly, but don't be afraid, it is not that scary. Going from shallow to deep, the kernel is actually quite fun. See arm.h. It is full of things like this.
#define DSB __asm__ volatile("dsb": : :"memory")
#define ISB __asm__ volatile("isb": : :"memory")
#define DMB __asm__ volatile("dmb": : :"memory")
STATIC INLINE VOID OsArmWriteBpiallis(UINT32 val)
{
    __asm__ volatile("MCR p15, 0, %0, c7, c1, 6": :"r"(val));
    __asm__ volatile("isb": : :"memory");
}
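The MCR instruction writes the value of an ARM general-purpose register (whichever register the compiler picks for val, R0 in the simplest case) into CP15 register c7. The value is passed in as val.
For example, OsArmWriteBpiallis(0) involves four things:
1. The input operand "r"(val) asks the compiler to place val (here 0) in a general-purpose register, which is substituted for %0. It is an input constraint, not a declaration that the register will be changed. Compilers are very powerful, far more than a tool that mechanically translates code, as is commonly thought.
2. volatile tells the compiler not to optimize the statement away, so the target instruction is generated as written.
3. The "memory" clobber on "isb" tells the compiler that the contents of memory may have changed, so it must not keep memory values cached in registers across the statement or reorder memory accesses around it. This is a compiler-level barrier; it does not itself invalidate the hardware cache. The ISB instruction then flushes the processor pipeline so that later instructions see the CP15 change.
4. MCR writes the register value to c7, a register of the CP15 coprocessor. What is the c7 register responsible for? Refer to the table below.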
What registers does CP15 have?
c7 is responsible for cache and write-buffer control! The other registers will be covered below; a general idea is enough for now.
Where does the MMU obtain page table information? The answer is: the TTB
TTB register (Translation Table Base)
According to the table above, the TTB register is the c2 register of the CP15 coprocessor. It stores the base address of the page table, that is, the base address of the first-level translation descriptor table. The kernel provides read and write functions around the TTB. Put simply, the kernel changes and reads the register value from software, while the MMU reads the register value directly in hardware. By changing the value, the kernel hands the MMU a different page table for converting a process's virtual addresses into physical addresses. Remember? The page table of each process is independent!
So when is the value inside changed? Changing the page table means the MMU is performing a context switch! Let's go straight to the code.
MMU context
The TTB write function is called from only one place. LOS_ArchMmuContextSwitch is without a doubt the key function.
typedef struct ArchMmu {
    LosMux mtx;          /**< arch mmu page table entry modification mutex lock */
    VADDR_T *virtTtb;    /**< translation table base virtual addr */
    PADDR_T physTtb;     /**< translation table base phys addr */
    UINT32 asid;         /**< TLB asid */
    LOS_DL_LIST ptList;  /**< page table vm page list */
} LosArchMmu;

// MMU context switch
VOID LOS_ArchMmuContextSwitch(LosArchMmu *archMmu)
{
    UINT32 ttbr;
    UINT32 ttbcr = OsArmReadTtbcr(); // Read the TTB control register (TTBCR)
    if (archMmu) {
        ttbr = MMU_TTBRx_FLAGS | (archMmu->physTtb); // Physical base address of the process page table, plus flags
        /* enable TTBR0 */
        ttbcr &= ~MMU_DESCRIPTOR_TTBCR_PD0;
    } else {
        ttbr = 0;
        /* disable TTBR0 */
        ttbcr |= MMU_DESCRIPTOR_TTBCR_PD0;
    }

    /* From armv7a arm B3.10.4, we should do synchronization changes of ASID and TTBR. */
    OsArmWriteContextidr(LOS_GetKVmSpace()->archMmu.asid); // First switch the ASID to that of the kernel space
    ISB;
    OsArmWriteTtbr0(ttbr); // Write the page table base address to TTBR0
    ISB;
    OsArmWriteTtbcr(ttbcr); // Write the TTB control bits
    ISB;
    if (archMmu) {
        OsArmWriteContextidr(archMmu->asid); // Write the process ASID to CP15 register c13
        ISB;
    }
}

// c13: Address Space ID (ASID), the process identifier
STATIC INLINE VOID OsArmWriteContextidr(UINT32 val)
{
    __asm__ volatile("MCR p15, 0, %0, c13, c0, 1": :"r"(val));
    __asm__ volatile("isb": : :"memory");
}
Now take a look at where LOS_ArchMmuContextSwitch is called.
There are four places where the MMU context is switched
First: when the scheduling algorithm selects a process whose address space differs from the current one, the page table in use changes with it, so the MMU context must be switched. Let's look directly at the code. There is not much of it, so it is posted in full with annotations. If you do not remember the scheduling algorithm, see the HongMeng kernel source code analysis (scheduling mechanism) post in this series, which explains it in detail.
// Scheduling algorithm - process switching
STATIC VOID OsSchedSwitchProcess(LosProcessCB *runProcess, LosProcessCB *newProcess)
{
    if (runProcess == newProcess) {
        return;
    }

#if (LOSCFG_KERNEL_SMP == YES)
    runProcess->processStatus = OS_PROCESS_RUNTASK_COUNT_DEC(runProcess->processStatus);
    newProcess->processStatus = OS_PROCESS_RUNTASK_COUNT_ADD(newProcess->processStatus);

    LOS_ASSERT(!(OS_PROCESS_GET_RUNTASK_COUNT(newProcess->processStatus) > LOSCFG_KERNEL_CORE_NUM));
    if (OS_PROCESS_GET_RUNTASK_COUNT(runProcess->processStatus) == 0) { // No task of the old process is running any more
#endif
        runProcess->processStatus &= ~OS_PROCESS_STATUS_RUNNING;
        if ((runProcess->threadNumber > 1) && !(runProcess->processStatus & OS_PROCESS_STATUS_READY)) {
            runProcess->processStatus |= OS_PROCESS_STATUS_PEND;
        }
#if (LOSCFG_KERNEL_SMP == YES)
    }
#endif
    LOS_ASSERT(!(newProcess->processStatus & OS_PROCESS_STATUS_PEND)); // Assert that the new process is not blocked
    newProcess->processStatus |= OS_PROCESS_STATUS_RUNNING; // Set the process state to running

    if (OsProcessIsUserMode(newProcess)) { // Only user-mode processes need an MMU context switch
        LOS_ArchMmuContextSwitch(&newProcess->vmSpace->archMmu); // New process -> virtual space -> MMU part as the argument
    }

#ifdef LOSCFG_KERNEL_CPUP
    OsProcessCycleEndStart(newProcess->processID, OS_PROCESS_GET_RUNTASK_COUNT(runProcess->processStatus) + 1);
#endif /* LOSCFG_KERNEL_CPUP */

    OsCurrProcessSet(newProcess); // Set the process as g_runProcess

    if ((newProcess->timeSlice == 0) && (newProcess->policy == LOS_SCHED_RR)) { // Time slice used up, or a freshly initialized process
        newProcess->timeSlice = OS_PROCESS_SCHED_RR_INTERVAL; // Reassign the time slice, 20 ms by default
    }
}
Note that there are two kinds of context switch here: the MMU context switch, and the CPU context switch caused by a task switch.
Second: the MMU context is switched when an ELF file is loaded, that is, when a new process is born. The details will be covered in HongMeng kernel source code analysis (boot loading); please follow the series for updates.
The remaining two occur when a virtual space is reclaimed and when a virtual space is refreshed; read the code for those yourself.
How does the MMU quickly find a physical address from a virtual address? The answer is the TLB. Note that the TLB is a cache, not a register.
TLB (Translation Lookaside Buffer)
The TLB is a hardware cache. Page tables are usually large and are stored in memory, so once the MMU is introduced, reading an instruction or a piece of data takes at least two memory accesses: one to look up the page table and one to fetch the content itself. To reduce the processor performance lost to the MMU, the TLB was introduced. Its name is translated as "translation lookaside buffer", or simply "fast table". Put simply, the TLB is a cache of the page table that holds copies of the page table entries most likely to be accessed right now. The page table in memory is consulted only when the TLB cannot complete the translation, which reduces the performance cost of page table lookups.
Here is how it works in detail.
1. The page table base address in the figure is the value of the TTB register. The whole page table is very large; how large will be discussed later.
2. The virtual address is the address used by the program, also called the logical address. It is the address given to the CPU, and the MMU must convert it into a physical memory address before real instructions and data can be fetched.
3. The TLB is a miniature version of the page table. The MMU first looks for the physical page in the TLB and only then in the page table. Because each process has its own page table but there is only one TLB, entries from different processes would be indistinguishable without some kind of tag, and several processes could end up mapped to the same physical page frame without noticing. A physical page should be mapped through only one page table at a time. So besides the TLB itself, one more thing is needed to sort this out: the unique identifier of a process at the mapping level, the ASID.
ASID register
The Address Space ID (ASID) is the process identifier and lives in register c13 of the CP15 coprocessor. The ASID uniquely identifies a process and provides address space protection for it. When the TLB tries to resolve a virtual page number, it checks that the ASID of the currently running process matches the ASID associated with the virtual page. If they do not match, the lookup is treated as a TLB miss. Besides providing address space protection, the ASID allows the TLB to hold entries for several processes at the same time. If the TLB did not support separate ASIDs, it would have to be flushed (invalidated) every time a different page table is selected, for example on a context switch, to ensure that the next process does not use a stale address translation.
A bit in the page table entry specifies whether the entry is global (nG=0, accessible to all processes) or non-global (nG=1, accessible only to this process). If the entry is global, the TLB does not tag it with an ASID. If the entry is non-global, the TLB tags it with the ASID, and when the MMU looks up the entry in the TLB it must check whether that ASID matches the ASID of the current process. Only when the two match does the current process have access to the entry.
See? Flushing the TLB on every MMU context switch would guarantee that the TLB holds only the mappings of the new process, and it would work, but the efficiency would be far too low! Processes are switched many times per second, and address translation happens far more often than that, so flushing every time is simply not realistic. In reality the TLB holds many records for physical memory occupied by other processes, which of course are using physical memory too. So when an application allocates 10 MB with new and thinks the memory is its own, at the kernel level it does not really belong to it yet. Only the 1 MB you are touching at a given moment is truly backed by physical memory that is yours, and once you are switched out in favor of another process, the 1 MB you were using has most likely left physical memory and been swapped out to disk. Is that clear? Those who only care about application development can say this is none of their business, and that is fine, but anyone working with the kernel has to understand that this is happening every second.
I will leave you with one last function: how are ASIDs allocated?
/* allocate and free asid */
status_t OsAllocAsid(UINT32 *asid)
{
    UINT32 flags;
    LOS_SpinLockSave(&g_cpuAsidLock, &flags); // Protect the ASID pool with a spinlock
    UINT32 firstZeroBit = LOS_BitmapFfz(g_asidPool, 1UL << MMU_ARM_ASID_BITS); // Find the first free bit in the pool
    if (firstZeroBit >= 0 && firstZeroBit < (1UL << MMU_ARM_ASID_BITS)) {
        LOS_BitmapSetNBits(g_asidPool, firstZeroBit, 1); // Mark that ASID as used
        *asid = firstZeroBit;
        LOS_SpinUnlockRestore(&g_cpuAsidLock, flags);
        return LOS_OK;
    }

    LOS_SpinUnlockRestore(&g_cpuAsidLock, flags);
    return firstZeroBit;
}
Intensive reading of the kernel source code
The annotated kernel source code is kept in sync across four code hosting sites, >> view the Gitee repository
A hundred blog posts of analysis. Digging deep into the kernel
These articles are sorted out while adding comments to the HongMeng kernel source code. The content is based on the source code and, as far as possible, uses analogies from everyday life to place kernel knowledge in a concrete scene, so that it is vivid and easy to understand and remember. It is important to speak in a way that others can understand! The hundred blog posts are by no means a pile of obscure concepts copied from a Baidu search. That would not be interesting. The hope is to bring the kernel to life and make it feel more approachable. It is hard, really hard, but there is no turning back. 😛 Just as code bugs need constant debugging, the articles and annotations will contain many mistakes and omissions. Please forgive them; they will be corrected repeatedly and updated continuously. xx stands for the number of revisions, with the aim of being refined, concise and comprehensive, and of creating high-quality content.
HongMeng station | A little progress every day. Original work takes effort; reprints are welcome, but please credit the source.