The Master said, "Hold firmly to good faith and love learning; keep to the good Way until death. Do not enter a state in danger, and do not live in a state in disorder. When the Way prevails in the world, show yourself; when it does not, stay hidden. When the state follows the Way, being poor and lowly is shameful; when the state has lost the Way, being rich and honored is shameful." The Analects of Confucius: Tai Bo

Hundred-blog series. This article: v26.xx HongMeng kernel source code analysis (spinlocks) | A good comrade who deserves a chastity memorial arch

Related articles on process communication:

  • v26.xx HongMeng kernel source code analysis (spinlocks) | A good comrade who deserves a chastity memorial arch

  • v27.xx HongMeng kernel source code analysis (mutex) | The mutex, fuller than the spin lock

  • v28.xx HongMeng kernel source code analysis (process communication) | A quick look at nine interprocess communication methods

  • v29.xx HongMeng kernel source code analysis (semaphore) | Who is responsible for task synchronization

  • v30.xx HongMeng kernel source code analysis (event control) | A many-to-many synchronization scheme between tasks

  • v33.xx HongMeng kernel source code analysis (message queue) | How processes pass data asynchronously

This article explains the spin lock.

It is recommended to read the process/thread series before reading this article.

Where are spinlocks used in the kernel? Look at the picture:

Overview

As the name suggests, a spin lock is a lock that turns by itself, much like the lock on a toilet cubicle. Before anyone enters, the sign is green and the cubicle is available. Once someone is inside and turns the lock, the sign outside turns red to show that the cubicle is in use, and anyone outside has to wait. This is a figurative metaphor, but it matches what really happens.

In a multi-core environment, all CPU cores share the same memory space, so several cores may access the same resource. A mutual exclusion mechanism is therefore needed to guarantee that only one core operates on the resource at a time. The spin lock is such a mechanism.

  • With a spin lock, when a thread tries to acquire a lock that is already held by a thread on another CPU, it loops, repeatedly checking whether the lock can be acquired, and leaves the loop only after the other CPU releases the lock.

  • A spin lock is designed to be held only for a very short time and by only one task at a time. The CPU holding the spin lock must not go to sleep, because other CPUs are waiting for the lock. To prevent deadlock, context switching is not allowed either, so scheduling is disabled while the lock is held.

  • Spin locks are similar to mutexes: both solve the problem of mutually exclusive use of a shared resource, and both can have at most one owner at any time. Their scheduling behavior differs, however. With a mutex, an applicant is blocked if the lock is already held. A spin lock does not block the caller; the caller keeps looping and checking whether the lock has been released.

Although both deal with competition for shared resources, spin locks emphasize competition between CPU cores, while mutexes emphasize competition between tasks (including tasks on the same CPU core).
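The busy-wait idea can be sketched in portable C11. This is only an illustration and not the kernel's implementation, which is written in assembly as shown later in this article. DemoSpinlock, DemoSpinInit, DemoSpinLock and DemoSpinUnlock are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical demo type: one atomic flag, clear = free, set = held. */
typedef struct {
    atomic_flag held;
} DemoSpinlock;

static void DemoSpinInit(DemoSpinlock *lock)
{
    assert(lock != NULL);
    atomic_flag_clear(&lock->held);
}

/* Loop ("spin") until this caller is the one that sets the flag. */
static void DemoSpinLock(DemoSpinlock *lock)
{
    assert(lock != NULL);
    while (atomic_flag_test_and_set_explicit(&lock->held, memory_order_acquire)) {
        /* busy-wait: the caller is never blocked or descheduled here */
    }
}

/* Clear the flag so that another spinning caller can take the lock. */
static void DemoSpinUnlock(DemoSpinlock *lock)
{
    assert(lock != NULL);
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

A caller wraps its critical section between DemoSpinLock and DemoSpinUnlock. Unlike a mutex, a waiting caller keeps using its CPU the whole time, which is why the protected code has to be short.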

What does a spin lock look like?

    typedef struct Spinlock { // Spin lock structure
        size_t      rawLock; // The raw lock
    #if (LOSCFG_KERNEL_SMP_LOCKDEP == YES) // Switch of the deadlock detection module
        UINT32      cpuid; // The CPU that holds the lock
        VOID        *owner; // The task that holds the lock
        const CHAR  *name; // Lock name
    #endif
    } SPIN_LOCK_S;

The structure is simple. It contains a macro-guarded section for deadlock detection, which is turned off by default, so the only variable really used is rawLock. You will not find rawLock being modified anywhere in the C code, however; that is done by a piece of assembly code. By the end of this article you will see why the spin lock here has to be implemented in assembly.

Spin lock usage flow

Spin locks are used on multi-core systems to resolve competition between CPUs for resources. The flow is simple and has three steps.

  • Create a spin lock: use LOS_SpinInit to initialize a spin lock, or SPIN_LOCK_INIT to initialize a statically allocated spin lock.

  • Apply for a spin lock: use LOS_SpinLock, LOS_SpinTrylock, or LOS_SpinLockSave to apply for the specified spin lock. If the application succeeds, execution continues into the code protected by the lock. If it fails, LOS_SpinLock and LOS_SpinLockSave busy-wait inside the call until the lock is obtained, while LOS_SpinTrylock returns at once.

  • Release a spin lock: use LOS_SpinUnlock or LOS_SpinUnlockRestore. After the protected code has run, release the corresponding spin lock so that other cores can apply for it.

Several key functions

The spinlock module is implemented with inline functions in los_spinlock.h. There is not much code; it comes down to three functions.

ArchSpinLock(&lock->rawLock);
ArchSpinTrylock(&lock->rawLock)
ArchSpinUnlock(&lock->rawLock);

You could say that mastering them is mastering the spin lock, but all three functions are implemented in assembly; see the los_dispatch.s file. Two earlier articles in this series have already covered assembly code, so these three pieces of code should be easy to follow. The function argument is passed in r0, that is, r0 holds the address of lock->rawLock, and acquiring and releasing the lock switches lock->rawLock between 1 and 0.

ArchSpinLock assembly code

    FUNCTION(ArchSpinLock)  @ hold on until the lock is obtained
        mov     r1, #1          @ r1 = 1
    1:                          @ loop: SEV is a broadcast event, so waking up does not mean lock->rawLock has changed
        ldrex   r2, [r0]        @ r0 = &lock->rawLock, r2 = lock->rawLock
        cmp     r2, #0          @ compare r2 with 0
        wfene                   @ not equal: the lock is held, so this CPU core goes to sleep
        strexeq r2, r1, [r0]    @ the CPU is awake here: try lock->rawLock = 1, r2 = 0 if the write succeeds
        cmpeq   r2, #0          @ check whether r2 equals 0; if so, the lock has been obtained
        bne     1b              @ not equal: go round the loop again
        dmb                     @ data memory barrier: memory accesses before it complete before those after it
        bx      lr              @ the lock is definitely held now; return to the caller of ArchSpinLock

Reading this piece of assembly shows how the spin lock really works and why it has to be implemented in assembly. A CPU that cannot get the lock goes to sleep rather than giving up. Note that it is the CPU core that sleeps, not the thread, and only an assembly instruction can put the CPU to sleep. C simply cannot express code that makes the CPU really sleep.

ArchSpinTrylock Assembly code

It’s impossible to know the real difference between ArchSpinTrylock and ArchSpinLock without reading the following piece of assembly code.

    FUNCTION(ArchSpinTrylock)  @ try to get the lock; give up if it cannot be obtained
        mov     r1, #1          @ r1 = 1
        mov     r2, r0          @ r2 = r0
        ldrex   r0, [r2]        @ r2 = &lock->rawLock, r0 = lock->rawLock
        cmp     r0, #0          @ compare r0 with 0
        strexeq r0, r1, [r2]    @ try lock->rawLock = 1; r0 = 0 if the write succeeds, otherwise r0 = 1
        dmb                     @ data memory barrier: memory accesses before it complete before those after it
        bx      lr              @ return to the caller of ArchSpinTrylock

ArchSpinTrylock returns at once if it cannot get the lock, whereas ArchSpinLock sleeps and waits until its husband (lock->rawLock = 0) returns. ArchSpinLock really deserves a chastity memorial arch!
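The difference between the two can also be sketched with C11 atomics. This is a model written for illustration, not the kernel code: ModelSpinlock, ModelSpinTrylock, ModelSpinLock and ModelSpinUnlock are hypothetical names, a compare-and-swap stands in for the LDREX/STREX pair, and the WFE sleep is reduced to a plain retry. The return convention (0 on success, 1 on failure) follows the r0 values described in the ArchSpinTrylock comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Model of lock->rawLock: 0 = free, 1 = held. */
typedef struct {
    atomic_size_t rawLock;
} ModelSpinlock;

/* One attempt only: returns 0 if the lock was taken, 1 if it was busy. */
static int ModelSpinTrylock(ModelSpinlock *lock)
{
    size_t expected = 0;
    assert(lock != NULL);
    if (atomic_compare_exchange_strong_explicit(&lock->rawLock, &expected, 1,
            memory_order_acquire, memory_order_relaxed)) {
        return 0;
    }
    return 1;
}

/* Keeps retrying until the lock is taken. The assembly version sleeps in
 * WFE between attempts instead of retrying at full speed. */
static void ModelSpinLock(ModelSpinlock *lock)
{
    while (ModelSpinTrylock(lock) != 0) {
        /* retry */
    }
}

/* Writes 0 back so that the next attempt can succeed. */
static void ModelSpinUnlock(ModelSpinlock *lock)
{
    assert(lock != NULL);
    atomic_store_explicit(&lock->rawLock, 0, memory_order_release);
}
```

A caller that must not wait can use the try variant and do something else when it returns 1; a caller that needs the lock no matter what uses the looping variant.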

ArchSpinUnlock Assembly code

    FUNCTION(ArchSpinUnlock)  @ release the lock
        mov     r1, #0          @ r1 = 0
        dmb                     @ data memory barrier: memory accesses before it complete before those after it
        str     r1, [r0]        @ set lock->rawLock = 0
        dsb                     @ data synchronization barrier
        sev                     @ broadcast an event to every CPU to wake the sleeping ones
        bx      lr              @ return to the caller of ArchSpinUnlock

The code uses several uncommon assembly instructions. They are explained one by one below.

Assembly instructions WFI/WFE/SEV

WFI (Wait For Interrupt): the wait-for-interrupt instruction. WFI is generally used for CPU idle: it tells the processor that there is nothing to do until an interrupt or a similar exception occurs.

Each CPU has its own idle task, where the CPU stays when it has nothing to do. The task is an infinite loop around the WFI instruction, and an interrupt wakes the CPU up to work. Interrupts are divided into hardware and software interrupts: system calls are implemented through software interrupts, devices raise hardware interrupts, and both can make the CPU start working. The code that runs when the CPU is idle is very simple:

LITE_OS_SEC_TEXT WEAK VOID OsIdleTask(VOID) // The CPU stays here when it has nothing to do
{
    while (1) { // There is only an infinite loop
        Wfi(); // WFI instruction: the Arm core enters a low-power standby state and sleeps until an interrupt arrives
    }
}

WFE (Wait For Event): the wait-for-event instruction. The processor has nothing to do until an event is generated, for example by the SEV instruction, so a WFE must be paired with a corresponding SEV that wakes it up. A typical use of WFE is in a spin lock that protects resources shared between CPU cores. The flow is:

  • Resources are free at the beginning

  • CPU core 1 accesses resources, holds a lock, and obtains resources

  • CPU core 2 tries to access the resource, finds it is not free, executes the WFE instruction, and enters a low-power (sleep) state

  • CPU core 1 finishes with the resource, releases the lock, and executes SEV to wake CPU core 2

  • CPU core 2 gets the resource

In older spin lock implementations, a CPU core entered a busy loop when the resource was unavailable. Inserting a WFE instruction into the loop greatly reduces power consumption.

SEV (Send Event): the send-event instruction. SEV broadcasts an event to all processors in a multiprocessor system to wake up sleeping CPUs.

The implementation of SEV and WFE is much like the observer pattern of design patterns.

Assembly instructions LDREX/STREX

LDREX reads a value from memory and marks that piece of memory for exclusive access:

LDREX Rx, [Ry]: reads the 4-byte value at the address in register Ry, stores it in register Rx, and marks the memory pointed to by Ry as exclusive access.

If the memory is already marked as exclusive access when LDREX executes, the instruction still executes normally.

STREX updates memory conditionally: it checks whether the piece of memory is marked as exclusive access and decides on that basis whether to perform the update:

STREX Rx, Ry, [Rz]: if the memory is marked as exclusive access when the instruction executes, the value of register Ry is written to the memory pointed to by Rz and register Rx is set to 0. After the instruction succeeds, the exclusive access mark is cleared.

If the exclusive access mark is not set when the instruction executes, the memory is not updated and register Rx is set to 1.

Once one STREX has succeeded, any later STREX to the same piece of memory finds the exclusive mark cleared and fails. This is how the exclusive access mechanism is realized.
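The rules above can be written down as a small single-threaded model. It is only a sketch of the behavior described in this section, not of real hardware, and ModelWord, ModelLdrex and ModelStrex are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 4-byte memory word plus its exclusive mark. */
typedef struct {
    uint32_t value;
    bool exclusive;
} ModelWord;

/* LDREX Rx, [Ry]: read the word and mark it as exclusive access. */
static uint32_t ModelLdrex(ModelWord *word)
{
    assert(word != NULL);
    word->exclusive = true;   /* marking an already marked word changes nothing */
    return word->value;
}

/* STREX Rx, Ry, [Rz]: returns the value placed in Rx
 * (0 = stored, 1 = not stored). */
static uint32_t ModelStrex(ModelWord *word, uint32_t newValue)
{
    assert(word != NULL);
    if (!word->exclusive) {
        return 1;             /* mark not set: memory stays unchanged */
    }
    word->value = newValue;
    word->exclusive = false;  /* a successful store clears the mark */
    return 0;
}
```

If two cores both read 0 with ModelLdrex and then both call ModelStrex, only the first call returns 0. The second returns 1 and leaves memory untouched, so that core has to go round the loop again, which is what the bne 1b in ArchSpinLock does.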

Programming example

This example implements the following flow.

  • The task Example_TaskEntry initializes the spin lock and creates two tasks, Example_SpinTask1 and Example_SpinTask2, which run on two different cores.

  • Example_SpinTask1 and Example_SpinTask2 each apply for the spin lock. To simulate real work, each task holds the lock, delays for a while, and finally releases it.

  • After 300 ticks, Example_TaskEntry is scheduled again and deletes Example_SpinTask1 and Example_SpinTask2.

#include <string.h> /* memset */
#include "los_spinlock.h"
#include "los_task.h"

/* Spinlock handle id */
SPIN_LOCK_S g_testSpinlock;
/* Task ID */
UINT32 g_testTaskId01;
UINT32 g_testTaskId02;

VOID Example_SpinTask1(VOID)
{
    UINT32 i;
    UINTPTR intSave;

    /* Request spinlock */
    dprintf("task1 try to get spinlock\n");
    LOS_SpinLockSave(&g_testSpinlock, &intSave);
    dprintf("task1 got spinlock\n");
    for(i = 0; i < 5000; i++) {
        asm volatile("nop");
    }

    /* Release the spin lock */
    dprintf("task1 release spinlock\n");
    LOS_SpinUnlockRestore(&g_testSpinlock, intSave);

    return;
}

VOID Example_SpinTask2(VOID)
{
    UINT32 i;
    UINTPTR intSave;

    /* Request spinlock */
    dprintf("task2 try to get spinlock\n");
    LOS_SpinLockSave(&g_testSpinlock, &intSave);
    dprintf("task2 got spinlock\n");
    for(i = 0; i < 5000; i++) {
        asm volatile("nop");
    }

    /* Release the spin lock */
    dprintf("task2 release spinlock\n");
    LOS_SpinUnlockRestore(&g_testSpinlock, intSave);

    return;
}

UINT32 Example_TaskEntry(VOID)
{
    UINT32 ret;
    TSK_INIT_PARAM_S stTask1;
    TSK_INIT_PARAM_S stTask2;

    /* Initializes the spinlock */
    LOS_SpinInit(&g_testSpinlock);

    /* Create task 1 */
    memset(&stTask1, 0, sizeof(TSK_INIT_PARAM_S));
    stTask1.pfnTaskEntry  = (TSK_ENTRY_FUNC)Example_SpinTask1;
    stTask1.pcName        = "SpinTsk1";
    stTask1.uwStackSize   = LOSCFG_TASK_MIN_STACK_SIZE;
    stTask1.usTaskPrio    = 5;
#ifdef LOSCFG_KERNEL_SMP
    /* Bind the task to CPU0 to run */
    stTask1.usCpuAffiMask = CPUID_TO_AFFI_MASK(0);
#endif
    ret = LOS_TaskCreate(&g_testTaskId01, &stTask1);
    if (ret != LOS_OK) {
        dprintf("task1 create failed .\n");
        return LOS_NOK;
    }

    /* Create task 2 */
    memset(&stTask2, 0, sizeof(TSK_INIT_PARAM_S));
    stTask2.pfnTaskEntry = (TSK_ENTRY_FUNC)Example_SpinTask2;
    stTask2.pcName       = "SpinTsk2";
    stTask2.uwStackSize  = LOSCFG_TASK_MIN_STACK_SIZE;
    stTask2.usTaskPrio   = 5;
#ifdef LOSCFG_KERNEL_SMP
    /* Bind the task to CPU1 to run */
    stTask2.usCpuAffiMask = CPUID_TO_AFFI_MASK(1);
#endif
    ret = LOS_TaskCreate(&g_testTaskId02, &stTask2);
    if (ret != LOS_OK) {
        dprintf("task2 create failed .\n");
        return LOS_NOK;
    }

    /* Task sleep 300Ticks */
    LOS_TaskDelay(300);

    /* Delete task 1 */
    ret = LOS_TaskDelete(g_testTaskId01);
    if (ret != LOS_OK) {
        dprintf("task1 delete failed .\n");
        return LOS_NOK;
    }
    /* Delete task 2 */
    ret = LOS_TaskDelete(g_testTaskId02);
    if (ret != LOS_OK) {
        dprintf("task2 delete failed .\n");
        return LOS_NOK;
    }

    return LOS_OK;
}


Results

task2 try to get spinlock
task2 got spinlock
task1 try to get spinlock
task2 release spinlock
task1 got spinlock
task1 release spinlock

Conclusion

  • Spin locks are used to resolve competition for resources between CPU cores.

  • Because CPUs waiting for a spin lock are put to sleep, the code protected by the lock should be short; holding the lock for long can cause unexpected problems and hurts performance.

  • It has to be implemented in assembly, because C cannot express code that puts the CPU into a real sleep while the cores compete for the lock.

Intensive reading of the kernel source code

The annotated kernel source is kept in sync across four code repositories, >> view the Gitee repository

A hundred blogs of analysis, digging deep into the kernel

These articles grew out of the process of annotating the HongMeng kernel source. The content is based on the source code and, wherever possible, uses analogies from everyday life to place kernel concepts in a concrete scene, which makes them vivid and easier to understand and remember. It is important to explain things in a way others can understand! The hundred blogs are definitely not a pile of obscure concepts copied from Baidu; that would be no fun. The hope is to bring the kernel to life and make it feel approachable. It is hard, very hard, but there is no turning back. 😛 Just as code bugs need continuous debugging, the articles and annotations will contain mistakes and omissions; please bear with them, they will be corrected repeatedly and updated continuously. xx is the number of revisions; the aim is refined, concise, comprehensive, high-quality content.


HongMeng station | A little progress every day. Original work is not easy; reprints are welcome, please credit the source.