Series introduction

This series introduces J.U.C. (java.util.concurrent) concurrent programming in Java, moving from principles and theory to practice. It walks you through the key topics step by step, so that the individual technical points connect into a closed loop and form a coherent body of knowledge.

I hope the J.U.C series gives you new understanding and insight.

As a first step, I would like to start this series with the underlying model of the computer, because only when you understand the principles and structure of the computer can you gain a deeper understanding of Java's design (J.U.C, synchronization, the JMM) and use it well.

This section does not cover Java-related knowledge.

Modern theoretical computer models

Modern computers are based on the von Neumann computer model.

Also known as the von Neumann architecture or Princeton architecture, it is a computer design concept in which program instructions and data share the same memory. A computer based on this structure is called a von Neumann computer, also known as a stored-program computer.

When the computer runs a program, it fetches instructions one by one from memory, decodes each one in the controller, fetches the required data from memory, performs the specified arithmetic or logic operation, and then writes the result back to memory at the given address.

It then fetches the next instruction and carries it out under the direction of the controller, and so on, until a halt instruction is encountered.

Storing the program and its data in memory, fetching instructions step by step in program order, and automatically carrying out the operation each instruction specifies is the computer's most basic working model. This principle was first proposed in 1945 by the Hungarian-American mathematician John von Neumann, which is why it is called the von Neumann computer model.
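The fetch, decode, execute loop described above can be sketched as a toy stored-program machine. This is only an illustration: the class name `TinyMachine`, the four opcodes, and the two-word instruction encoding are all invented for this sketch and do not correspond to any real instruction set.

```java
// A toy stored-program machine: instructions and data share one memory array.
// The opcodes and the encoding are hypothetical, chosen only for illustration.
class TinyMachine {
    static final int HALT = 0;   // stop the machine
    static final int LOAD = 1;   // acc = memory[addr]
    static final int ADD = 2;    // acc += memory[addr]
    static final int STORE = 3;  // memory[addr] = acc

    // Runs the program stored at address 0 and returns the memory afterwards.
    static int[] run(int[] memory) {
        int pc = 0;   // program counter
        int acc = 0;  // accumulator register
        while (true) {
            int opcode = memory[pc];     // fetch the instruction
            int addr = memory[pc + 1];   // fetch its operand address
            pc += 2;
            switch (opcode) {            // decode and execute
                case LOAD:  acc = memory[addr]; break;
                case ADD:   acc += memory[addr]; break;
                case STORE: memory[addr] = acc; break;
                case HALT:  return memory;
                default: throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    // Program: memory[10] = memory[8] + memory[9], then halt.
    static int[] sampleProgram() {
        return new int[] {
            LOAD, 8,    // addresses 0-1
            ADD, 9,     // addresses 2-3
            STORE, 10,  // addresses 4-5
            HALT, 0,    // addresses 6-7
            1, 1, 0     // data at addresses 8, 9, 10
        };
    }

    public static void main(String[] args) {
        int[] memory = run(sampleProgram());
        System.out.println(memory[10]); // prints 2
    }
}
```

Note that the program and its data live in the same array, which is the defining property of the stored-program model.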

The five core components of a computer

  • Controller (Control)

    • The central nervous system of the whole computer. It interprets the control information provided by the program and acts on it, schedules programs, data, and addresses, and coordinates the work of every part of the computer as well as access to memory and peripherals.
  • Arithmetic unit (Datapath)

    • The arithmetic unit performs all kinds of arithmetic and logical operations on data, that is, it processes the data.
  • Storage (Memory)

    • Memory stores programs, data, and various signals, commands, and other information, and provides this information when it is needed.
  • Input (Input system)

    • Details omitted. Input devices such as the keyboard and mouse.
  • Output (Output system)

    • Details omitted. Output devices such as printers.

Above is a flow chart of a computer model

  • Arithmetic unit
    • This is actually the work the CPU does
  • Memory
    • The RAM in your computer

The key part of the figure above is the middle: the essential logic is the CPU and storage, that is, how the CPU stores and computes data, and how the CPU and storage communicate with each other.

Principles of modern computer hardware architecture

The schematic diagram of computer hardware is shown below.

Expansion slot: refers to the memory module.

We can focus on the CPU, the I/O bus, and the expansion slots. So why is the structure designed this way?

The CPU, the memory, and peripherals such as the display, mouse, and keyboard all communicate with each other through the I/O bus.

The I/O bus can be understood as a high-speed channel shared by all of these modules. The CPU runs at frequencies in the GHz range, which memory modules are far from able to match, and, as anyone who plays games knows, the other components in the computer also have to communicate over the I/O bus. So there are many modules on the bus, while the CPU runs at a much higher frequency than any of them.

For this reason the CPU is designed with a CPU cache: the instructions it receives are copied into the CPU cache and used for computation there. In terms of access speed, registers > L1 > L2 > L3 > main memory, and main memory reads and writes are far slower than the CPU cache, which is one of the reasons the CPU cache exists.

Because memory runs at a far lower frequency than the CPU, the CPU cache was introduced. Compiled instructions travel from memory over the I/O bus into the CPU cache, where they are used for computation and storage.

CPU

Internally, the CPU is divided into three main types of units.

  • The control unit

    • The control unit is the command and control center of the whole CPU. It consists of the Instruction Register (IR), the Instruction Decoder (ID), and the Operation Controller (OC), and it is essential for coordinating the orderly work of the whole computer. Following the user's program, it fetches each instruction from memory in turn and places it in the IR, decodes (analyzes) it to determine which operation should be carried out, and then, through the OC, sends micro-operation control signals to the relevant components according to the required timing. The OC mainly comprises a beat pulse generator, a control matrix, a clock pulse generator, a reset circuit, and power-off circuit control logic.
  • The operation unit

    • The operation unit is the core of the arithmetic unit. It can perform both arithmetic operations (basic operations such as addition, subtraction, and multiplication, and their extensions) and logical operations (shifts, logical tests, and comparisons of two values). Unlike the control unit, the operation unit acts only on command: every operation it performs is directed by control signals issued by the control unit, so it is an execution component.
  • Storage unit

    • The storage unit consists of the on-chip CPU cache and the register set. It is where data is held temporarily inside the CPU, both data waiting to be processed and data that has already been processed. Registers are internal components of the CPU with very high read and write speed, so data transfer between registers is very fast. Using registers reduces the number of times the CPU has to access memory, which improves the CPU's working speed. The register set is divided into special-purpose registers and general-purpose registers. Special-purpose registers have fixed functions and each stores a specific kind of data. General-purpose registers have no fixed role and can be used as the program requires.

CPU registers

Each CPU contains a set of registers, which form the basis of storage inside the CPU. The CPU can operate on registers much faster than on main memory, because accessing a register is much faster than accessing main memory.

CPU cache

Cache memory is a small but very fast memory located between the CPU and main memory. Because the CPU is much faster than main memory, the CPU has to wait for some time whenever it reads or writes data directly in memory. The cache holds data that the CPU has just used or uses repeatedly. When the CPU needs that data again, it can fetch it directly from the cache, which reduces CPU waiting time and improves system efficiency.
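To make the hit-or-miss behaviour concrete, here is a minimal sketch of a direct-mapped cache that only counts hits and misses. The class `DirectMappedCache`, its sizes (4 lines of 8 addresses each), and the access pattern are assumptions made for this illustration; real CPU caches are set-associative and far more elaborate.

```java
// A simplified direct-mapped cache that tracks which memory block each line holds.
// All sizes are hypothetical; the model stores no data, it only counts hits and misses.
class DirectMappedCache {
    private final int[] tags;       // memory block currently held by each cache line
    private final boolean[] valid;  // whether each cache line holds anything yet
    private final int lineSize;     // number of addresses covered by one cache line
    int hits = 0;
    int misses = 0;

    DirectMappedCache(int lines, int lineSize) {
        this.tags = new int[lines];
        this.valid = new boolean[lines];
        this.lineSize = lineSize;
    }

    // Records one access to the given address as a hit or a miss.
    void access(int address) {
        int block = address / lineSize;   // which memory block the address belongs to
        int index = block % tags.length;  // the only cache line that may hold that block
        if (valid[index] && tags[index] == block) {
            hits++;                       // data already in the cache
        } else {
            misses++;                     // load the whole block from main memory
            valid[index] = true;
            tags[index] = block;
        }
    }

    public static void main(String[] args) {
        DirectMappedCache cache = new DirectMappedCache(4, 8);
        // Read addresses 0..15 twice: the second pass is served entirely from the cache.
        for (int pass = 0; pass < 2; pass++) {
            for (int address = 0; address < 16; address++) {
                cache.access(address);
            }
        }
        System.out.println("hits=" + cache.hits + " misses=" + cache.misses);
        // prints hits=30 misses=2
    }
}
```

Only the first touch of each 8-address block goes to main memory; every later access to recently used data is a hit, which is the effect the paragraph above describes.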

Memory

A computer also contains main memory. All CPUs have access to main memory, which is usually much larger than the caches in the CPU.

The diagram above shows how memory interacts with the CPU. With this general picture of the memory structure, we hope you can understand the overall architecture and how the computer works.

Problem Example 1

	public class Example1 {
		public static void main(String[] args) {
			int i = 0;             // i starts at 0
			i = 1 + 1;             // i is assigned the result of 1 + 1
			System.out.println(i); // prints 2
		}
	}

When the main method above is executed, the CPU and memory read and store data as follows:

  2. The CPU register loads the memory address of i and sends it to the ALU for calculation. The result (i = 2) is cached in L1, L2, and L3.
  3. The CPU does not synchronize the result to memory immediately. It synchronizes when it is idle, and it is forced to write back only when its own cache runs out of space.

So is there a way to force the result to be synchronized to memory? This leads to an extended concept: the MESI cache consistency protocol.

CPU multi-core cache architecture

Problem Example 2

Two threads, T1 and T2, run on CPU1 and CPU2 respectively and execute the following method.

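	public class Example2 {
		private static int i = 0; // shared by every thread

		public static void main(String[] args) {
			i += 1; // read i, add 1, write i back: three steps, not one
			System.out.println(i);
		}
	}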

In the structure above, each CPU is independent, and each thread works on its own copy of i when it performs i + 1. When a CPU writes its result back, it does not know that another CPU has also computed a value for the same memory address and written it back, so the final result can be wrong.

When the two threads execute their instructions on separate CPUs, the final result is not guaranteed to be i + 1 (T1) + 1 (T2). One of the increments can be lost, because each thread adds 1 to its own stale copy of i; with i starting at 0, the result may be 1 instead of 2. This is the data consistency problem.
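The lost update can be reproduced deterministically by simulating the interleaving by hand instead of relying on real thread timing. In this sketch the class `LostUpdateDemo` and the variables `cpu1Copy` and `cpu2Copy` are hypothetical stand-ins for the two cores' cached copies of i; it models the scenario, it does not observe real hardware caches.

```java
// Simulates two cores that both read i before either one writes it back.
// The "caches" are plain local variables, used only to illustrate the interleaving.
class LostUpdateDemo {
    static int mainMemory = 0; // plays the role of the shared variable i

    static int runInterleaving() {
        mainMemory = 0;
        int cpu1Copy = mainMemory; // T1 loads i = 0 into CPU1's cache
        int cpu2Copy = mainMemory; // T2 loads i = 0 into CPU2's cache
        cpu1Copy += 1;             // T1 computes 0 + 1
        cpu2Copy += 1;             // T2 computes 0 + 1 from its stale copy
        mainMemory = cpu1Copy;     // T1 writes back 1
        mainMemory = cpu2Copy;     // T2 overwrites it with 1: one increment is lost
        return mainMemory;
    }

    public static void main(String[] args) {
        System.out.println(runInterleaving()); // prints 1, although two increments ran
    }
}
```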

Cache consistency issues

In a multiprocessor system, each processor has its own cache, and they all share the same main memory. Cache-based storage interaction resolves the speed mismatch between processor and memory well, but it introduces a new problem: cache consistency. When several processors work on the same region of main memory, their cached data can become inconsistent. If that happens, whose cached data should be used when synchronizing back to main memory? To solve the consistency problem, every processor must follow a protocol when accessing the cache and must read and write according to that protocol. Such protocols include MSI, MESI (Illinois Protocol), MOSI, Synapse, Firefly, and the Dragon Protocol.

Bus lock (Pentium processors): this is how CPUs handled the problem long ago. Every time a CPU wants to write data back to memory, it must first acquire a lock on the bus, and only after acquiring the lock can it write to memory. A CPU that has not acquired the lock has to wait until it does.
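As a software analogy only, the bus lock can be pictured as a single lock that every writer must hold before it updates memory. The sketch below uses `java.util.concurrent.locks.ReentrantLock` to stand in for the bus; the class `BusLockDemo` and its methods are hypothetical, and a Java lock is of course not the hardware mechanism itself.

```java
import java.util.concurrent.locks.ReentrantLock;

// Software analogy of a bus lock: one global lock guards every write-back.
class BusLockDemo {
    private static final ReentrantLock busLock = new ReentrantLock(); // stands in for the bus
    private static int mainMemory = 0;

    static void incrementWithBusLock() {
        busLock.lock();          // wait until the "bus" is free
        try {
            mainMemory += 1;     // read, modify, and write back while holding the lock
        } finally {
            busLock.unlock();    // release the "bus" for the other writer
        }
    }

    // Starts two writer threads and returns the final value in "main memory".
    static int runTwoWriters(int incrementsPerThread) {
        mainMemory = 0;
        Runnable work = () -> {
            for (int n = 0; n < incrementsPerThread; n++) {
                incrementWithBusLock();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return mainMemory;
    }

    public static void main(String[] args) {
        System.out.println(runTwoWriters(1000)); // prints 2000
    }
}
```

The result is always correct, but only one writer makes progress at a time, which hints at why a lock on the whole bus is expensive compared with a protocol that works per cache line.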

The MESI protocol

Cache line: the smallest unit of storage in the cache.

  • M

    • Status: Modified
    • Description: This cache line is valid, but its data has been modified and is inconsistent with the data in memory. The data exists only in this cache.
    • Listening task: The cache line must always listen for every attempt to read this line from main memory. Such a read must be deferred until the cache has written the line back to main memory and changed its state to S (Shared).
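  • E

    • Status: Exclusive
    • Description: This cache line is valid. The data is consistent with the data in memory and exists only in this cache.
    • Listening task: The cache line must listen for other caches reading this line from main memory. Once that happens, it must change its state to S (Shared).
  • S

    • Status: Shared
    • Description: This cache line is valid. The data is consistent with the data in memory and exists in multiple caches.
    • Listening task: The cache line must listen for requests from other caches to invalidate the line or take exclusive ownership of it, and then change its state to I (Invalid).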
  • I

    • Status: Invalid
    • Description: This Cache line is invalid and cannot write data back to main memory.
    • Listening task: None

Each CPU cache line changes its state by sniffing (snooping) the bus, under the cache consistency protocol, for state changes and instructions (#LOCK, etc.) that concern that line.
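The state changes of a single cache line can be sketched as a small transition function. This is a simplified textbook view of MESI written for illustration: the class `MesiLine`, the four event names, and the `othersHoldCopy` flag are assumptions of this sketch, and bus messages and write-backs are reduced to comments.

```java
// Simplified MESI transitions for ONE cache line, seen from the cache that owns it.
// Event names and the othersHoldCopy flag are hypothetical; real protocols use bus messages.
class MesiLine {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }
    enum Event { LOCAL_READ, LOCAL_WRITE, REMOTE_READ, REMOTE_WRITE }

    // othersHoldCopy only matters when an INVALID line is read locally.
    static State next(State state, Event event, boolean othersHoldCopy) {
        switch (event) {
            case LOCAL_READ:
                if (state == State.INVALID) {
                    // Reload the line: shared if another cache has it, exclusive otherwise.
                    return othersHoldCopy ? State.SHARED : State.EXCLUSIVE;
                }
                return state; // M, E, and S can all serve a local read
            case LOCAL_WRITE:
                return State.MODIFIED; // other copies, if any, are invalidated first
            case REMOTE_READ:
                if (state == State.INVALID) {
                    return State.INVALID;
                }
                return State.SHARED; // an M line is written back to memory first
            case REMOTE_WRITE:
                return State.INVALID; // our copy is stale from now on
            default:
                throw new IllegalArgumentException("unknown event " + event);
        }
    }

    public static void main(String[] args) {
        System.out.println(next(State.EXCLUSIVE, Event.REMOTE_READ, false)); // prints SHARED
        System.out.println(next(State.SHARED, Event.LOCAL_WRITE, false));    // prints MODIFIED
        System.out.println(next(State.SHARED, Event.REMOTE_WRITE, false));   // prints INVALID
        System.out.println(next(State.INVALID, Event.LOCAL_READ, true));     // prints SHARED
    }
}
```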

Problem Example 2- Solution

  1. T1 loads the data from main memory into CPU1, and the corresponding cache line changes to the E (Exclusive) state.
  2. The same happens for T2. Because both CPUs fetch data from the same main memory address, T2's cache line enters the S (Shared) state, and T1's cache line transitions from E (Exclusive) to S (Shared).
  3. T1 and T2 then move the data from L3 to L2 to L1 to the registers, compute, and write the result back to L3. Because the data is modified, T1's CPU must lock the cache line and change its state to M (Modified); at this point i = 2. T2 could equally be the one that reaches the M state first: it depends on which is faster, since there are also timing delays between CPUs.
  4. Once T1's state change is complete, a message is sent on the bus (cache consistency protocol) to notify the other caches that are listening on this memory address.
  5. The CPU checks the state of its cache lines on an instruction cycle. T2's cache line detects the data change (i = 2) and updates its own state to I (Invalid). T2 can no longer write its data (T2's i = 2) to main memory.
  6. If T2 still wants to update the data in main memory, it has to load the data (i = 2) from main memory again, recompute it in the ALU, and then write the result (i = 2 + 1) back to main memory.
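The six steps above can be replayed in a small two-cache simulation. Everything here is a sketch under stated assumptions: the class `MesiWalkthrough`, its helper methods, and the use of array index 0 for T1's CPU and 1 for T2's CPU are invented for illustration, and the initial value 1 is chosen only so that the numbers match the steps (i = 2 after T1, i = 2 + 1 after T2).

```java
// Replays the two-thread scenario with two simulated caches holding one line each.
// Index 0 is T1's CPU and index 1 is T2's CPU; all names are hypothetical.
class MesiWalkthrough {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }

    static int mainMemory;
    static final State[] state = new State[2];
    static final int[] cached = new int[2];

    static void reset(int initialValue) {
        mainMemory = initialValue;
        state[0] = State.INVALID;
        state[1] = State.INVALID;
    }

    // Loads the line if needed and returns the cached value of i.
    static int read(int cpu) {
        int other = 1 - cpu;
        if (state[cpu] == State.INVALID) {
            if (state[other] == State.MODIFIED) {
                mainMemory = cached[other];     // the owner writes back first
            }
            if (state[other] == State.INVALID) {
                state[cpu] = State.EXCLUSIVE;   // nobody else holds the line
            } else {
                state[other] = State.SHARED;    // both caches now share it
                state[cpu] = State.SHARED;
            }
            cached[cpu] = mainMemory;
        }
        return cached[cpu];
    }

    // Writes only if this cache still holds a valid copy; otherwise reports failure.
    static boolean tryWrite(int cpu, int value) {
        if (state[cpu] == State.INVALID) {
            return false;                       // stale copy: the caller must reload
        }
        state[1 - cpu] = State.INVALID;         // invalidate the other copy
        cached[cpu] = value;
        state[cpu] = State.MODIFIED;
        return true;
    }

    static void writeBack(int cpu) {
        if (state[cpu] == State.MODIFIED) {
            mainMemory = cached[cpu];
            state[cpu] = State.EXCLUSIVE;
        }
    }

    static int runScenario(int initialValue) {
        reset(initialValue);
        int t1Result = read(0) + 1;             // step 1: T1's line becomes E
        int t2Result = read(1) + 1;             // step 2: both lines become S
        tryWrite(0, t1Result);                  // steps 3-4: T1 goes to M, T2 to I
        if (!tryWrite(1, t2Result)) {           // step 5: T2's stale result is rejected
            t2Result = read(1) + 1;             // step 6: reload and recompute
            tryWrite(1, t2Result);
        }
        writeBack(1);
        return mainMemory;
    }

    public static void main(String[] args) {
        System.out.println(runScenario(1)); // prints 3: both increments are kept
    }
}
```

Starting from 0 instead, as in the Example 2 code, the same scenario ends with 2 rather than the lost-update result of 1.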

Cache line status failure scenario

  1. When i is longer than one cache line and has to be stored across multiple cache lines. The MESI cache consistency protocol cannot be applied in this case, and only a bus lock can be used.
  2. When the CPU does not support MESI.
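Whether a value spans more than one cache line is simple address arithmetic. The sketch below assumes a 64-byte cache line, which is common but not universal; the class `CacheLineSpan`, the method name, and the example addresses are hypothetical.

```java
// Checks whether a value of sizeBytes stored at address crosses a cache line boundary.
// The 64-byte line size used in main is an assumption; it varies between CPUs.
class CacheLineSpan {
    static boolean spansMultipleLines(long address, int sizeBytes, int lineSizeBytes) {
        long firstLine = address / lineSizeBytes;                  // line holding the first byte
        long lastLine = (address + sizeBytes - 1) / lineSizeBytes; // line holding the last byte
        return firstLine != lastLine;
    }

    public static void main(String[] args) {
        // An 8-byte value at address 56 occupies bytes 56..63: one line.
        System.out.println(spansMultipleLines(56, 8, 64)); // prints false
        // The same value at address 60 occupies bytes 60..67: two lines.
        System.out.println(spansMultipleLines(60, 8, 64)); // prints true
    }
}
```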

Summary

Reviewing this chapter, we learned about the computer model, the communication workflow between CPU and memory, and how to ensure cache consistency (MESI).