Fundamentals of Computer Systems, Chapter 1: Overview of Computer Systems

1. Basic working principle of computer

1.1 Basic working principle of computer

1.1.1 Basic idea of the stored program: a program can be executed only after the written program and its original data have been loaded into main memory. Once the program is started, the computer automatically fetches and executes instructions one by one, without operator intervention.

1.1.2 Basic ideas of the von Neumann architecture:

1.1.2.1 It adopts the stored-program working mode.

1.1.2.2 A computer consists of five basic parts: the arithmetic unit, the controller, the memory, input devices, and output devices.

1.1.2.3 The memory can store not only data but also instructions. In form there is no difference between data and instructions, but the computer must be able to distinguish them. The controller automatically executes instructions; the arithmetic unit carries out both arithmetic and logical operations; the user operates the computer through the input/output devices.

1.1.2.4 Instructions and data are represented in binary form inside the computer. Each instruction is composed of an opcode and an address code: the opcode indicates the operation type, and the address code indicates the address of the operand. A program consists of a sequence of instructions.

1.2 Execution process of procedures and instructions

1.2.1 An instruction is a sequence of 0s and 1s that directs the CPU to complete a specific atomic operation. Instructions are usually divided into several fields, including an opcode field, an address code field, and others. The opcode field indicates the operation type of the instruction, such as fetch, store, add, or subtract. The address code field indicates the address of the operand processed by the instruction, such as a register number or a main memory unit number.

1.2.2 Each stage of instruction execution consists of a number of microoperations, which require control signals. Within the execution of each instruction the operations are sequentially ordered, so a clock signal is needed for timing. In general, all microoperations in the CPU are timed by the clock signal, and the width of one clock pulse is called a clock cycle. The execution time of an instruction consists of one or more clock cycles.

2. Program development and operation

2.1 Programming language and translation procedures

2.1.1 The 0/1 sequences formed according to the instruction format specified by a particular computer are called machine language, and a program the computer can directly understand and execute is called machine code or a machine language program; each such 0/1 instruction is called a machine instruction. Because this language is not easy to read or write, a symbolic representation of machine language was introduced that uses short English mnemonics in one-to-one correspondence with machine instructions; it is called assembly language, and the symbolic representation of a machine instruction is called an assembly instruction. An assembly language program must be converted into a machine language program before it can be executed. Assembly instructions and machine instructions are tied to a particular machine structure, so they are low-level languages, collectively known as machine-level languages.

2.1.2 Algorithm-oriented programming languages that are close to everyday written English are called high-level programming languages.

2.1.3 A computer cannot understand a high-level programming language directly, so the program must be converted into machine language. The conversion program is called a translator. The language and program being translated are called the source language and source program; the language and program generated by translation are called the target language and target program. Translators fall into three categories:

2.1.3.1 Assembler: translates an assembly language source program into a machine language target program.

2.1.3.2 Interpreter: translates the statements of a source program into machine instructions in the order in which they are executed, executing each immediately.

2.1.3.3 Compiler: translates a high-level language source program into an assembly language or machine language target program.

2.2 From source to executable

The example code in hello.c is as follows:

#include <stdio.h>
int main(void)
{
    printf("hello, world\n");
    return 0;
}

Running gcc -o hello hello.c involves the following steps:

2.2.1 Preprocessing stage: the preprocessor (cpp) processes commands beginning with the character # in the source program, for example embedding the contents of the .h file named in an #include command into the source file. The output of the preprocessor is also a source file, with the extension .i.

2.2.2 Compilation stage: the compiler (cc1) compiles the preprocessed source program and generates an assembly language source file with the extension .s; for example, hello.s is an assembly language source file. Because assembly language is tied to a specific machine structure, for a given machine, no matter which high-level language the input is written in, the compiled output is an assembly language program for that same machine.

2.2.3 Assembly stage: the assembler (as) assembles the assembly source program and generates a relocatable object file with the extension .o, a binary file in which the code already consists of machine instructions.

2.2.4 Linking stage: the linker (ld) merges multiple relocatable object files, together with relocatable object files from the standard library, into one executable object file, or executable file for short. In this case the linker merges hello.o with the relocatable object module printf.o, where the standard library function printf resides, producing the executable file hello.

2.3 Startup and execution of executable files

The shell program executes ./hello. At this point:

2.3.1 The shell reads each character the user types at the keyboard into a CPU register one by one.

2.3.2 Each character is then saved to main memory, forming the string "./hello" in a main memory buffer.

2.3.3 When the [Enter] key is received, the shell invokes the corresponding service routine of the operating system kernel, and the kernel loads the executable file hello from disk into memory.

2.3.4 After loading the code of the executable file and the data it processes (in this case the string "hello, world\n"), the kernel places the address of the first instruction of hello into the program counter (PC). The CPU always uses the contents of the PC as the address of the next instruction to execute, so the processor then runs the hello program, fetching each character of the string "hello, world\n" from main memory into a CPU register.

2.3.5 Finally, the characters in the CPU register are sent to the display for output.

3. Computer system performance evaluation

3.1 Definition of computer performance

3.1.1 Throughput and response time are two basic metrics for evaluating the performance of a computer system.

3.1.2 Throughput refers to the amount of work completed per unit time. Similarly, bandwidth refers to the amount of information transmitted per unit time.

3.1.3 Response time refers to the time from when a job is submitted to when the job completes. Related concepts are execution time and waiting time, both of which measure the time a task takes.

3.1.4 Users of different applications care about different aspects of performance.

3.2 Computer performance test

3.2.1 If computer performance is compared directly, without considering the application background, it is usually measured by program execution time. The execution time perceived by users is divided into two parts:

3.2.1.1 CPU time: the CPU time used to execute the program, which itself has two parts. User CPU time is the time actually spent running user code. System CPU time is the time the CPU spends running operating system routines on behalf of the user code.

3.2.1.2 Other time: the time spent waiting for I/O operations to complete, or CPU time spent executing other user programs.

3.2.2 Computer system performance evaluation mainly considers CPU performance, but system performance is not equivalent to CPU performance. System performance refers to the response time of the system, while CPU performance refers to user CPU time, which includes only the time the CPU spends running user code.

3.2.2.1 Clock cycle: the computer executes an instruction in several steps (microoperations), each controlled by a corresponding control signal. When these control signals are issued, and for how long they act, is synchronized by a timing signal. The computer must therefore generate a synchronized clock timing signal, the CPU's main pulse, whose width is called the clock cycle.

3.2.2.2 Clock frequency: the main frequency of the CPU is the clock frequency of the CPU's main pulse signal, and it is the reciprocal of the CPU clock cycle.

3.2.2.3 CPI: cycles per instruction, the number of clock cycles required to execute an instruction. The number of cycles required varies with the function of the instruction, so for a particular instruction, its CPI is the number of clock cycles needed to execute that instruction, and in this case CPI is a fixed value.
For a program (or a machine), the CPI is the average number of clock cycles required over all instructions executed by the program (or in the machine's instruction set); in this case the CPI is an average.

3.2.3.1 Calculating user CPU time:

User CPU time = total program clock cycles ÷ clock frequency = total program clock cycles × clock cycle

3.2.3.2 Given the total number of program instructions and the overall CPI, the total number of clock cycles is:

Total program clock cycles = total number of program instructions × CPI

3.2.3.3 If a program contains n different types of instructions, and the instruction count and CPI of type i are C_i and CPI_i respectively, then:

Total program clock cycles = Σ_{i=1}^{n} (C_i × CPI_i)

3.2.3.4 The overall CPI of a program can also be obtained from the following formula, where F_i is the proportion of type-i instructions in the program:

CPI = Σ_{i=1}^{n} (F_i × CPI_i) = total program clock cycles ÷ total number of program instructions

3.2.3.5 Given the overall CPI and the total number of program instructions, the user CPU time is:

User CPU time = CPI × total number of program instructions × clock cycle

3.2.4 Clock cycle, instruction count, and CPI are interrelated in the formula for user CPU time. For example, changing the instruction set may reduce the total number of instructions in a program, but it may also force the CPU structure to be adjusted, which may widen the clock cycle (i.e., lower the clock frequency). For different programs solving the same problem, fewer instructions does not necessarily mean faster execution, even on the same computer.

3.3 Amdahl’s law

Amdahl's Law is one of the important quantitative principles in computer system design. Its basic idea is as follows: the extent to which improving a hardware or software part of a system improves overall system performance depends on how often that part is used, or on what fraction of the total execution time it accounts for. It has two forms of expression:


Improved execution time = (execution time of the improved part ÷ improvement factor of that part) + execution time of the unimproved part

or


Overall improvement factor = 1 ÷ (proportion of the improved part's execution time ÷ improvement factor of that part + proportion of the unimproved part's execution time)

Amdar’s law is widely used in performance analysis of parallel computing systems. Amdar’s law applies to all situations where a part of a particular task is optimized, either for hardware or software. For example, the execution time of an exception handler on a system is only a small fraction of the total program running time, and even a very good optimization of the exception handler will result in very little performance improvement on the overall system.