Computer composition principle reading notes

1. A brief history of computer development

As shown in the figure below, computer development has gone through four stages

1.1 Vacuum tube computers

These machines could not be programmed the way we write programs today. They could perform only predetermined operations, with operation information delivered to the circuits over hard wires. Even when patch cables and switches were used, programming could only be done by rewiring.

1.2 Transistor computer

In 1948, three scientists at AT&T Bell Laboratories invented the transistor, which gave rise to the semiconductor industry. The transistor was functionally equivalent to the vacuum tube but smaller and less power-hungry. Its invention eventually made it possible to put multiple transistors on a single piece of silicon to form a complete circuit.

1.3 Integrated circuit computer

IBM offered the 7094 and the 1401, which had different strengths and were not compatible with each other. So IBM introduced the compatible System/360 family, which led to the concept of computer architecture (the instruction set architecture).

1.4 VLSI computers

Superscalar processing and out-of-order execution are another important change in the composition of computers. Superscalar processing involves reading several instructions from memory and executing them in parallel. Out-of-order execution refers to executing instructions in an order different from the order in the program, in order to avoid waiting for a certain instruction to finish and thus speed up execution. Out-of-order execution allows later instructions in a program to be executed while the current instruction waits for a resource that is in use. Using a recipe analogy, out-of-order execution is the equivalent of preparing dessert while cooking the main course. Intel introduced out-of-order execution in its Pentium family of processors.

Sometimes a computer can be made faster by changing the order in which instructions are executed. Consider the following sequence:

1. X = 2 * C
2. Y = C + 4
3. Z = X + Y
4. A = 4 * C
5. P = C - 3

In this example, instructions (4) and (5) can be executed at any time, but instruction (3) must be executed after instructions (1) and (2) have finished.
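This dependence analysis can be sketched in a few lines of Python. The instruction list and the `depends_on` helper below are illustrative assumptions (the source only lists the five assignments), using the standard read-after-write, write-after-read, and write-after-write hazard rules:

```python
# Each instruction is (destination variable, set of source variables).
program = [
    ("X", {"C"}),       # 1. X = 2 * C
    ("Y", {"C"}),       # 2. Y = C + 4
    ("Z", {"X", "Y"}),  # 3. Z = X + Y
    ("A", {"C"}),       # 4. A = 4 * C
    ("P", {"C"}),       # 5. P = C - 3
]

def depends_on(i, j, prog):
    """True if instruction i must wait for the earlier instruction j."""
    dst_i, src_i = prog[i]
    dst_j, src_j = prog[j]
    return (dst_j in src_i      # read-after-write: i reads what j writes
            or dst_i in src_j   # write-after-read: i overwrites what j reads
            or dst_i == dst_j)  # write-after-write: both write the same place

for i in range(len(program)):
    deps = [j + 1 for j in range(i) if depends_on(i, j, program)]
    print(f"instruction {i + 1} depends on: {deps}")
```

Running this shows that instruction 3 depends on 1 and 2, while 4 and 5 depend on nothing and could be issued out of order.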

1.5 Development of storage technology

In the 1930s, John Vincent Atanasoff invented one of the earliest storage devices: a rotating magnetic drum covered with capacitors whose charge stored the 1s and 0s. As the drum rotated, the capacitors passed under a row of contacts and their values were read.

In the 1940s, mercury ultrasonic delay lines were used to store data as a series of ultrasonic pulses traveling along a thin tube filled with mercury. As each pulse reached the far end, it was amplified and fed back in again; this was true dynamic storage. The first fast data storage device was invented by Frederick Williams at the University of Manchester in England. A cathode-ray tube (first used in radar displays and later in television sets) stores data on its surface by charging a point with an electron beam (an electronic version of Atanasoff's rotating drum). First-generation Williams tubes could store only 1024 bits, later doubled to 2048 bits.

In 1949, Forrester designed ferrite core memory. A core is a small ring of magnetic material that can be magnetized clockwise or counterclockwise. By the 1970s, ferrite core memory had become the mainstay of mainframe memory. In fact, ferrite core memory gave us the term core store, which is still occasionally used to describe a computer's main memory.

In the 1970s, semiconductor dynamic memory was invented as an alternative to core memory and has now become the standard means of data storage. Today, we can easily achieve 8 gigabytes of storage capacity with low-capacity DRAM modules, which pack several memory chips on a small circuit board.

In 1956, IBM introduced disk storage with its RAMAC (Random Access Method of Accounting and Control), which stored data on the surfaces of rotating disks. The RAMAC 305 could store approximately 5 MB of data, spinning at 1200 RPM. Disk performance has improved since then: the maximum capacity of a disk is now about 4 TB (2^42 bytes). Because of its mechanical nature, however, a typical disk spins at 7200 RPM, only six times faster than the original RAMAC. Let's say that again: modern disks have millions of times the capacity of the RAMAC but only six times the speed. This detail points to a bottleneck in the development of computers: the various components of a computer improve at very uneven rates. However, solid-state disk technology, now being used to replace hard disks, provides very fast, fully electronic drives.

Today, personal computers use optical storage (DVD or Blu-ray discs) to load programs or store data. Optical storage records information along a spiral track on a transparent polycarbonate disc; information is read from (or, for rewritable discs, written to) the disc with a laser. CD technology developed between 1958 (the invention of the laser) and 1978 (usable optical storage). The DVD (1997) improved on the CD, and the Blu-ray disc (2006) improved on the DVD.

2. Computer system and structure

2.1 What is computer System Architecture

The media have long referred to a microprocessor, or even a chip, as a computer system. In fact, a computer system consists of the central processing unit (CPU) that reads and executes programs, the memory that holds programs and data, and the other subsystems that turn chips into a useful system. These subsystems connect the CPU to external devices such as displays, printers, and networks (the Internet).

The part of the computer that actually executes the program is called the CPU or, more simply, the processor. A microprocessor is a CPU implemented on a single silicon chip. Computers built around microprocessors are called microcomputers.

Although the CPU is at the heart of the computer, overall performance depends on the CPU and on the other subsystems alike. There is no point in improving CPU performance alone without efficient data transfer. Computer scientists have joked that improving microprocessor performance just makes the CPU faster at waiting for data from memory or disk drives. The uneven rates at which different parts of a computer improve are a major problem facing computer system designers. For example, while processor performance has grown rapidly over the past few decades, hard disk performance (access time) has remained almost constant for 30 years. That predicament may end, however, as semiconductor drives such as solid-state disks (SSDs) begin to replace mechanical drives.

The following diagram depicts the structure of a simple general-purpose computer, such as a personal computer or workstation. In addition to the CPU, there are some components found in almost all computers. Information (that is, programs and data) is kept in memory. Real computers use different types of memory for different purposes. The following figure shows multiple storage tiers, including Cache, main storage, and secondary storage. Note that although the Cache in this figure is outside the CPU, most processors today have an on-chip Cache integrated within the CPU.

2.2 Von Neumann system

Early digital computers were hardwired to perform specific tasks. Hard-wired means that the computer’s functions (programs) can only be changed by rewiring. A general-purpose computer uses hardware (that is, actual digital circuits) to do a wide range of tasks as directed by a program.

The figure above describes in detail the structure of a digital computer, which can be divided into two parts: the central processing unit and the memory system. The CPU reads the program and does what the program specifies. Memory systems hold two types of information, programs and data processed or produced by programs.

Note that few computers have two separate information paths between CPU and memory as shown above. Most computers have only one information channel between the CPU and the memory system, and data and instructions alternate along this channel. The two paths are drawn in the figure above to emphasize that memory holds both the instructions that make up the program and the data used by the program.

Von Neumann architecture

• There must be a memory

• There must be a controller

• Must have an arithmetic unit

• There must be an input device

• There must be an output device

Von Neumann machine features:

  • Able to send required programs and data to the computer
  • Ability to remember programs, data, intermediate results, and final results for a long time
  • Capable of arithmetic, logical operation and data transmission
  • Ability to output processing results to users as required

Modern computer architecture

2.3 Instruction execution process

The figure above depicts the process of executing z = x + y. The CPU must first fetch an instruction from memory. After the CPU analyzes (decodes) the instruction, it reads from memory all the data the instruction needs. The first instruction, LOAD X, reads the value of variable X from memory and stores it temporarily in a register. The second instruction reads the value of variable Y from memory into another register. Then, when the CPU executes the third instruction, it adds the contents of the two registers and saves the result in a third register. The fourth instruction writes the result of the addition back to memory location Z.

There is also some truth in saying that all a computer does is read data from memory, perform calculations (add, multiply, etc.) on the data, and then write the results back to memory. Another thing a computer can do is test the data (that is, determine whether a number is zero and whether its sign is positive or negative) and then execute one of two alternative instruction streams based on the test results.

Six basic instructions

  • MOV A, B — copies the value of B to A
  • LOAD A, B — copies the value of memory location B to register A (moves data from memory into the CPU)
  • STORE A, B — copies the value of register B to memory location A (moves data from the CPU into memory)
  • ADD A, B — adds A and B and saves the result in A (stands in for all data-processing instructions, such as subtraction, multiplication, or even an instruction that returns the larger of A and B)
  • BEQ Z — if the last test result is TRUE, execute the code at address Z; otherwise continue
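As a sketch, the z = x + y sequence from the previous section can be simulated with a few of these instructions in Python. The register names and the dictionary-based memory are illustrative assumptions, not part of the source:

```python
# Toy memory and register file; X, Y, Z are symbolic memory locations.
memory = {"X": 2, "Y": 3, "Z": 0}
registers = {}

# The four-instruction program that computes Z = X + Y.
program = [
    ("LOAD", "R0", "X"),          # R0 <- [X]
    ("LOAD", "R1", "Y"),          # R1 <- [Y]
    ("ADD", "R2", "R0", "R1"),    # R2 <- R0 + R1
    ("STORE", "Z", "R2"),         # [Z] <- R2
]

# A minimal fetch/execute loop over the program.
for op, dst, *src in program:
    if op == "LOAD":
        registers[dst] = memory[src[0]]
    elif op == "ADD":
        registers[dst] = registers[src[0]] + registers[src[1]]
    elif op == "STORE":
        memory[dst] = registers[src[0]]

print(memory["Z"])  # 5
```

The loop body is the "decode" step; each tuple is one fetched instruction.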

A register is a storage unit inside the CPU that holds one word of data. The clock provides a stream of pulses, and all internal operations are triggered by clock pulses. A register is usually described by the number of bits it holds, typically 8, 16, 32, or 64. Registers are not fundamentally different from word storage locations in memory; the real difference is that registers are inside the CPU and can be accessed much faster than memory outside it.

The clock generates a stream of regularly spaced electrical pulses on a wire. It is called a clock because these pulses can be used to keep time or to order all the events inside a computer. For example, a processor might execute a new instruction each time a clock pulse arrives. A clock can be defined by its repetition rate, or frequency; typical computer clock frequencies range from 1 MHz to 4.5 GHz. A clock can also be defined by the width, or duration, of its period, which is the reciprocal of the frequency (T = 1/f). For example, the period of a 1 MHz clock is 1 μs, and the period of a 1 GHz clock is 1 × 10⁻⁹ s, or 1 ns. A 5 GHz clock has a period of 200 ps (picoseconds); light travels only about 2 inches in 200 ps. Digital circuits whose events are triggered by the clock signal are said to be synchronous, because they are synchronized by the clock. Other events are asynchronous because they can happen at any time: moving the mouse sends a signal to the computer at an arbitrary moment, an asynchronous event. However, the computer can sample the state of the mouse at each clock pulse, which is a synchronous event.
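The frequency/period arithmetic is easy to check; a minimal sketch of T = 1/f:

```python
def period(frequency_hz):
    """Clock period T in seconds for a frequency f in hertz (T = 1/f)."""
    return 1.0 / frequency_hz

print(period(1e6))  # 1 MHz -> 1e-06 s, i.e. 1 microsecond
print(period(1e9))  # 1 GHz -> 1e-09 s, i.e. 1 ns
print(period(5e9))  # 5 GHz -> 2e-10 s, i.e. 200 ps
```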

2.4 Computer programming languages

Code executed on a computer is represented as a string of binary ones and zeros, known as machine code. Each computer can execute only one particular machine code. Human-readable machine code (e.g. ADD R0, Time) is called assembly language. Code that runs on a completely different type of computer and has little to do with the underlying computer architecture is called a high-level language (such as C or Java). Before execution, high-level language programs must first be compiled into the computer’s native machine code.

Program translation and program interpretation

The computer ultimately executes only instructions in its own machine language, L0

  • The translation process generates a new L0 program; the interpretation process does not generate a new L0 program
  • An interpreter written in L0 interprets the L1 program

  • Translated (compiled) languages: C/C++, Objective-C, Golang, etc.
  • Interpreted languages: Python, PHP, JavaScript, etc.
  • Translation + interpretation: Java, C#

Computer level

  1. Hardware logic layer
  • Composed of gates, flip-flops, and other logic circuits
  • Belongs to the field of electronic engineering
  2. Microprogram machine layer
  • The programming language is the microinstruction set
  • Microprograms, made up of microinstructions, are executed directly by the hardware
  3. Traditional machine layer
  • The programming language is the CPU instruction set (machine instructions)
  • The programming language is directly tied to the hardware
  • Different architectures use different CPU instruction sets
  • Microinstruction < microprogram = machine instruction
    • A machine instruction corresponds to one microprogram
    • A microprogram corresponds to a set of microinstructions
  4. Operating system layer
  • Provides a simple operating interface upward
  • Manages hardware resources downward
  • The operating system layer is the adaptation layer between software and hardware
  5. Assembly language layer
  • The programming language is assembly language
  • Assembly language can be translated into directly executable machine language
  6. High-level language layer
  • The programming language is any high-level language accepted by programmers
  • There are hundreds of high-level languages
  • Common high-level languages include Python, Java, C/C++, Golang, etc.

The instruction set architecture includes the data type (the number of bits per word and the meaning of each bit), the registers used to hold temporary results, the type and format of the instructions, and the addressing method (a way of indicating where the data is stored in memory).

Microcode has nothing to do with the microprocessor. Microcode defines a set of basic operations (microinstructions) by which the execution of machine code can be interpreted. ADD P, Q, R is a typical machine instruction, while a microinstruction can be as simple as "move data from register X to bus Y". How microinstructions are defined is the responsibility of the chip designer.

External architecture refers to the higher-level aspects of the computer architecture, such as data structures and instruction sets. The prefix exo means external and represents the computer abstraction seen by assembly language programmers.

The internal architecture represents the internal components of the computer, including the performance of the computer’s basic units, how the components are connected, and how the flow of information is controlled. That is, the internal architecture describes the processor at the level of functional components such as registers, adders, and control circuits.

Microarchitecture, which describes some of the actions that must be done to execute machine instructions (such as copying data from one register to another). The operations performed by the internal architecture are implemented by the microarchitecture. For example, the internal architecture only focuses on the functions of the adder, while the microarchitecture focuses on how the ALU implements those functions. Thus, an external architecture is an abstract view of a computer by a programmer, which is implemented by an internal architecture, which in turn is implemented by a microarchitecture by executing microprograms.

The external architecture is the highest level and represents the view of the computer as seen by the programmer. The internal architecture is an intermediate level, which describes the computer composition from the aspects of the component modules and the interconnection between modules. Microarchitecture (the lowest level) describes how the building blocks of a computer are implemented at the gate level.

3. Stored program computer

3.1 Simulated computer execution program

Find the maximum run, that is, the largest number of consecutive occurrences of the same digit.

We use R to indicate that the state is in the same sequence, NR to indicate that it is not, as shown in the following figure

  • i — current position in the digit string
  • New_Digit — the digit just read from the string
  • Current_Run_Value — the value of the current run
  • Current_Run_Length — the length of the current run
  • Max_Run — the longest run found so far

According to the above defined values, we get the following figure

Here is the pseudocode

1. Read the first digit of the string and call it New_Digit
2. Set Current_Run_Value to New_Digit
3. Set Current_Run_Length to 1
4. Set Max_Run to 1
5. REPEAT
6.   Read the next digit in the string (i.e. read New_Digit)
7.   IF its value is the same as Current_Run_Value
8.     THEN Current_Run_Length = Current_Run_Length + 1
9.     ELSE {Current_Run_Length = 1
10.      Current_Run_Value = New_Digit}
11.  IF Current_Run_Length > Max_Run
12.    THEN Max_Run = Current_Run_Length
13. UNTIL all digits have been read

Line 1 reads a digit from the string, which must be kept somewhere in computer memory. The symbolic name New_Digit denotes the location of that digit in memory; the computer must be able to access this location whenever it needs the most recently read digit. Line 2 is an assignment operation that assigns a value to a variable. Similarly, lines 3 and 4 are assignment operations, setting the variables Current_Run_Length and Max_Run to 1, respectively. The computer must be able to read a number from memory, modify it, and write the modified number back into memory.

Line 5 contains the keyword REPEAT, which marks the start of a group of actions that will be performed one or more times; the group ends with the keyword UNTIL on line 13. Lines 7 to 10 illustrate conditional execution, where the action performed depends on the result of a test. Line 7 compares the value just read from the string with the value of the current run (that is, whether New_Digit equals Current_Run_Value), and one of two operations is then performed based on the result: one is specified by the text after the keyword THEN on line 8, the other by the text after ELSE on lines 9 and 10. Lines 9 and 10 are enclosed in a pair of curly braces, indicating that they are treated as a unit when the ELSE path is executed.
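The pseudocode transcribes almost line for line into Python; the function wrapper and the list-of-digits input are conveniences added here, with the original variable names kept:

```python
def max_run(digits):
    """Length of the longest run of identical consecutive digits."""
    current_run_value = digits[0]   # lines 1-2: read first digit
    current_run_length = 1          # line 3
    max_run_length = 1              # line 4
    for new_digit in digits[1:]:    # lines 5-6: REPEAT, read next digit
        if new_digit == current_run_value:      # line 7
            current_run_length += 1             # line 8 (THEN)
        else:                                   # lines 9-10 (ELSE)
            current_run_length = 1
            current_run_value = new_digit
        if current_run_length > max_run_length:  # lines 11-12
            max_run_length = current_run_length
    return max_run_length           # line 13: UNTIL input exhausted

print(max_run([2, 3, 2, 7, 7, 7, 1]))  # 3
```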

We introduce some simple definitions based on the names of the above sections

  • Constants — values that are not modified during program execution. For example, if the circumference of a circle is expressed as c = 2πr, then both 2 and π are constants.
  • Variables — values that change during program execution. In the example above, c and r are both variables.
  • Symbolic names — names used to refer to variables and constants, for ease of memory and understanding. For example, we call the circumference of a circle c and its radius r; the symbolic name for 3.1415926 is π. When the program is compiled to machine code, each symbolic name is replaced by an actual value.
  • Addresses — computers keep information in memory, and each location has a unique address. For example, the radius of a circle might be stored in a cell with address 1234. Instead of remembering the address of each variable, we use a symbolic name for each address; the address 1234 above could be called r.
  • Value and location — what does r in the expression c = 2πr stand for: a value or a location? People think of r as a symbolic name for the value of the radius, such as 5. The computer, however, treats r as a symbolic name for address 1234, which must be read to obtain the actual value. Does the expression r = r + 1 change the value of the radius (r = 5 + 1 = 6) or the address (r = 1234 + 1 = 1235)? It is important to distinguish between the address of a storage location and its contents, especially when understanding pointers.
  • Pointer — a pointer is a variable whose value is a storage address. When you modify a pointer, it points to a different value. Pointers are not mysterious: in traditional mathematics, the i in the symbol xᵢ is a pointer, though we usually call it an index. Changing the value of the pointer (index), as in x₁, x₂, x₃, x₄, accesses the element at the corresponding position of a table, array, or matrix.

3.2 Memory Mapping

Each location in memory holds either instructions or data elements

We simulate the following program execution, omitting some details for simplicity. For example, we place a jump instruction at address 10, which tells the computer to skip the instructions at addresses 11 and 12 and execute the instruction at address 13 directly. This is necessary because if the THEN part of the branch is executed, its ELSE part must be skipped. The listing also shows how each digit is accessed using the symbol Memory(i), which denotes the i-th location of memory. The value of i is initialized to 21, and the loop ends when i equals 37.

0 i = 21
1 New_Digit = Memory(i)
2 Set the value of the Current_Run_Value variable to the value of the New_Digit variable
3 Set the value of the Current_Run_Length variable to 1
4 Set the value of Max_Run to 1
5 REPEAT
6 i = i + 1
7 New_Digit = Memory(i)
8 IF New_Digit = Current_Run_Value
9 THEN Current_Run_Length = Current_Run_Length + 1
10 Skip to line 13
11 ELSE Current_Run_Length = 1
12 Current_Run_Value = New_Digit
13 IF Current_Run_Length > Max_Run
14 THEN Max_Run = Current_Run_Length
15 UNTIL i = 37
16 Stop
17 New_Digit
18 Current_Run_Value
19 Current_Run_Length
20 Max_Run
21 2 first number
22 3
23 2
24 7
. .
37 1 string last number

The following figure shows the components of a storage system. The processor places an address on the address bus and sends the memory a control signal to select a read or a write operation (these are sometimes called read or write cycles). During a read cycle, the memory places data on the data bus for the CPU to read. During a write cycle, data placed on the data bus is written into storage. The points where information enters or leaves memory (or any other functional part of a computer system) are called ports. Although the memory in the figure is simplified, it accurately describes a computer memory in which data and instructions are stored consecutively. A real computer uses a storage hierarchy, each level of which may be implemented with a different technology. These levels include very fast caches that hold frequently accessed data, main memory, and very slow secondary storage, where large amounts of data reside on disks, CDs, or DVDs until they are brought into main memory for use.

RTL symbol

In RTL, square brackets [] denote the contents of a storage location. The expression [15] = Max_Run means "the storage location at address 15 holds the value of the variable Max_Run". The left arrow ← denotes a data transfer operation; for example, [15] ← [15] + 1 means "add 1 to the contents of the location at address 15 and write the result back to that location". Consider the following three RTL expressions:

(a) [20] = 5
(b) [20] ← 6
(c) [20] ← [6]

Expression (a) states that the storage location at address 20 contains the value 5. Expression (b) writes (copies, loads) the number 6 into the location at address 20. Expression (c) copies the contents of the location at address 6 into the location at address 20. Note that the RTL symbol ← is equivalent to the assignment symbol = used in some high-level languages. RTL is not a computer language; it is just a notation used to define computer operations.
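The three RTL expressions map directly onto ordinary assignments. A small sketch, modeling memory as a Python dictionary from address to contents (the addresses and values follow the examples in the text):

```python
# Memory modeled as {address: contents}; initial contents are illustrative.
memory = {6: 42, 20: 5}

# (a) [20] = 5   -- a statement about current contents, not an operation
assert memory[20] == 5

# (b) [20] <- 6  -- write the literal 6 into address 20
memory[20] = 6

# (c) [20] <- [6]  -- copy the contents of address 6 into address 20
memory[20] = memory[6]

print(memory[20])  # 42
```

The contrast between (a) and (b) is exactly the value/location distinction made in section 3.1.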

Fetching an instruction from memory requires one memory access (a read). Executing an instruction requires at least two accesses: the first reads the instruction itself; the second either reads the data the instruction needs from memory or writes back the data it generates or modifies. A computer that operates in this two-phase fetch/execute cycle is sometimes called a stored program computer.

3.3 Instruction Types

  1. Three-address instructions

[Address1] ← [Address2] operation [Address3]. It requires a total of four memory accesses: one to fetch the instruction, two to fetch the source operands, and one to save the result.

The following figure describes how to select an operation with an opcode, two storage units with a source address, and storage units with a destination address to write back to operands. The figure also illustrates the flow of information generated during the execution of the addition instruction.

  2. Two-address instructions

Operation Address1, Address2, where Address2 is the source operand and Address1 is both a source and the destination operand. This format means: read the source operands from memory, operate on them, and write the result back to the location of the first source operand. The RTL definition of the instruction ADD P, Q is [P] ← [P] + [Q].

The two-address instruction destroys one of its operands; that is, it replaces the source operand P with the result.

On a real computer, two memory addresses in the same instruction are generally not allowed. Most computers (such as the Pentium or the more modern Core i7 processors) require one address to be a memory address and the other to be a register.

  3. Single-address instructions

Because this format provides only one operand address, while at least two operands are needed, the processor uses a second operand that does not require an explicit address: it comes from a register inside the CPU called the accumulator.
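The three formats can be contrasted on the same computation, [P] ← [P] + [Q]. The sketch below simulates only the single-address (accumulator) version in Python; the memory contents and mnemonics are illustrative assumptions:

```python
# Memory locations P and Q; 'acc' plays the role of the accumulator.
memory = {"P": 10, "Q": 32}
acc = 0

# Three-address form:  ADD P, P, Q   =>  [P] <- [P] + [Q]   (one instruction)
# Two-address form:    ADD P, Q      =>  [P] <- [P] + [Q]   (P is overwritten)
# Single-address form needs three instructions via the accumulator:
acc = memory["P"]         # LOAD P   ; acc <- [P]
acc = acc + memory["Q"]   # ADD Q    ; acc <- acc + [Q]
memory["P"] = acc         # STORE P  ; [P] <- acc

print(memory["P"])  # 42
```

The trade-off is visible: shorter instructions, but more of them for the same work.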

Addressing mode is an important feature of computer instruction set. It is a method to determine the position of operands. For example, the location of the operand can be given either directly (that is, its address is 1234) or indirectly (that is, the contents of register 5 are the address of the operand we want)

3.4 Storage Layers

Computer scientists think of memory as a huge array accessed by address.

The register stores the working data of the processor, the Cache is the fast memory that caches common data, the DRAM stores the working data blocks, and the hard disk stores programs and data. Note that the hard disk has 40 million times the capacity of the register but is 20 million times slower.

The processor first accesses the fast L1 (level 1) Cache, the part of the CPU where it expects to find about 92% of the information it needs. If the data is not in the L1 Cache, the larger but slower L2 (level 2) Cache is accessed, where the data might be found with, say, 98% probability. If that also fails, the computer may consult yet another cache, the L3 (level 3) Cache. If the data is not there either, it must be fetched from main memory.
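The effect of these hit rates can be made concrete with an average-memory-access-time calculation. The hit rates follow the text (92% in L1, 98% of the remainder in L2); the latencies in cycles are assumed figures for illustration only:

```python
# Assumed latencies (cycles) for a two-level cache plus main memory.
l1_time, l2_time, mem_time = 1, 10, 100

l1_hit = 0.92   # fraction of accesses found in L1 (from the text)
l2_hit = 0.98   # fraction of L1 misses found in L2 (from the text)

# Average access time: weight each level's latency by how often it's reached.
amat = (l1_hit * l1_time
        + (1 - l1_hit) * (l2_hit * l2_time
                          + (1 - l2_hit) * mem_time))
print(round(amat, 3))  # 1.864
```

Even a small cache with a high hit rate keeps the average close to the fast L1 time.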

How to keep the data in Cache and disk consistent is the main concern of computer designers.

3.5 the bus

A bus connects two or more functional units of a computer and allows them to exchange data (for example, the bus between the CPU and the graphics card). A bus also connects the computer to external devices (for example, a printer connected to the computer's USB bus).

The problem with the bus structure is that only one device can communicate with other devices at a time, because there is only one information path. If two devices request the bus at the same time, they have to compete for control of the bus. Some systems use a special component called an arbiter to decide which device is allowed to proceed, while the other contenders wait their turn.

  • Width — the width of a bus is usually defined by the number of parallel data paths. A 64-bit-wide bus can transfer 64 bits (8 bytes) of information at a time. The term is also sometimes used for the total number of connections that make up the bus; for example, a bus may have 50 lines, of which 32 carry data (the rest may be control paths or even power lines).
  • Bandwidth — bus bandwidth measures the rate at which information is transferred across the bus, expressed in bits per second (b/s) or bytes per second (B/s). Increasing the width of the bus while keeping the transfer rate constant increases the bandwidth.
  • Latency — latency is the time between a data transfer being requested and the actual transfer taking place. Bus latency usually includes the time for bus arbitration before the transfer begins.
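The width/bandwidth relationship is a one-line formula: bandwidth = bytes per transfer × transfers per second. A sketch with assumed figures (the 64-bit width and 100 million transfers/s are illustrative, not from the source):

```python
def bandwidth_bytes_per_s(width_bits, transfers_per_second):
    """Peak bandwidth in bytes/s for a bus of the given width and rate."""
    return (width_bits // 8) * transfers_per_second

# Example: a 64-bit bus performing 100 million transfers per second.
print(bandwidth_bytes_per_s(64, 100_000_000))  # 800000000, i.e. 800 MB/s
```

Doubling the width at the same transfer rate doubles the bandwidth, as the text states.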

Multibus system

Why use two buses? First, multiple buses allow concurrent operations. For example, two devices can communicate with each other over bus A while another pair of devices can communicate with each other over bus B. A more important reason is that these buses may have completely different characteristics and operating speeds.

4. The unit of account of a computer

4.1 Capacity Unit

  • At the physical level, high and low levels record information
  • In theory, only the two states 0 and 1 can be distinguished: a high level records a 1 and a low level records a 0. This 0/1 unit is called a bit.
  • A single bit represents too little information, so larger capacity units are needed.

1 GB = 1024^3 bytes = 1024^3 * 8 bits

4.2 Speed unit

4.2.1 Network speed

Why does "100M" fiber test at a peak of only about 12 MB/s? Here the 100M is not megabytes but megabits per second: the common unit for network speed is Mbps.

  • 100M broadband = 100 Mbps = 100 Mbit/s
  • 100 Mbit/s = (100/8) MB/s = 12.5 MB/s
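The conversion is just a division by 8 (8 bits per byte); a minimal sketch:

```python
def mbps_to_mb_per_s(mbps):
    """Convert a link speed in megabits/s to megabytes/s (8 bits = 1 byte)."""
    return mbps / 8

print(mbps_to_mb_per_s(100))   # 12.5
print(mbps_to_mb_per_s(1000))  # 125.0 (gigabit link)
```

So a "100M" link can never exceed 12.5 MB/s of payload, matching the observed ~12 MB/s.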

4.2.2 CPU frequency

CPU speed is reflected in the number of 0/1 (low/high) level transitions per second

  • The CPU speed is generally reflected in the CLOCK frequency of the CPU
  • The clock frequency of the CPU is usually measured in Hertz (Hz)
  • The clock frequency of mainstream CPUs is above 2 GHz

2 GHz = 2 x 1000^3 Hz = 2 billion clock cycles per second