
In terms of system architecture, how should servers be classified?

In terms of system architecture, current commercial servers can be roughly divided into three types: Symmetric Multi-Processor (SMP), Non-Uniform Memory Access (NUMA), and Massive Parallel Processing (MPP).

SMP (Symmetric Multi-Processor)

In a symmetric multi-processor architecture, the CPUs in a server work symmetrically, with no primary/secondary or master/subordinate relationships among them.

All CPUs share the same physical memory, and every CPU takes the same time to access any address in that memory, which is why SMP is also called Uniform Memory Access (UMA). An SMP server can be scaled by adding memory, using faster CPUs, adding more CPUs, expanding I/O (more slots and buses), and attaching more external devices (usually disk storage).

The main characteristic of an SMP server is sharing. All resources in the system (such as CPU, memory, I/O, and so on) are shared.

It is precisely this sharing that causes the main problem with SMP servers: very limited scalability.

For an SMP server, every shared resource can become a scaling bottleneck, and memory is the most constraining one.

Because every CPU must reach the same memory over the same memory bus, memory access conflicts grow rapidly as CPUs are added, wasting CPU resources and sharply reducing the effective CPU performance.

Experiments have shown that SMP servers achieve their best CPU utilization with 2 to 4 CPUs.
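To make the shared-memory model concrete, here is a minimal sketch (not from the original article) in which several threads update one counter that lives in the memory every CPU shares. The thread and iteration counts are arbitrary; the point is that every update travels over the same shared memory path, which is exactly the contention described above.

```c
/* Sketch: all threads share one physical memory image, so a single
 * atomically-updated counter is visible to every CPU -- and every
 * update competes for the same shared memory resources.
 * Build (Linux/glibc assumed): gcc -O2 -pthread smp_sketch.c -o smp_sketch
 */
#include <pthread.h>
#include <stdio.h>
#include <stdatomic.h>

#define THREADS     8          /* arbitrary: one thread per CPU in a small SMP box */
#define ITERATIONS  1000000L

static atomic_long shared_counter = 0;   /* lives in the memory all CPUs share */

static void *worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < ITERATIONS; i++)
        atomic_fetch_add(&shared_counter, 1);   /* every CPU hits the same cache line */
    return NULL;
}

int main(void)
{
    pthread_t tid[THREADS];

    for (int i = 0; i < THREADS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(tid[i], NULL);

    /* The total is correct because memory is shared and the updates are atomic,
     * but per-CPU throughput drops as more CPUs fight over the same line. */
    printf("counter = %ld (expected %ld)\n",
           (long)shared_counter, (long)THREADS * ITERATIONS);
    return 0;
}
```

Raising THREADS beyond what the memory system can comfortably feed tends to show diminishing per-thread throughput, which is the scalability ceiling discussed above.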

NUMA (Non-Uniform Memory Access)

Because SMP scales so poorly, NUMA is one of the techniques that emerged from efforts to build large systems that scale effectively.

With NUMA technology, dozens (or even hundreds) of CPUs can be combined in a single server, organized into CPU modules.

A NUMA server is characterized by having multiple CPU modules. Each CPU module consists of four CPUs and has independent local memory, I/O slots, and so on.

Because the nodes can connect to each other and exchange information through an interconnect module (such as a crossbar switch), every CPU can access the memory of the entire system (an important difference between NUMA and MPP systems).

Obviously, accessing local memory is much faster than accessing remote memory (the memory of other nodes in the system), which is where the name Non-Uniform Memory Access comes from.
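The asymmetry can be observed directly. The sketch below, which assumes a Linux machine with at least two NUMA nodes and the libnuma library installed, allocates one buffer on the local node and one on a remote node and times a simple pass over each; the buffer size and access pattern are arbitrary choices for illustration.

```c
/* Sketch: make the "non-uniform" part visible by touching memory that was
 * allocated on the local node versus on another node.
 * Assumes Linux with libnuma (build with: gcc -O2 numa_sketch.c -lnuma)
 * and a machine that actually has more than one NUMA node.
 */
#define _GNU_SOURCE
#include <numa.h>
#include <sched.h>
#include <stdio.h>
#include <time.h>

#define BUF_SIZE (256UL * 1024 * 1024)   /* 256 MiB, arbitrary for the example */

static double touch(volatile char *buf)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned long i = 0; i < BUF_SIZE; i += 64)   /* one access per cache line */
        buf[i]++;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    if (numa_available() < 0 || numa_max_node() < 1) {
        fprintf(stderr, "need a NUMA system with at least two nodes\n");
        return 1;
    }

    int local  = numa_node_of_cpu(sched_getcpu());    /* node of the CPU we run on */
    int remote = (local + 1) % (numa_max_node() + 1); /* any other node */

    char *near = numa_alloc_onnode(BUF_SIZE, local);
    char *far  = numa_alloc_onnode(BUF_SIZE, remote);
    if (!near || !far) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    printf("local  node %d: %.3f s\n", local,  touch(near));
    printf("remote node %d: %.3f s\n", remote, touch(far));

    numa_free(near, BUF_SIZE);
    numa_free(far,  BUF_SIZE);
    return 0;
}
```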

Because of this characteristic, applications should be developed to minimize information exchange between different CPU modules in order to get the best system performance.
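One common way to follow this guideline, again assuming Linux with libnuma, is to bind a worker to one node and allocate its working data locally, so that computation and the memory it touches stay inside the same CPU module. The node number and buffer size below are placeholders.

```c
/* Sketch: keep a worker's execution and its data on the same NUMA node.
 * Same assumption as above: Linux with libnuma (link with -lnuma).
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "libnuma not available on this system\n");
        return 1;
    }

    int node = 0;                        /* placeholder: the node this worker "owns" */

    numa_run_on_node(node);              /* restrict this thread to CPUs of that node   */
    numa_set_localalloc();               /* future allocations fault in on the local node */

    size_t size = 64UL * 1024 * 1024;    /* arbitrary working-set size for the example */
    char *work = numa_alloc_local(size); /* explicitly node-local buffer */

    memset(work, 0, size);               /* the pages now live next to the CPUs using them */
    printf("worker bound to node %d with %zu MiB of local memory\n",
           node, size / (1024 * 1024));

    numa_free(work, size);
    return 0;
}
```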

NUMA technology thus solves the scaling problem of the original SMP design and can support hundreds of CPUs in a single physical server.

However, NUMA technology also has drawbacks. Because the latency of accessing remote memory is far higher than that of accessing local memory, system performance does not increase linearly as the number of CPUs grows.

For example, when HP released the Superdome server, it published the performance of the Superdome relative to HP’s other Unix servers. The 64-CPU Superdome (NUMA architecture) had a relative performance of 20, while the 8-way N4000 (a shared-memory SMP architecture) had a relative performance of 6.3. In other words, eight times the number of CPUs yielded only about a threefold improvement in performance.

MPP (Massive Parallel Processing)

Unlike NUMA, MPP provides a different way to scale the system: multiple SMP servers are connected through a node interconnect network and work together on the same task. From the user’s point of view, MPP appears as a single server system.

It consists of multiple SMP servers (each SMP server is called a node) connected through the node interconnect network. Each node accesses only its own local resources (memory, storage, and so on).

MPP is a completely shared-nothing architecture and is therefore the easiest to scale. In theory, MPP can scale without limit; with current technology, up to 512 nodes, containing thousands of CPUs, can be interconnected.

At present there is no industry standard for the node interconnect network; every vendor adopts its own internal implementation. However, the interconnect is used only inside the MPP server and is transparent to users.

In an MPP system, each SMP node can also run its own operating system, database, and so on. Unlike NUMA, however, there is no such thing as remote memory access: the CPUs in one node cannot access the memory of another node at all. Information exchange between nodes is carried out over the node interconnect network, a process generally known as Data Redistribution.
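At its core, data redistribution is a routing decision: each node hashes the partitioning key of its local rows, and the hash value determines which node a row must be shipped to over the interconnect. The sketch below shows only that routing step; the table layout, key, hash function, and node count are invented for illustration, and the actual network transfer is omitted.

```c
/* Sketch: the routing half of data redistribution in a shared-nothing MPP
 * system -- hash the partitioning key, and the hash decides which node the
 * row must be sent to. The network send itself is omitted.
 * The row layout, key, and node count below are invented for the example.
 */
#include <stdio.h>
#include <stdint.h>

#define NODES 4                          /* hypothetical number of SMP nodes */

struct row {
    uint64_t order_id;                   /* hypothetical partitioning key */
    double   amount;
};

/* FNV-1a, a simple well-known hash; any stable hash would do here. */
static uint64_t fnv1a(uint64_t key)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (int i = 0; i < 8; i++) {
        h ^= (key >> (i * 8)) & 0xff;
        h *= 0x100000001b3ULL;
    }
    return h;
}

static int owner_node(uint64_t key)
{
    return (int)(fnv1a(key) % NODES);    /* every node computes the same answer */
}

int main(void)
{
    struct row local_rows[] = {
        { 1001, 19.90 }, { 1002, 5.00 }, { 1003, 42.50 }, { 1004, 7.25 },
    };

    for (size_t i = 0; i < sizeof local_rows / sizeof local_rows[0]; i++) {
        int dest = owner_node(local_rows[i].order_id);
        /* In a real MPP system the row would now be queued for the node
         * interconnect; here we only print the routing decision. */
        printf("row %llu -> node %d\n",
               (unsigned long long)local_rows[i].order_id, dest);
    }
    return 0;
}
```

Because every node applies the same hash to the same key, rows that need to meet (for a join or an aggregation, say) end up on the same node without any central coordinator.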

However, an MPP server requires a sophisticated mechanism to schedule the individual nodes, balance their load, and coordinate their parallel processing.

Currently, servers based on MPP technology tend to mask this complexity through system-level software, such as databases.

Differences between NUMA and MPP

In terms of architecture, NUMA and MPP share many similarities:

  1. Both consist of multiple nodes;
  2. Each node has its own CPU, memory, and I/O;
  3. The nodes can exchange information through a node interconnect mechanism.

So what is the difference? Comparing the internal architecture and working principles of NUMA and MPP servers makes the differences easy to see.

Node interconnection mechanisms are different.

  1. NUMA’s node interconnect is implemented inside a single physical server; when a CPU needs to access remote memory, it has to wait. This is the main reason NUMA servers cannot achieve linear performance scaling as CPUs are added.
  2. MPP’s node interconnect sits outside the individual SMP servers and is reached through I/O. Each node accesses only its own local memory and storage, and information exchange between nodes proceeds in parallel with the nodes’ own processing. As a result, MPP performance scales essentially linearly as nodes are added.

Memory access mechanisms are different.

  1. Inside a NUMA server, any CPU can access the memory of the entire system, but remote memory access performs far worse than local memory access, so applications should avoid remote memory access as much as possible.
  2. In an MPP server, each node accesses only its own local memory; remote memory access does not exist at all.