This article refers to the following articles and videos:

  • YouTube – What is Non Uniform Memory Access?
  • The MySQL “swap insanity” problem and the effects of the NUMA architecture
  • Wikipedia – Non-uniform memory access
  • NUMA and UEFI

In a word

NUMA (Non-Uniform Memory Access) means that, for a given CPU, the distance to memory, and therefore the access time, differs depending on where that memory is. It exists to solve the performance problem caused by a shared bus in multi-CPU systems. (This may not be a rigorous definition, and solving that problem may not have been the only goal, but in practice NUMA does solve it.)

The NUMA architecture diagram

Start with the simplest case: a single CPU (note: this means a physical CPU, not a core; NUMA concerns multiple physical CPUs, not multiple cores) connected to RAM via a bus.

Next came multi-CPU systems (again, multiple physical CPUs, not a multi-core single CPU!). If, as before, all CPUs were connected to RAM through one shared bus, the bus would become a performance killer, and the more CPUs you add, the worse the loss.

This is where the NUMA architecture comes into play: a CPU and its neighboring RAM are treated as one node, and each CPU gives priority to the RAM closest to it. At the same time, the CPUs are connected to each other by a fast direct channel, so every CPU can still access all RAM locations, only at different speeds.

In practice, a node does not have to own a single bank of RAM; many combinations like the following are possible:

Take a look at the NUMA architecture in Linux

The following operations were performed on an Ubuntu 18.04 Aliyun (Alibaba Cloud) server.

First, check whether the system supports NUMA with dmesg | grep -i numa. On this machine the result shows that the current system does not support it :)

Then install a tool, numactl, via apt install numactl, and run:

numactl --hardware

There is also a utility called lstopo, which ships with the hwloc package (apt install hwloc):

lstopo --of png > server.png

As you can see from the figure, there is still one node. If my system does not support NUMA, why does Linux lump all the CPUs and all the RAM into a single node anyway? Isn't that pointless?

Understanding the Linux Kernel has this to say about this:

The main reason is code generality: one set of code can run in both non-NUMA and NUMA-enabled environments.

For a NUMA-enabled server, the diagram looks something like this:

What impact will NUMA have on Linux?

When the system is booted, the hardware sends NUMA information to the OS. If the system supports NUMA, the following things happen:

  • NUMA configuration information is obtained.
  • Processors (not cores) are split into multiple nodes, typically one processor per node.
  • Memory close to each processor is allocated to that processor's node.
  • The cost (distance) of communication between nodes is calculated.
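
The inter-node cost from the last step shows up in the output of numactl --hardware as a "node distances" matrix. As a minimal sketch of what that matrix encodes, here is a Python illustration; the sample text is hypothetical, modeled on typical two-node numactl output:

```python
# Parse the "node distances" matrix from (hypothetical) `numactl --hardware`
# output. By convention 10 is the cost of local access; remote costs more.
SAMPLE = """\
node distances:
node   0   1
  0:  10  21
  1:  21  10
"""

def parse_distances(text):
    """Return {(from_node, to_node): cost} from a numactl-style matrix."""
    lines = text.strip().splitlines()
    header = lines[1].split()[1:]          # node ids: ['0', '1']
    costs = {}
    for row in lines[2:]:
        src, *vals = row.replace(":", "").split()
        for dst, cost in zip(header, vals):
            costs[(int(src), int(dst))] = int(cost)
    return costs

d = parse_distances(SAMPLE)
print(d[(0, 0)], d[(0, 1)])  # -> 10 21 (local access is cheaper than remote)
```

A remote access costing roughly twice a local one (21 vs 10) is a common ratio on two-socket machines, but the real numbers depend on the hardware.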

If you treat the CPU and memory as black boxes and simply expect them to work, unexpected things can happen.

  • Each process and thread inherits a NUMA policy that defines which CPUs (and even which cores) may be used, which memory may be used, and how strictly the policy is enforced, i.e. preferred or mandatory.
  • Each thread is assigned a “preferred” node to run on. The thread can run elsewhere (if policy allows), but the OS tries to keep it on the preferred node.
  • Memory allocation: by default, memory is allocated from the node the thread is running on.
  • Memory allocated on one node is not migrated to another node.
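
The CPU side of such a policy can be inspected from a running process. Below is a minimal, Linux-only Python sketch; on a NUMA machine, restricting this set to one node's CPUs is what numactl's --cpunodebind flag arranges, and sched_getaffinity/sched_setaffinity are the underlying system calls:

```python
import os

# Which CPUs may the current process run on? On a NUMA machine,
# `numactl --cpunodebind=0` would shrink this set to node 0's CPUs.
allowed = os.sched_getaffinity(0)   # 0 means "the calling process"
print(f"may run on {len(allowed)} CPU(s): {sorted(allowed)}")

# Re-assert the same mask (a no-op here), the way a binding tool would.
os.sched_setaffinity(0, allowed)
```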

The two lists above are translated from blog.jcole.us/2010/09/28/… If anything is ambiguous, please refer to the original text.

Let's look at a MySQL example

It is a high-quality piece; readers who are able should go straight to the original: blog.jcole.us/2010/09/28/… Here is a quick summary.

The article describes running MySQL on a Linux server with 64 GB of memory and two quad-core CPUs, with MySQL configured to use a 48 GB InnoDB buffer pool. It then turns out that even though the system has plenty of free memory, a lot of memory has been swapped out.

This causes a significant performance problem, because at query time the needed pages may have been swapped out and must be loaded back in. This issue vexed the MySQL community for a long time.

As mentioned earlier, Linux has a NUMA policy that can be manually controlled.

  • --localalloc, allocate on the current node; the default.
  • --preferred=node, use the specified node first, falling back to other nodes if necessary.
  • --membind=nodes, allocate only from the manually specified node(s).
  • --interleave=all, use a round-robin algorithm to spread allocations across the nodes in turn.
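
To make the four policies concrete, here is a toy Python simulation of how each one picks a node on a two-node machine. This is my own illustration of the selection logic, not numactl's actual code:

```python
import itertools

# Toy model of the numactl policies on a two-node machine.
NODES = [0, 1]

def localalloc(current_node):
    """--localalloc (default): allocate on the node the thread runs on."""
    return current_node

def preferred(node, has_free):
    """--preferred=node: use `node` first, fall back to any free node."""
    if has_free[node]:
        return node
    return next(n for n in NODES if has_free[n])

def membind(node, has_free):
    """--membind=nodes: only the given node(s); if full, we must swap."""
    if not has_free[node]:
        raise MemoryError("bound node is full -> swap")
    return node

_rr = itertools.cycle(NODES)
def interleave():
    """--interleave=all: round-robin across all nodes."""
    return next(_rr)

print([interleave() for _ in range(4)])   # -> [0, 1, 0, 1]
print(preferred(0, {0: False, 1: True})) # -> 1 (node 0 full, falls back)
```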

From the Linux OS's perspective, the MySQL database is a process that preferentially runs on one node. This is fine when it uses a small amount of memory, but when it uses most of the system's memory, a problem arises:

Since the OS will try to get you to run in a “preferred” node, memory will be allocated unevenly:

Node0 is almost full while node1 has plenty left. Since node0 and node1 are independent, memory on node0 gets swapped out even though node1 has free memory. This is the root of the problem described above.
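
The article's numbers make the imbalance easy to see. A back-of-the-envelope sketch (64 GB, 2 nodes, and the 48 GB buffer pool are from the article; the arithmetic is mine):

```python
# Back-of-the-envelope model of the MySQL case: 64 GB of RAM split
# across 2 NUMA nodes, with a 48 GB InnoDB buffer pool allocated
# (by default) on the preferred node only.
total_gb, nodes = 64, 2
buffer_pool_gb = 48

per_node_gb = total_gb // nodes               # 32 GB of RAM per node
overflow_gb = buffer_pool_gb - per_node_gb    # 16 GB too much for node0

print(f"node0 is over by {overflow_gb} GB -> swapping, "
      f"while node1 still has ~{per_node_gb} GB free")
```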

So what’s the solution?

numactl --interleave=all

Prefixing the mysqld_safe startup with this command applies the --interleave=all NUMA policy mentioned earlier. After that, memory allocation is even across the nodes, and as long as memory is sufficient there is no abnormal swapping.
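
A quick back-of-the-envelope check (again using the article's 64 GB / 2-node / 48 GB figures) shows why interleaving avoids the overflow:

```python
# Same figures as the article: 64 GB total, 2 nodes, 48 GB buffer pool,
# but now spread round-robin across the nodes by --interleave=all.
total_gb, nodes, buffer_pool_gb = 64, 2, 48

per_node_ram = total_gb // nodes          # 32 GB of RAM per node
per_node_pool = buffer_pool_gb / nodes    # 24 GB of pool on each node

fits = per_node_pool <= per_node_ram
print(f"{per_node_pool:.0f} GB of pool per node, fits in RAM: {fits}")
```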

Of course, this is only the simplest and most crude solution, and there are other better ones, which are mentioned in the original article, but are not the focus of this article, so I won’t go into them.

To summarize

Moore's Law is failing: single-CPU performance has a ceiling, so more CPUs are the future. As an aspiring programmer (a humble “brick mover”), you should at least learn something about multi-CPU system architectures; as a systems software developer, you should be familiar with them so that your applications can take full advantage of the hardware.

Welcome to follow my WeChat official account. This article is also available on GitHub: github.com/liaochangji…