Han Wei joined NetEase in 1999 as an intern, becoming the company's 30th employee. Starting as a programmer, he spent eight years working as a project manager and product director. In the four years from 2007 he developed live-video community products and many web games. Since 2011 he has worked in the architecture planning group of the public technology center of Tencent's game R&D department, focusing on the research and development of general-purpose game technology.
In server-side program development, performance has always been a central concern. There are many frameworks, components, and class libraries that market themselves on performance. However, the basic ways a server-side program should think about performance problems are rarely addressed in the documentation of these projects. This article aims precisely to introduce the basic strategies and classic practices for server-side performance problems, divided into several sections:
- Concepts and examples of caching strategies
- The difficulty of caching strategies: eviction mechanisms for cached data with different characteristics
- Concepts and examples of distribution strategies
- The difficulty of distribution strategies: balancing shared-data safety and code complexity
The concept of caching strategies
Discussions of server-side performance problems are easily muddled, because when we access a server and the service hangs without returning data, we call it a "performance problem." In reality this problem may have different causes, all of which show up as long delays or even dropped client requests. Let's look at some of these causes. The first is insufficient concurrency: too many clients send requests at the same time, so clients beyond the server's capacity are denied service; this is typically caused by the server running out of memory. The second is long processing latency: some client requests take longer to process than users can tolerate, which usually shows up as CPU usage pegged at 100%.
In server development, the hardware we deal with most often falls into four kinds: CPU, memory, disk, and network card. The CPU represents the computer's processing time; disk space is generally plentiful, but disk reads and writes incur relatively large delays; memory and network cards are limited by storage capacity and bandwidth. When a server runs into performance problems, one or more of these pieces of hardware is at full load. The resources of these four kinds of hardware can be abstracted into two categories: time resources, such as CPU and disk reads/writes, and space resources, such as memory and network bandwidth. So when a server has performance problems, there is one basic idea: converting between time and space. A few examples illustrate this.
A dam is an example of trading reservoir space for flow time
When we visit a web site, the server turns the URL we enter into a read of a file on disk. If a large number of users visit the site, every request triggers a disk read, which may overwhelm the disk and make it impossible to serve file contents in time. But if we write the program so that a file's contents, once read, are kept in memory, then a later request for the same file can be answered directly from memory without touching the disk at all. Because the files users access tend to be concentrated, a large share of requests hit a copy already held in memory, greatly increasing the number of visits the server can handle. Here memory space is exchanged for disk read time: a space-for-time strategy.
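A minimal sketch of this idea in Python (the function and cache names here are ours, not from any particular web server):

```python
# Space-for-time: keep file contents read once in memory so that
# repeat requests for the same file never touch the disk.
_file_cache: dict[str, bytes] = {}

def read_file_cached(path: str) -> bytes:
    data = _file_cache.get(path)
    if data is None:
        # Cache miss: pay the disk-read cost once, then keep a copy.
        with open(path, "rb") as f:
            data = f.read()
        _file_cache[path] = data
    return data
```

A real server would also need a way to evict or refresh entries, which is exactly the difficulty the eviction section of this series deals with.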
Instant noodles have most of the cooking done in advance
Take another example: we write a server-side program for an online game that persists player data by reading and writing a database. If a large number of players enter the server, there will be many changes to player data, such as leveling up and acquiring weapons. If every one of these operations went through the database, the database process could be overloaded and players would not get results in time. We will find that most read operations in a game target static data, such as level data and the detailed attributes of weapons and items, while many write operations overwrite one another: my experience points may increase by a few dozen every time I kill a monster, but in the end only the final total is recorded, not every intermediate fight. So we can apply the same space-for-time strategy here: read all of the static game data into memory once, so that subsequent reads never touch the database; and instead of writing player data to the database on every change, keep a copy of each player's data in memory, apply all writes to that in-memory structure first, and write it back to the database periodically. This turns many database writes into one, saving a great deal of database load. This, too, is a space-for-time strategy.
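A sketch of the periodic write-back idea, assuming in-memory dictionaries and a hypothetical `save_to_db` persistence function (a real server would also flush on shutdown and handle crash recovery):

```python
import threading
import time

players: dict[int, dict] = {}   # in-memory copies of player data
dirty: set[int] = set()         # player ids changed since the last flush
lock = threading.Lock()

def add_experience(player_id: int, points: int) -> None:
    # Writes only touch memory; repeated overwrites collapse into one.
    with lock:
        data = players.setdefault(player_id, {})
        data["exp"] = data.get("exp", 0) + points
        dirty.add(player_id)

def flush_loop(save_to_db, interval: float = 5.0) -> None:
    # Periodically write back only the players that actually changed,
    # turning many small database writes into one batched pass.
    while True:
        time.sleep(interval)
        with lock:
            to_save = {pid: dict(players[pid]) for pid in dirty}
            dirty.clear()
        for pid, data in to_save.items():
            save_to_db(pid, data)
```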
Flat-pack furniture saves shipping space, but takes time to assemble
The last example is time for space. Suppose we are developing a storage system for a corporate address book. The customer requires that every addition, modification, or deletion preserve the full change history, so that the data can be rolled back to any past point in time. The simplest approach is to make a full copy of the data every time it changes. But that wastes disk space: the data itself may change only a little, while each copy is large. Instead, we can record one entry per change describing only what changed: "inserted contact XXX," "deleted contact XXX," and so on. Now we record only the changed parts and never copy the whole data set. To restore to any point in time, we replay these records one by one up to the record at the specified moment. Recovery takes longer, but it saves a great deal of storage. This is a strategy of trading CPU time for disk space. MySQL's InnoDB log, as well as SVN's source code storage, uses this strategy.
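A toy version of this change-log approach in Python (the record format is invented for illustration):

```python
# Time-for-space: store one small record per change instead of a full
# copy of the address book, and replay records to reach any past state.
changelog: list[tuple[float, str, str, str]] = []  # (timestamp, op, name, phone)

def record(ts: float, op: str, name: str, phone: str = "") -> None:
    changelog.append((ts, op, name, phone))

def restore_at(point_in_time: float) -> dict[str, str]:
    # Recovery pays CPU time: replay every change up to the given moment.
    book: dict[str, str] = {}
    for ts, op, name, phone in changelog:
        if ts > point_in_time:
            break
        if op in ("insert", "update"):
            book[name] = phone
        elif op == "delete":
            book.pop(name, None)
    return book
```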
In addition, when a web server sends HTML content, it usually compresses it (typically with gzip) before sending it to the browser, and the browser must decompress it before it can render the page. Here the CPU time of both server and client is exchanged for network bandwidth.
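A tiny illustration of this trade using Python's standard library:

```python
import gzip

# CPU-for-bandwidth: both ends spend CPU so fewer bytes cross the wire.
html = b"<html><body>" + b"hello world " * 1000 + b"</body></html>"
compressed = gzip.compress(html)            # server spends CPU here
print(len(html), "->", len(compressed))     # far fewer bytes to send
assert gzip.decompress(compressed) == html  # browser spends CPU here
```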
In computer systems the idea of caching is almost everywhere. For example, the CPU has L1 and L2 caches, which use small amounts of fast storage to avoid the wait time of relatively slow main memory. Graphics cards likewise carry large amounts of cache memory for storing rendering results.
Suburban roads leading to large spaces are prone to traffic jams
The essence of caching, besides "don't reprocess data that has already been processed," is "read and write from fast storage instead of slower storage." When we choose a caching strategy to convert between time and space, we must be clear whether the trade is actually reasonable and will have the intended effect. For example, in the early days some people cached web files on network-mounted disks (such as NFS); but since accessing a disk over the network is itself a slow operation, and consumes network bandwidth that may already be scarce, performance can end up even worse.
Another risk we encounter when designing a caching mechanism is how the program handles the cached data. If the cached data is not used directly as raw bytes but must be read into memory and processed through some language structure or object, "serialization" and "deserialization" come into play. If we cache data by directly copying memory, then when the data is accessed across processes, or even across languages, any pointer, ID, and handle fields become invalid, because those "token" values do not exist in the other process's address space. One deeper approach to caching such data is a deep copy: follow the pointers, find the target data, and copy everything together. A more modern approach is serialization: define each structure with a well-defined "copy method," so the author knows the data will be copied and excludes memory-address data such as pointers. The well-known Protocol Buffers, for example, makes it easy to cache data in memory, on disk, or across the network; JSON, now ubiquitous, is also used by some systems as a cache data format.
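A small sketch of serialization-based caching in Python, using JSON as the cache format (the `Item` structure is invented for illustration):

```python
import json
from dataclasses import dataclass, asdict

# The dataclass defines exactly which fields get copied; no pointers or
# handles leak into the cached bytes, so another process (or language)
# can read them back without sharing our address space.
@dataclass
class Item:
    item_id: int
    name: str
    damage: int

def to_cache(item: Item) -> bytes:
    return json.dumps(asdict(item)).encode()   # serialize

def from_cache(blob: bytes) -> Item:
    return Item(**json.loads(blob))            # deserialize

blob = to_cache(Item(1001, "iron sword", 7))
assert from_cache(blob) == Item(1001, "iron sword", 7)
```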
Note, however, that the cached data and the data the program actually operates on usually still require some copying and conversion between them. This is the serialization/deserialization step, and it can be very fast or very slow. When choosing a cache data structure, pay attention to its conversion cost; otherwise the copying and conversion overhead may eat up the benefit of the cache, or even make things worse than having no cache at all. In general, the closer the cached format is to the in-memory structure it maps to, the faster the conversion. Protocol Buffers, for example, is encoded as TLV: not as fast as a direct memcpy of a C struct, but much faster than XML or JSON encoded as plain text, because text encoding and decoding involve complex table lookups, list handling, and other operations.
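A rough way to see this conversion cost, sketched in Python: a fixed-layout binary encoding (closest in spirit to a C-struct memcpy) against plain-text JSON. Absolute numbers depend on the data and the runtime, but the ordering described above generally holds:

```python
import json
import struct
import timeit

record = {"item_id": 1001, "damage": 7, "level": 42}

def encode_binary() -> bytes:
    # Fixed binary layout: three little-endian ints, minimal per-field work.
    return struct.pack("<iii", record["item_id"], record["damage"], record["level"])

def encode_text() -> bytes:
    # Plain text: formatting and escaping do far more work per byte.
    return json.dumps(record).encode()

for name, fn in [("binary", encode_binary), ("json", encode_text)]:
    print(name, f"{timeit.timeit(fn, number=100_000):.3f}s per 100k encodes")
```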