1: APISIX high-performance practices, revisited
Accelerating ngx.var
The easiest way to speed up access to Nginx variables is to use the iresty/lua-var-nginx-module library and compile it into OpenResty as a Lua module. Whenever we would read ngx.var, we use the method this library provides instead. In APISIX this improved overall performance by about 5%, and fetching a single variable can be at least 10x faster. The module can also be compiled as a dynamic library and loaded at runtime, so there is no need to recompile OpenResty.
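Based on the usage shown in that repository's README, fetching a variable looks roughly like this (resty.ngxvar, with its request() and fetch() functions, is the module's Lua entry point):

```lua
-- Read variables via the FFI-backed lua-var-nginx-module
-- instead of the regular ngx.var metamethod lookup.
local var = require("resty.ngxvar")

local function handler()
    local req = var.request()                    -- current request object
    local host = var.fetch("host", req)          -- same as ngx.var.host
    local addr = var.fetch("remote_addr", req)   -- same as ngx.var.remote_addr
    ngx.say(host, " ", addr)
end

return { handler = handler }
```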
The APISIX gateway reads a lot of variable information from ngx.var, and variables such as the host address may be fetched repeatedly, while every interaction with Nginx is relatively expensive. So we added a ctx-level cache in apisix/core: a variable is fetched from Nginx only on first access, and subsequent reads hit the cache directly.
The code in apisix/core is generic and should be useful for most projects.
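A trimmed sketch of the caching idea (the production code lives in apisix/core/ctx.lua; this version keeps only the metatable that does the caching):

```lua
local setmetatable = setmetatable
local rawset = rawset
local var = require("resty.ngxvar")

-- __index fires only on a cache miss: fetch from Nginx once,
-- then store the value in the table itself for later reads.
local mt = {
    __index = function(t, key)
        local val = var.fetch(key, t._request)
        if val ~= nil then
            rawset(t, key, val)
        end
        return val
    end,
}

local _M = {}

-- Attach the per-request variable cache as ctx.var.
function _M.set_vars_meta(ctx)
    ctx.var = setmetatable({ _request = var.request() }, mt)
end

return _M
```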
When JSON encoding fails
JSON-encoding a table can fail for several reasons: the table may contain cdata or userdata that cannot be encoded, it may contain functions, and so on. In practice we do not always need a result that survives perfect serialization and deserialization; sometimes we just want something readable for debugging.
So I added a boolean parameter to APISIX's core JSON encode that says whether to force the encoding: anything that cannot be encoded is then turned into a string. Another common situation is circular nesting, that is, a table A that somewhere inside references A itself. The fix is simple: once nesting reaches a certain depth, stop recursing. These two behaviors make force-encode very useful for development and debugging.
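A minimal sketch of what such a force-encode can look like, assuming cjson and a fixed depth cut-off; this is a simplified take on the idea, not APISIX's exact code:

```lua
local cjson = require("cjson.safe")

local MAX_DEPTH = 4  -- cut off here to break circular references

local function serialise(data, depth)
    local t = type(data)
    if t == "function" or t == "userdata" or t == "cdata" then
        return tostring(data)    -- cjson cannot encode these
    end
    if t ~= "table" then
        return data
    end
    if depth >= MAX_DEPTH then
        return tostring(data)    -- e.g. "table: 0x7f..." instead of recursing
    end

    local copy = {}
    for k, v in pairs(data) do
        -- cjson only accepts string/number keys
        if type(k) ~= "string" and type(k) ~= "number" then
            k = tostring(k)
        end
        copy[k] = serialise(v, depth + 1)
    end
    return copy
end

-- force=true trades fidelity for a result that always encodes
local function json_encode(data, force)
    if force then
        data = serialise(data, 0)
    end
    return cjson.encode(data)
end
```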
When debugging, you may want to print a table, but if the log level is not high enough the log line is discarded, and a JSON encode done up front is wasted work. For that case use delay_encode: the encoding is triggered only when the log line really needs to be written to disk, so entries that are filtered out cost nothing. This works very well in APISIX: you can finally leave logging at different levels in the code without commenting it out. It is a bit like a C macro, a nice balance between performance and ease of use.
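The delay trick itself can be sketched as a placeholder object whose __tostring metamethod performs the encode (mirroring the delay_encode idea in apisix/core/json.lua, with the force flag from the previous sketch omitted for brevity):

```lua
local cjson = require("cjson.safe")

-- One reusable placeholder; the encode runs inside __tostring, i.e.
-- only at the moment the log line is actually formatted.
local delay_tab = setmetatable({ data = nil }, {
    __tostring = function(self)
        local res, err = cjson.encode(self.data)
        return res or ("| json encode failed: " .. tostring(err))
    end,
})

local function delay_encode(data)
    delay_tab.data = data
    return delay_tab
end

-- ngx.log stringifies table arguments through __tostring, and it
-- returns early when the level is filtered out, so a discarded
-- debug line never pays for the encode:
-- ngx.log(ngx.DEBUG, "matched route: ", delay_encode(route))
```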
2: OpenResty community sharing: high-performance practices for APISIX
Tip 9: The right way to use lrucache
A brief introduction to lrucache: it caches and reuses data within a single worker, and its great advantage is that it can store any Lua object. Shared memory, by contrast, shares data between different workers, but it can only store simple types; objects such as functions and cdata cannot be shared across workers.
Our second-layer encapsulation of lrucache includes:
- Keep keys short and simple: a badly designed key is long yet carries little distinguishing information. Ideally the key is a plain string, although it can also be an object such as a table; either way, make it as clear as possible and include only the parts you actually care about.
- Use a version to reduce garbage caching: this is a pattern I worked out in APISIX. Pulling the version out of the key (lrucache + version) greatly reduces the amount of garbage kept in the cache.
- Reuse stale cache data.
How does the version reduce garbage caching? Without it, the version has to be baked into the key, so every version change mints a brand-new key; the obsolete entries are never removed, and the number of objects held by lrucache keeps growing. With the version stored alongside the value, one key maps to exactly one table, no per-version duplicates are created, and the total number of cached objects stays down.
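A condensed sketch of the lrucache + version pattern (the production helper is apisix/core/lrucache.lua; the names here are simplified):

```lua
local lrucache = require("resty.lrucache")

local cache = assert(lrucache.new(1024))  -- at most 1024 items per worker
local TTL = 60                            -- seconds

-- key is stable (e.g. a route id); version changes when the source
-- data changes (APISIX uses the etcd modifiedIndex, for example).
local function fetch(key, version, create_obj, ...)
    local item = cache:get(key)
    if item and item.ver == version then
        return item.val                   -- hit with a matching version
    end

    -- miss or stale version: rebuild and overwrite the same key, so
    -- the outdated entry is replaced instead of lingering as garbage
    local val, err = create_obj(...)
    if not val then
        return nil, err
    end

    cache:set(key, { val = val, ver = version }, TTL)
    return val
end
```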
3: Shared memory of worker processes
The shared memory of worker processes, as the name implies, is memory whose data is shared by all Nginx worker processes: if one worker modifies the data, the others see the modified value. We can use it to store cached data, which improves throughput and reduces the number of requests reaching the backend. It has the following characteristics:
1. Data in a shared memory zone is shared by all worker processes, so once it is modified, every worker uses the new data; locks are taken when workers read or modify the same shared data concurrently.
2. The size of a shared memory zone is pre-allocated. When the space is used up, the Least Recently Used (LRU) algorithm evicts the entries that have been accessed least recently.
3. Multiple shared memory zones can be configured in Nginx.
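A minimal example of the mechanism (the zone name cache is arbitrary): the zone is declared once in nginx.conf, and every worker then reads and writes it through ngx.shared:

```lua
-- nginx.conf (http block) pre-allocates the zone:
--   lua_shared_dict cache 10m;

local dict = ngx.shared.cache

-- values are limited to booleans, numbers, strings and nil
local ok, err = dict:set("upstream_status", "healthy", 30)  -- 30s TTL
if not ok then
    ngx.log(ngx.ERR, "failed to set key: ", err)
end

local status = dict:get("upstream_status")  -- visible from any worker
```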
Shared memory makes caching data very simple and comes with a rich set of methods, but it also has some disadvantages:
1. Lock contention between worker processes adds performance overhead under high concurrency.
2. Only Lua booleans, numbers, strings, and nil are supported. Table data is not supported.
3. Because tables cannot be stored directly, complex data has to be serialized on write and deserialized on read, which adds CPU overhead.
Data in ngx.shared shared memory survives an Nginx reload
APISIX has a piece of global configuration that is handled through lrucache; for the specific implementation, see the lrucache material below.
4: Caching at the Lua module level
lua-resty-lrucache is a caching tool based on ngx_lua with the following advantages:
1. Richer data types are supported, and values can be tables, which is very useful for businesses with complex data structures.
2. What is pre-allocated is the number of items rather than a fixed block of memory, so memory usage is more flexible.
3. Each worker process caches independently, so there is no lock contention when workers read the same key at the same time.
Compared with lua_shared_dict, it also has some disadvantages:
1. Since data is not shared between workers, there is no guarantee that every worker process sees exactly the same value at the same moment after an update.
2. Although complex data structures are supported, far fewer operations are available; there is nothing like the message-queue methods, for example.
3. Cached data is lost when the Nginx configuration is reloaded; this does not happen with lua_shared_dict.
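For contrast, basic lua-resty-lrucache usage follows the module's documented API:

```lua
local lrucache = require("resty.lrucache")

-- Pre-allocate room for up to 200 items; no fixed memory size needed.
local cache, err = lrucache.new(200)
if not cache then
    error("failed to create lrucache: " .. (err or "unknown"))
end

-- A table value can be stored as-is: no serialization on write and
-- no deserialization on read, unlike ngx.shared.DICT.
cache:set("route_1", { host = "example.com", weight = 10 }, 60)

local route = cache:get("route_1")  -- per worker: other workers miss
```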
5: How to choose between shared.dict and lrucache?
shared.dict uses shared memory, and every operation takes a global lock, so under high concurrency different workers easily contend with each other; a single shared.dict therefore should not be too large. lrucache lives inside each worker, and since a worker runs its Lua code in a single thread, it never triggers locks, which is an advantage in efficiency. It also has no preset size limit the way shared.dict does, so its memory use is more flexible. The trade-off is that the same data may be cached redundantly in every worker.
One thing to consider is the API surface: lua-resty-lrucache provides relatively few methods, essentially just get, set, and delete, while ngx.shared.DICT additionally offers add, replace, incr, get_stale (which returns the old value after a key has expired), get_keys (which lists all keys; not recommended, but there if your business needs it), and more. The second consideration is memory footprint: since ngx.shared.DICT is shared among workers, its memory usage is comparatively small when running many workers.
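get_stale, for instance, makes the earlier "reuse stale cache data" tip easy to apply. A sketch, reusing the cache zone from the example above; fetch_from_backend is a hypothetical refresher function:

```lua
local dict = ngx.shared.cache

local function get_with_stale_fallback(key)
    local val = dict:get(key)
    if val then
        return val                          -- live value, nothing to do
    end

    -- get_stale also returns entries whose TTL has already expired
    local stale_val, _, is_stale = dict:get_stale(key)

    local fresh, err = fetch_from_backend(key)  -- hypothetical refresher
    if fresh then
        dict:set(key, fresh, 30)
        return fresh
    end

    if is_stale then
        return stale_val    -- serve the expired copy instead of failing
    end
    return nil, err
end
```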
Appendix
While putting this article together I ran into all kinds of error status codes; for reference, see: HTTP response codes.