You have probably heard that one of TDengine’s biggest advantages in Internet of Things big data scenarios is its write speed, which comes from TDengine’s unique design. However, some users find that write performance falls short of their expectations when they first use TDengine. Some of them use TDengine directly on the server, some use it from a client, and some use one of the various connectors, so the situations differ widely.
So how exactly do you solve these different kinds of write “slowness”? The approach we recommend is to first make sure the server’s parameters and configuration are well tuned, and then rule out hardware, software, and network factors layer by layer.
For users who access TDengine through connectors, performance problems are somewhat harder to pin down, because more modules and the network are involved. Even then, the database itself must be well configured first. A report such as “my XXXX connector writes data very slowly” is imprecise and makes troubleshooting harder.
So no matter how complex your scenario is, first check the performance of TDengine on the server side. The quickest way is to run the official tool taosdemo on the server and see whether the write speed is roughly the same as what you observed. This tells you right away whether the problem lies in the database server itself. After that, you can investigate the rest in a targeted way, for example the network communication between the server and the client, or an abnormality inside the client.
Run “taosdemo --help” to see how to use taosdemo. The tool lets you set the number of tables, the number of rows per table, the data types, and the number of rows inserted in a single batch, and it can even simulate out-of-order writes and other advanced behavior, so that you can approximate the scenarios you meet in real work. When run, taosdemo automatically generates the data, performs the writes, and reports performance figures at the end.
If writes are still slow with taosdemo, try the following optimizations:
- To improve write efficiency, do not write only one record per INSERT, which wastes resources. Have a single SQL statement write multiple records: the more records written at once, the more efficient the insert. A single statement can also write records to several tables, for example: INSERT INTO tb1 VALUES (…) tb2 VALUES (…) tb3 VALUES (…). However, a record cannot exceed 16K, and the total length of an SQL statement cannot exceed 64K (configurable with the maxSQLLength parameter, up to a maximum of 1M; see the official documentation for details). When simulating with taosdemo, use the -r parameter to specify the number of rows written in a single INSERT (run taosdemo --help to confirm the flag for your version), but make sure the resulting SQL length does not exceed maxSQLLength.
Note: in the latest version (2.1.0), taosdemo sets maxSQLLength to 1MB by default, so no adjustment is needed.
- TDengine supports multi-threaded writing. To further improve write speed, a client with a 12-core CPU can run more than 20 writer threads at the same time. Beyond a certain number of threads, however, the speed stops improving and may even drop because of the overhead of frequent thread context switches. When simulating with taosdemo, specify the number of threads with -T (run taosdemo --help to confirm the flag for your version); a value equal to the number of CPU cores, or double that, usually works best.
- When a large amount of data floods into memory and waits to be flushed to disk, the memory settings of each vnode (virtual data node) become particularly important. The cache and blocks parameters are the size and the number of memory blocks, respectively, and their product is the amount of memory the vnode reserves. In many scenarios the default number of blocks is not enough. When the write cache is too small, data cannot be flushed in one go and is written to the .last temporary file instead, which increases disk I/O. In that case, increase blocks (to a multiple of 3) according to your scenario, check whether the write speed improves, and check that memory is neither the bottleneck nor pushed beyond a safe usage range, until you find the best setting.
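The batching advice in the first item can be sketched in Python. This is only a minimal illustration that builds SQL strings and does not talk to TDengine: the function name build_insert_batches and the 64,000-byte default budget are hypothetical choices for the example (set the budget at or below your own maxSQLLength), and rows are assumed to be already formatted as value tuples.

```python
def build_insert_batches(rows_by_table, max_sql_len=64000):
    """Pack rows for one or more tables into as few INSERT statements as fit.

    rows_by_table: dict mapping table name -> list of value strings,
                   each already formatted like "(1600000000000, 23.5)".
    max_sql_len:   hypothetical byte budget per statement; keep it at or
                   below the server's maxSQLLength setting.
    A single row larger than the budget is still emitted on its own.
    """
    prefix = "INSERT INTO"
    statements = []
    current = prefix
    open_table = None  # table whose VALUES clause is currently open

    for table, rows in rows_by_table.items():
        for row in rows:
            # Continue the open VALUES clause if the table is unchanged.
            if table == open_table:
                piece = " " + row
            else:
                piece = f" {table} VALUES {row}"
            too_long = len((current + piece).encode("utf-8")) > max_sql_len
            if too_long and current != prefix:
                # Flush the full statement and start a new one.
                statements.append(current)
                current = prefix
                piece = f" {table} VALUES {row}"
            current += piece
            open_table = table

    if current != prefix:
        statements.append(current)
    return statements


rows = {"tb1": ["(1, 10)", "(2, 11)"], "tb2": ["(1, 20)"]}
for sql in build_insert_batches(rows):
    print(sql)
# prints: INSERT INTO tb1 VALUES (1, 10) (2, 11) tb2 VALUES (1, 20)
```

Measuring the statement in bytes rather than characters keeps the check conservative when values contain non-ASCII text.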
See the official taosdemo manual for the full list of options.
Throughout this process, change one variable at a time and observe where the performance bottleneck appears: is the pre-allocated memory insufficient, is the operating system short of memory, is the CPU saturated, or has the disk reached its read/write limit? Then make a targeted adjustment.
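As a rough aid for the memory question, the arithmetic can be written down explicitly. The sketch below assumes that cache is expressed in MB and that every vnode uses the same settings; the function names, the safe_fraction threshold, and all numbers are made-up illustrative values, not defaults or recommendations.

```python
def reserved_write_cache_mb(cache_mb, blocks, vnodes):
    """Memory reserved for write caching: cache * blocks per vnode, times vnodes."""
    return cache_mb * blocks * vnodes


def fits_in_memory(cache_mb, blocks, vnodes, total_ram_mb, safe_fraction=0.5):
    """Check the reservation against a hypothetical share of total RAM.

    safe_fraction is an assumption for this example: the part of physical
    memory you are willing to dedicate to the vnode write caches.
    """
    reserved = reserved_write_cache_mb(cache_mb, blocks, vnodes)
    return reserved <= total_ram_mb * safe_fraction


# Illustrative values only: 16 MB blocks, 8 vnodes, 4 GB of RAM.
for blocks in (6, 12, 48):
    reserved = reserved_write_cache_mb(16, blocks, 8)
    print(blocks, reserved, fits_in_memory(16, blocks, 8, 4096))
```

Running the loop for a few candidate values of blocks before changing the configuration makes it easier to see when a larger write cache would start to crowd out the operating system.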
After these adjustments, users who run TDengine directly on the server are usually in good shape. Note that you should not focus too much on how many rows per second TDengine writes, because the number of columns (measurement points) per row and the data type of each column vary from one scenario to another, so rows per second are not directly comparable.
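One way to see why rows per second can mislead is to convert it into measurement points per second. The helper below is a hypothetical illustration with invented numbers; it ignores column data types, and whether to count the timestamp column is left to the caller.

```python
def points_per_second(rows_written, columns_per_row, elapsed_seconds):
    """Convert a row count into measurement points written per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return rows_written * columns_per_row / elapsed_seconds


# Invented figures: a narrow table versus a wide table, both timed over 2 s.
narrow = points_per_second(1_000_000, 5, 2.0)   # 500,000 rows/s
wide = points_per_second(400_000, 20, 2.0)      # 200,000 rows/s
print(narrow, wide)
# prints: 2500000.0 4000000.0
```

In this made-up comparison the wide table writes fewer rows per second but more measurement points per second, which is why the two row rates should not be compared directly.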
For users using connectors or clients, if this hasn’t solved your write performance problems, let’s look at the network, client, and application layers.
Because the application or client reaches the server over the network, the network is a performance factor that cannot be ignored. Here is a typical example:
A user encountered this situation when using the C interface to insert data: the insertion speed was sometimes fast, sometimes slow, and generally much slower than expected.
Together with the user, we tuned the memory, the number of threads, and the number of rows written per SQL statement, but the problem remained. We then reviewed the logs together and found warnings about network communication. Combined with the server’s monitoring data, our preliminary diagnosis was that the slow writes were caused by routing congestion in the network. The user then rebuilt the network topology as a LAN using only local virtual machines, and the problem disappeared.
In addition, the TDengine client does a lot of work: it retrieves and caches metadata; forwards insert, query, and other requests to the correct data node; and performs the final level of aggregation, sorting, filtering, and so on before results are returned to the application. So when performance lags, take a look at the resource usage of the client machine, and you may be surprised. That’s right: the client can also become the bottleneck for write performance. In that case, upgrade the client machine or add more client machines as appropriate.
Finally, the connector is logically a very thin layer, so its performance is mostly determined by the characteristics of the language itself, for example Java being slower than C. Unless you run into a bug, there is generally little room for optimization in the connector.
OK, that’s about it. After reading this, you should have a rough idea of how to optimize TDengine for write performance. And if you do need official technical support later, you will be able to describe the problem precisely, which makes the experience much more pleasant for both sides. Many new users of the community version of TDengine, because of tight work schedules or other reasons, do not have the patience to read the documentation step by step to understand the product.
That is where this article can play its role.