Author | Ding LAN Water – Zhu Jiawei

Small T introduction: Ding LAN water is a professional environmental service company committed to urban water environment comprehensive remediation and planning, for the national high-tech certification enterprise. Founded in 2005, the company is the earliest technology-oriented enterprise engaged in the introduction, promotion and practice of urban drainage network operation and maintenance technology and equipment in China. Today, Ding LAN Water has been intensively cultivated for more than ten years, and has developed into an intelligent water enterprise integrating planning consulting, equipment manufacturing, technology research and development, system integration and operation and maintenance of the whole process personalized menu services.

The business scenario

In an intelligent drainage system, the liquid level in the drainage device needs to be monitored. Current level sensors are usually deployed in the monitored area to collect current level frequencies. The information collected by the sensor is transmitted to the gateway through MQTT protocol. After the gateway collects the information, it will be periodically pushed to the business background service for data storage and real-time analysis by the service layer. The equipment operation data is stored in SQLServer, and the unimportant but abundant data such as liquid level current frequency is stored in Taos.

The whole process of intelligent drainage system

The whole intelligent drainage system consists of several modules, including data acquisition layer, data transmission layer, data platform layer and business processing layer. The specific processing process is shown in the figure below:

In current application scenarios, massive data comes from the data reported by the shield detection device. In the processing process of these monitoring data, there will be a judgment whether real-time data after data is pushed from the gateway. For non-real-time data, it will be passed through Redis to redo, make alarm judgment and then write SqlServer. Real-time data is written directly to TDengine without Redis. Later, front-end requirements such as liquid level current data can be accessed directly from TDengine. Before this, all data are stored by SQLServer, found that the query delay of SQLServer has been very slow after the data amount reaches 20 million, have to do sub-database sub-table operation to improve the query speed, but this solution encountered cross-database cross-table query is very inconvenient.

Data modeling in TDengine

TDengine is a sequential database specially designed for structured data flow of the Internet of Things. Its database and table construction approach is completely different from that of relational database, following the principle of data flow isolation. The data generated by iot devices is a stream of data generated in chronological order. In the intelligent drainage system, we often care about the change trend or the maximum and minimum value of a certain level meter in a period of time.

When storing sequential data in a relational database such as SQLServer, it is common to put data from all devices of the same class into the same table, and each record contains the device ID, data collection (or warehousing) timestamp, collected values, and speed up the table by time range.

The disadvantage of this approach is that the query is cumbersome and inefficient. To query the data of a device from a relational database, you need to filter the data of other devices from the large table by the device ID. In addition, every device queried has the cost of filtering the data of other devices.

TDengine is designed with one data source (device) and one table, and message flows from each data source in chronological order can flow into one table without mixing with other data flows. The primary key of the table is the timestamp from which the data record was collected or entered, and the other fields are the collected values.

In this way, when querying the data of a certain device in TDengine in a specified period, the query is simplified to find the table of the device and filter the search data records according to the primary key (timestamp), which greatly improves the efficiency.

In addition, in TDengine, we create super tables to facilitate the management of static information on devices, and define device ids and groups as tags.

Specific to our scenario, our ideas for building databases and tables are as follows:

  • The database

The database is mainly created according to the service module, and the pump station, pump gate and other reported data are stored separately.

name | created time | ntables | vgroups |replica| days | keep1,keep2,keep(D) | tables | rows | cache(b) | ablocks |tblocks| ctime(s) | clog | comp |time precision| status | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =log| 19-09-06 10:36:33. 888 | 3 | | 1 | | 10 to 30,30,30 32 | | 1024 | | 2048 | | 3600 | 32 2.00000 1 | 2 | us | ready | HGM | 19-09-08 22:42:58. 873 12 | | 1 | 1 | 10 3650365 0365 | 1000 | | 4096 | 16384 | | 100 | | 3600 | 1 4.00000 2 | | ms ready | JLJ | 19-09-08 23:01:12. 251 | | | 1 | 10 3650365 0365 | | 1000 | 4096 | 16384 | | 100 | | 3600 | 1 4.00000 2 | | ms ready | bengzhan | 19-09-08 23:01:21. 895 | 93 | | 1 | 10 3650365 0365 | | 1000 | 4096 | 16384 | | 100 | 3600 4.00000 | 1 | 2 | | ms ready | bengzha | 19-09-08 23:01:29. 272 15 | | 1 | 1 | 10 3650365 0365 | 1000 | | 4096 | 16384 | | 100 | | 3600 | 1 4.00000 2 | | ms ready |Copy the code

  • Super table

The structure of the super table is very simple. The collection fields are the timestamp TS and the collection value val. However, two tags are defined here to describe specific point-point static information.

taos> describe rtdata;                             Field                              |      Type      |  Length   |  Note  |=======================================================================================================ts                                                              |TIMESTAMP       |          8|        |val                                                             |FLOAT           |          4|        |sncode                                                          |BINARY          |         20|tag     |pointcode                                                       |BINARY          |         10|tag     |Copy the code

taos> show stables; name | created time |columns| tags | tables |====================================================================================================================rtd Ata | 19-09-09 18:15:42. 243 | 2 | 2 | | 12Copy the code

  • Regular tables

taos> show tables; table_name | created time |columns| stable | = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = sch_8_n_level | 18:24:22. 19-09-09 655 | 2 | rtdata | sch_7_n_level | 19-09-09 18:24:22. 620 | 2 | rtdata | sch_6_n_level | 18:24:22. 19-09-09 585 | 2 | rtdata | sch_29_n_level 11:35:01. | 19-10-30 | 292 2 | rtdata | sch_5_n_level | 19-09-09 18:24:22. 550 | 2 | rtdata | sch_4_n_level | 18:24:22. 19-09-09 514 | 2 | rtdata | sch_30_n_level | 19-10-29 14:54:53. 446 | 2 | rtdata | sch_3_n_level | 18:24:22. 19-09-09 479 | 2 | rtdata | sch_2_n_level | The 19-09-09 18:24:22. 442 | 2 | rtdata | sch_10_n_level | 18:24:25. 19-09-09 338 | 2 | rtdata | sch_1_n_level | 19-09-09 18:24:22. 408 | 2 | rtdata | sch_9_n_level | 18:24:22. 19-09-09 691 | 2 | rtdata |Copy the code

The advantage of TDengine

High write speed

TDengine’s data writing performance is very high. The current number of devices in the access gateway is 23 levels, 9 frequency meters and 9 current meters. A total of 10 shields and 3 pumping stations. Three interceptor Wells. One pump gate, 40 tables in total, each table adding 60,000 pieces of data per day, now about 3 million pieces of data per table. The biggest benefit of using TDengine is that there is no need to consider the database and table operations in SQLServer, and the query delay does not increase after data is continuously written for a month.

Down-sampling query

TDengine provides a very useful feature for monitoring device data analysis – interval, that is, sampling down by time window. For example, in a drainage system, we often need to calculate the level of the shield every 20 minutes. This analysis can be easily implemented by the following SQL statement.

select avg(val) from sch_3_n_level where ts > '2019-10-15 00:00:00' and ts < '2019-11-15 00:00:00' interval(20M);Copy the code

The latest data show that

In the current monitoring system, the need for a large screen real-time display of the latest level meter readings. This is where TDengine’s built-in caching comes in. TDengine allocates a certain amount of cache space in memory for each table to store the hottest records, so real-time query efficiency is very high. The query syntax uses last and last_row.

select last_row(*) from rtdata group by sncode;Copy the code

Then, VUE and Hightopo are used to construct the configuration visual interface to display the latest liquid level data in real time. The gateway outputs the latest liquid level, the back end pushes it to the front end, and hightopo displays the liquid level on the 3D page. The result is also very cool.

Low memory and high compression ratio

In use, TDengine was found to handle 13-device, million-magnitude data write tasks with a memory overhead of only 1.5GB. Memory growth is not noticeable when querying. The overall memory overhead is 50% lower than SQLServer.

After the data falls to the disk, check the size of the data file in /var/lib/taos. The original 5GB data is only 80MB after being compressed by TDengine, and the compression ratio is 1.6%.

Vision of the future

Overall, TDengine is excellent at handling the throughput, resource overhead, and compression of structured sequential data collected at high frequency. However, TDengine is not able to do deduplication and alarm like Redis at present. In this part, we will try to use TDengine’s own streaming computing and subscription to try to achieve alarm, to see if it can further simplify the architecture and save the work of Redis and SQLServer. Hopefully TDengine will have more support in this area and it will get better and better!

Author profile: Zhu Jiawei, senior full-stack R&D engineer of Dinglan Water, open source community lover, currently mainly responsible for the development and maintenance of real-time water level monitoring system in dinglan Water intelligent drainage system.

Originally published: blog.csdn.net/weixin_4399…