Small T takeaway
The program design
The whole data collection service is developed based on MQTT protocol. The configuration and service discovery are realized through Alibaba nacOS cluster. The business data of other micro-service modules are called through Feign mode to conduct business analysis on the reported data, and the analytical results are stored in the point table corresponding to the corresponding monitoring indicators. The collection service adopts the multi-data source configuration of Mybatis – Plus, and uses the HikariCP data connection pool with the best performance at present. Two database connections of MySQL and TDengine are configured respectively to realize the seamless switch of multi-data sources. The whole collection service is based on container deployment, using Kubernetes container arrangement, through the distributed lock method to avoid repeated data entry, improve the speed of data analysis, realize the distributed group collection of MQTT.
Database selection
Our company compared Ali’s temporal and spatial database TSDB, traditional MySQL and TDengine.
-
MySQL: The query efficiency is low, and the database and table need to be divided according to the data time. The application scenario is not suitable, and there is no efficient query based on the collection frequency cycle.
-
TDengine: High query efficiency, no need to separate tables, with time dimension aggregation, streaming computing, caching and other features.
The database |
The query efficiency |
Application scenarios |
Flow calculation |
MySQL | low |
transactional | Does not support |
TDengine | high | The sequential type |
support |
Compare with other timing libraries
-
Ali TSDB: Closed source, privatization landing is more difficult
-
TDengine: Cost-effective, open source, supports cloud and on-site deployment
Especially TDengine a device a table to build a table idea, and our programming way is very fit. TDengine also has a super table design that can label different devices, such as customer ID, device ID, indicator name, etc., making it very convenient to manage multiple timelines. The entire database is also SQL interaction mode, support standard JDBC interface, can seamlessly connect to MyBastis ORM framework, very convenient development. Therefore, we finally chose TDengine cloud service as the storage database of various monitoring indicators, and it took about a week to put the system online.
Build table thinking
We are currently using the schema of the super table for data construction, and our child table is point_#{collection point unique code}. Labels are the unique identifier of the customer, unique identifier of the device, collection indicator NUMBER, indicator name, and measurement unit name.
Creating a supertable
create table point(ts timestamp,val float,flag bool) tags(cusid bigint, devid bigint, code bigint, pointname binary(20),unitname binary(20));Copy the code
Create subtables based on the super table
create table point_0301 using point tags(10000000, 301, 107, 'DUST'.'mg/Nm3'); create table point_0302 using point tags(10000000, 302, 107,'DUST'.'mg/Nm3'); .Copy the code
Such a design is very natural to collect data for the table. When querying, you can query from the super table instead of each subtable. In addition, you can select sub-tables required by specific services based on customers and collection volume during super table query. The label of a supertable is the convenient addition of an index to a child table. Super table label is support to add and delete, it has yet to finalize the design for our business logic is a good news: behind business needs change, such as the need to add a “city” for each equipment analysis dimensions, we no longer need to update all the data again, and only use them to add a label to each table.
Convenient streaming computing
Time dimensional aggregation
The business needs to aggregate and plot the year, month, day, minute, etc. of the original second level data. This requirement can be addressed directly in TDengine by sampling queries in time. When calculating the concentration exceeding standard, real-time calculation and monitoring of the average value per minute are needed. For example, in the monitoring of pollution emissions, the business requires real-time monitoring of the concentration of NOx, and allows users to select a time range to display the concentration curve and the length of the aggregation window (5 minutes, 15 minutes, 30 minutes, day, month, year). Since the data transmitted to the front-end is already calculated by the sequential database, the data volume is generally not so large, which greatly improves the efficiency compared with pulling all the original data points to the front-end for further processing.
select val from point_0301 where ts<'2020-05-30 00:00:00. 000' and ts>'2020-05-29 00:00:00. 000'interval(5m); Select val from point_0301 every 5 minuteswhere ts<'2020-05-30 00:00:00. 000' and ts>'2020-05-29 00:00:00. 000'interval(15m); Select val from point_0301where ts<'2020-05-30 00:00:00. 000' and ts>'2020-05-29 00:00:00. 000'interval(30m); // Aggregate every 30 minutesCopy the code
Flow calculation
For real-time calculations of the average volume per minute, the database can return the average volume per minute by simply defining the time window and sliding increments. For the indicators of real-time monitoring and early warning, the flow calculation can be specially established for this kind of data, and the calculation results can be written into a new table (strM_pT_0304 below) for storage, so that the results of the whole real-time calculation can also be historical review.
create table strm_pt_0304 as select avg(val) from point_0304 INTERVAL(1m) SLIDING(1m); // Calculate the average of each minute once per minuteCopy the code
Many of the numerical calculations that used to be handled in the program are now handled entirely by TDengine. On the one hand, the computational burden of the program is shared. On the other hand, more importantly, the aggregated results can be stored automatically and continuously, supporting the historical data to be viewed in real time.
A summary of the TDengine experience
The open source version of TDengine can do local test adaptation, the installation package is only 5MB, very easy to use, and the various interfaces are exactly the same as the enterprise version. But I hope to purchase the enterprise version of the customer, to provide a compatible open source version of the Web version of the query tool, convenient debugging.
Right now TDengine has fairly good support for Java drivers, but hopefully there will be more driver support for.net Core versions.
About the author:
Wang Fei, kunyue Internet of Things technology expert, has many years of.net/Java software design and development experience, familiar with industrial monitoring, environmental monitoring and other business modeling, led the overall design and development of Kunyue Internet “A environmental protection” project.