Introduction | Tars is a microservice RPC framework that was open-sourced by Tencent and donated to the Linux Foundation. The launch of TarsBenchmark further rounds out the Tars ecosystem. It supports online load testing, which greatly lowers the barrier for developers and testers who want to evaluate service performance. This article is compiled from a talk given by Chen Linfeng, an expert engineer at Tencent, at the Cloud+ Community online salon. We hope it is useful to you.


What is TarsBenchmark

1. Common load-testing tools

Before a backend service goes online, we usually evaluate its performance. For example, how many users can it serve, and how much concurrency can it handle?

(1) Apache Bench

The classic performance evaluation tool AB, short for Apache Bench, is a load-testing tool for HTTP services and a staple for testing many web applications. It is written in C, easy to use, and performs well.

Its main shortcoming is that it is single-threaded. Many of our backend servers are multi-core, so AB cannot make full use of the machine it runs on.

(2) Wrk

A newer load-testing tool, Wrk, makes up for AB's single-threaded design. Its networking is event-driven and performs well on network I/O.

It also supports scripting: you can generate randomized request content in Lua. This matters because most backend services are now deployed in distributed mode. If a single request is replayed over and over, it tends to concentrate load on one machine and does not reflect the real capacity of the cluster. Wrk improves on AB in both respects, so it is very widely used.

(3) GHZ

In the microservice world there is another typical load-testing tool, GHZ, which targets the gRPC framework. It is written in Golang, so it handles concurrency well.

Its test cases are written in JSON, and it uses the reflection support of Protocol Buffers (PB) to convert JSON test cases into the binary form required on the wire, so it is also very convenient to use.

(4) JMeter

Finally, there is JMeter. It is written in Java, provides a graphical interface, and supports distributed deployment, which can make up for its limited single-machine performance.

JMeter uses a thread-per-user model: each simulated user is driven by one thread. Once the number of threads grows large, thread scheduling and contention add extra CPU overhead, so its single-machine performance is limited.

2. What is TarsBenchmark?

TarsBenchmark is a benchmarking tool for the Tars ecosystem. It can be described from three angles.

(1) A single-machine load-testing tool

TarsBenchmark is a load-testing tool built on the Tars ecosystem, and it mainly serves Tars services. Tars is Tencent's open-source microservice framework, which is multi-language, high-performance, and easy to operate and maintain. Its communication protocol, also called Tars, is a binary transmission protocol similar to PB and was developed entirely by Tencent. The number of contributors is approaching 300, and you can visit TarsCloud on GitHub to read the official introduction to Tars.

Tars supports a variety of development languages, such as C++. Compared with other development frameworks such as PRPC, Tars also provides a service platform that helps developers and enterprises quickly build stable and reliable distributed microservice applications. Developers only need to focus on business logic, which improves R&D and operations efficiency.

(2) A cloud load-testing platform

TarsBenchmark not only runs on a single machine. It also provides a cloud-based web platform for benchmarking Tars services. The platform supports distributed load testing, so developers can easily evaluate the performance of Tars services.

(3) Not only for Tars services

TarsBenchmark can also accommodate teams whose services use non-Tars protocols. Later in this article I will show how to add support for a non-Tars protocol, so that the tool can load-test third-party protocols with little effort.

3. What problems have been solved?

So what problems does TarsBenchmark solve? Mainly the following three:

  • High performance: it makes full use of multi-core CPUs, and an 8-core machine can generate 400,000 TPS;

  • High scalability: it supports load testing any Tars interface, with friendly support for non-Tars protocols;

  • Simple and easy to use: test cases are written in JSON and are easy to write, and online cloud load testing is supported.

TarsBenchmark principles

1. Principles of the single-machine tool

(1) Multi-process

First, the high-performance design. As analyzed above, Wrk's main improvement over AB is its multi-threaded implementation. TarsBenchmark takes a similar approach, but with multiple processes.

The main process forks as many load-testing processes as the server has physical cores. Each load-testing process is completely isolated from the others and runs independently, which avoids contention for critical resources. By default, one process is forked per available CPU core.
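To make the one-process-per-core idea concrete, here is a minimal Python sketch. TarsBenchmark itself is written in C++, so this is only an illustration. The names `split_rate`, `worker`, and `run_workers` are hypothetical, and dividing the target rate evenly between processes is an assumption rather than documented behavior.

```python
import multiprocessing as mp
import os


def split_rate(total_qps, workers):
    """Divide a target rate as evenly as possible across worker processes."""
    base, extra = divmod(total_qps, workers)
    return [base + (1 if i < extra else 0) for i in range(workers)]


def worker(index, qps, results):
    # A real worker would run its own event loop here. This one only reports
    # back, so the sketch stays short. Workers share nothing except the queue.
    results.put((index, os.getpid(), qps))


def run_workers(total_qps, workers=None):
    # os.cpu_count() reports logical CPUs, which may exceed physical cores.
    workers = workers or os.cpu_count() or 1
    results = mp.Queue()
    procs = [
        mp.Process(target=worker, args=(i, qps, results))
        for i, qps in enumerate(split_rate(total_qps, workers))
    ]
    for proc in procs:
        proc.start()
    reports = [results.get() for _ in procs]  # drain the queue before joining
    for proc in procs:
        proc.join()
    return sorted(reports)
```

Calling `run_workers(400_000, workers=8)` from a script's `if __name__ == "__main__":` guard starts eight isolated processes, each with a 50,000 QPS share.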

(2) Network processing

On the network side, an event-driven model is used: packets are sent by a timer-driven sender, and sending and receiving are triggered by network events, which effectively avoids blocking on network I/O.
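One way a timer-driven sender can keep a uniform rate is to compute, on every tick, how many packets are owed so far and send only the difference. The sketch below assumes that approach. The function name `packets_due` and the 10 ms tick are illustrative choices, not taken from the TarsBenchmark source.

```python
def packets_due(qps, elapsed_us, already_sent):
    """How many packets a timer tick should send to stay on a uniform schedule."""
    if qps <= 0:
        raise ValueError("qps must be positive")
    # Integer arithmetic avoids accumulated floating-point drift.
    owed = elapsed_us * qps // 1_000_000  # packets owed since the test began
    return max(0, owed - already_sent)


# Simulate one second of 10 ms ticks at a target of 1000 QPS.
sent = 0
for tick in range(1, 101):
    sent += packets_due(1000, tick * 10_000, sent)
print(sent)
```

Because the count is derived from elapsed time rather than from responses, a slow server does not slow the sender down.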

In addition, connections are kept in a connection pool, and each connection is multiplexed. When we analyzed how AB works, we found that after AB establishes a connection and sends a request, it sends the next request only after the response has been received. The connection therefore spends time waiting, cannot be fully utilized, and cannot generate QPS at a uniform rate.

TarsBenchmark multiplexes connections and does not depend on whether the peer has responded. Consider how the server's response time affects the tool. When load testing with AB, if the server responds promptly, AB can achieve high output. But when the server's response time is long, AB's output drops by the same multiple.

For example, if a CGI returns in 20 ms in one case and in 200 ms in another, the same number of connections produces a ten-fold difference in output, so we consider this connection design wasteful.
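The ten-fold figure follows from simple arithmetic: when each connection carries one request at a time, the best possible rate is the number of connections divided by the response time. A short Python check, assuming 100 connections (the connection count is an illustrative assumption, not a number from the talk):

```python
def stop_and_wait_qps(connections, latency_ms):
    """Upper bound on QPS when each connection has one request in flight."""
    return connections * 1000 / latency_ms


fast = stop_and_wait_qps(100, 20)   # server answers in 20 ms
slow = stop_and_wait_qps(100, 200)  # server answers in 200 ms
print(fast, slow, fast / slow)      # 5000.0 500.0 10.0
```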

TarsBenchmark is designed with this in mind. Because each protocol packet carries a sequence number, the tool can decide when to send the next packet regardless of when responses arrive. When a response comes back, its sequence number is used to look up when the request was sent, and from that the latency and other statistics for the test are calculated.
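The bookkeeping described here can be sketched as a table from sequence number to send time. The following Python class is a simplified illustration under that assumption. `InflightTracker` and its method names are hypothetical, and timestamps are plain integers in milliseconds.

```python
class InflightTracker:
    """Match responses to requests by sequence number, in any arrival order."""

    def __init__(self, timeout_ms=3000):
        self.timeout_ms = timeout_ms
        self.next_seq = 0
        self.sent_at = {}      # seq -> send timestamp (ms)
        self.latencies = []
        self.timeouts = 0

    def on_send(self, now_ms):
        seq = self.next_seq
        self.next_seq += 1
        self.sent_at[seq] = now_ms
        return seq

    def on_response(self, seq, now_ms):
        started = self.sent_at.pop(seq, None)
        if started is None:
            return None  # unknown sequence number, or already timed out
        latency = now_ms - started
        self.latencies.append(latency)
        return latency

    def expire(self, now_ms):
        late = [s for s, t in self.sent_at.items() if now_ms - t >= self.timeout_ms]
        for seq in late:
            del self.sent_at[seq]
        self.timeouts += len(late)
        return len(late)
```

Since sending never waits on `on_response`, several requests can be in flight on one connection at once, which is what makes multiplexing possible.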

(3) Multi-dimensional monitoring

The load-testing processes exchange information with the main process through a lock-free queue. The latency statistics and error-code statistics mentioned above are all passed through this lock-free queue.

(4) Protocol extension

For protocols, we use a protocol proxy factory pattern. TarsBenchmark provides Tars protocol load testing by default. If you have a proprietary protocol, you can implement it by following the Tars protocol implementation and integrate it, and the proxy factory automatically recognizes the added protocol.
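A factory that "automatically recognizes" new protocols is commonly built as a registry that each implementation adds itself to. The Python sketch below shows that general pattern only. The registry, the decorator, and the class names are hypothetical and do not mirror the actual C++ code.

```python
PROTOCOLS = {}


def register_protocol(name):
    """Class decorator that records a protocol implementation under a name."""
    def decorator(cls):
        PROTOCOLS[name] = cls
        return cls
    return decorator


@register_protocol("tars")
class TarsProtocol:
    def describe(self):
        return "built-in Tars protocol"


@register_protocol("myproto")
class MyProtocol:
    def describe(self):
        return "a privately defined protocol"


def create_protocol(name):
    """Look up a protocol by the name given on the command line."""
    try:
        return PROTOCOLS[name]()
    except KeyError:
        raise ValueError(f"unknown protocol {name!r}; known: {sorted(PROTOCOLS)}") from None
```

Adding a protocol then means writing one class, with no change to the factory itself.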

(5) Support random data

TarsBenchmark supports random data generation for the Tars protocol, including random functions, so that a test does not have to replay one identical request throughout.

(6) Generate use cases automatically

For the Tars protocol, we also provide a tool that automatically generates the test cases required by a Tars service. TarsBenchmark test cases are in JSON format, so users can start a load test by simply editing the values.

2. Principles of distributed load testing

Distributed load testing mainly solves the problem of multiple people using the platform at the same time. It consists of four main components.

The first is the WebUI portal, which integrates with TarsWeb and provides a very simple set of interactions.

The second is the CGI layer, which mainly manages test cases, including database CRUD operations and permission control.

The third and fourth are backend services: an Admin service and a Node service (the load-testing node). The Node service executes the actual load-testing tasks. With the single-machine tool, tasks are destroyed automatically by the system once the tool finishes. In distributed load testing, the system relies on the Admin service to destroy tasks.

To reduce complexity, the distributed services are multi-threaded (Linux schedules CPU resources by thread). The Admin service accepts instructions from the CGI layer, such as the QPS entered by the user, and schedules the load-testing nodes that are needed. The smallest scheduling unit on a node is a thread, and a single thread can be set to an output capacity of 30,000 to 50,000 QPS. From this per-thread capacity the Admin service calculates the number of threads required, and then assigns them to the nodes to run the test.
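The scheduling arithmetic can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function `plan_threads`, the node names, and the first-fit assignment order are hypothetical, and 30,000 QPS per thread is the lower end of the range quoted above.

```python
def plan_threads(target_qps, free_threads, per_thread_qps=30_000):
    """Assign load-testing threads to nodes.

    free_threads maps a node name to the number of idle threads it offers.
    Returns a mapping of node name to threads assigned.
    """
    needed = -(-target_qps // per_thread_qps)  # integer ceiling division
    plan = {}
    for node, free in free_threads.items():
        if needed == 0:
            break
        take = min(free, needed)
        if take:
            plan[node] = take
            needed -= take
    if needed:
        raise RuntimeError(f"not enough capacity: {needed} more thread(s) required")
    return plan


print(plan_threads(200_000, {"node-a": 4, "node-b": 4}))
```

A 200,000 QPS request needs seven threads at 30,000 QPS each, so the sketch fills the first node and takes the remainder from the second.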

TarsBenchmark stores this data in the Admin service, which we generally recommend deploying in active/standby mode. Distributed test cases are written in JSON, and a backend tool can automatically generate a demo case for each interface, so test cases are very simple to write. TarsBenchmark also provides a test function that you can use to verify that the current function is working properly.

3. Protocol conversion

We know that Tars is a binary, extensible, cross-language protocol that supports C++, PHP, Java, and Node, and is essentially a TLV protocol. But if load-test cases are organized as JSON, how are they converted to binary? This is probably one of the hardest parts, so how does TarsBenchmark solve it?

First, let's look at how the Tars protocol encodes and decodes data. The T in TLV covers both Tag and Type. On the wire, the first four bits are the Tag, and they are followed by the Type of the data. The Tars protocol can represent up to 16 data types, 14 of which are in use so far. It was designed this way at the very beginning of Tars, and the data types we need today still fit within that encoding, which shows that the original design had a certain foresight.

There are seven basic types and three complex types. Take two nested structures as an example, with the binary result shown on the slide. Reading the encoding of structure B from the start: the first 0 is the Tag of its first member variable, a, and the Type that follows marks the beginning of a structure. Inside structure a, the first digit is also 0, a Tag whose Type corresponds to int data, and it is followed by four bytes of data: 1, 2, 3, and 4.

Next comes 16. The 1 is the Tag, corresponding to the variable s, and the 6 is its actual Type, a string. Its byte length is 4, followed by 61, 62, 63, and 64, which correspond to "abcd". The 0B that follows marks the end of the structure. After that comes 14, where 1 is the Tag and 4 represents the float data type, followed by four bytes of data. Deserializing these bytes restores the nested structure we started with.
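The byte sequence walked through above can be reproduced with a short Python sketch. The type codes are taken from my understanding of the Tars wire format and should be treated as assumptions to check against the official protocol documentation. The helper names are hypothetical, the float value 1.5 is an arbitrary choice because the talk does not give one, and real Tars encoders additionally shrink integers to the smallest type that fits, which this sketch omits.

```python
import struct

# Type codes (assumed): int32, float, short string, struct begin, struct end.
INT32, FLOAT, STRING1, STRUCT_BEGIN, STRUCT_END = 2, 4, 6, 10, 11


def head(tag, type_code):
    """One head byte: Tag in the high four bits, Type in the low four bits."""
    if not 0 <= tag < 15:
        raise ValueError("this sketch only handles tags 0..14")
    return bytes([(tag << 4) | type_code])


def write_int32(tag, value):
    return head(tag, INT32) + struct.pack(">i", value)  # big-endian


def write_string(tag, text):
    data = text.encode("utf-8")
    if len(data) > 255:
        raise ValueError("this sketch only handles strings up to 255 bytes")
    return head(tag, STRING1) + bytes([len(data)]) + data


def write_float(tag, value):
    return head(tag, FLOAT) + struct.pack(">f", value)


def write_struct(tag, body):
    return head(tag, STRUCT_BEGIN) + body + head(0, STRUCT_END)


inner = write_int32(0, 0x01020304) + write_string(1, "abcd")
encoded = write_struct(0, inner) + write_float(1, 1.5)
print(encoded.hex(" "))
# 0a 02 01 02 03 04 16 04 61 62 63 64 0b 14 3f c0 00 00
```

The printed bytes line up with the walkthrough: 0a opens the structure, 16 04 61 62 63 64 is the string "abcd" at Tag 1, 0b closes the structure, and 14 introduces the float.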

If you had to hand-edit binary like this to load-test a Tars service, I think most people would break down. A simple structure might be manageable, but once a structure has a dozen or more members, or structures nested inside structures, most developers would go crazy, so editing binary directly is clearly not practical.

So what does TarsBenchmark do? From the Tars interface file, an IDL tool first generates a description file, which contains three parts for each field: Tag, Type, and Name. On the web platform these descriptions are stored in the database. A Tars RPC call also has a head layer in addition to the body, and the head carries four key pieces of information: the request ID, the service name, the function name, and the binary body of the parameters.

In the implementation, TarsBenchmark looks up each Name in the description file to obtain its Tag and Type, takes the Value from the test case, and writes it to the binary buffer using the Tars codec rules. This completes the JSON-to-Tars conversion.

The reverse conversion is similar. The binary is first decoded into Tars Tags and Types. Each Tag is then used to find the Type and Name in the description file, and the JSON is rebuilt from the Names and Values. That is the protocol conversion on the client side for one RPC call.
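The Name-to-Tag lookup in both directions can be sketched in Python as follows. The description entries, field names, and function names are hypothetical, and the sketch stops at (Tag, Type, Value) triples rather than producing real Tars bytes.

```python
import json

# Hypothetical description entries: (tag, type, name), one per field.
DESCRIPTION = [(0, "int32", "id"), (1, "string", "name"), (2, "float", "score")]


def json_to_fields(case_json, description=DESCRIPTION):
    """Resolve each JSON name to its Tag and Type, ready for binary encoding."""
    by_name = {name: (tag, typ) for tag, typ, name in description}
    fields = []
    for name, value in json.loads(case_json).items():
        if name not in by_name:
            raise KeyError(f"field {name!r} is not in the description file")
        tag, typ = by_name[name]
        fields.append((tag, typ, value))
    return sorted(fields, key=lambda field: field[0])  # encode in Tag order


def fields_to_json(fields, description=DESCRIPTION):
    """Reverse direction: map each decoded Tag back to its Name."""
    by_tag = {tag: name for tag, _typ, name in description}
    return json.dumps({by_tag[tag]: value for tag, _typ, value in fields})


case = '{"name": "abcd", "id": 7, "score": 1.5}'
fields = json_to_fields(case)
print(fields)
print(fields_to_json(fields))
```

The user only ever edits the JSON values, while Tags and Types come from the generated description.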

4. Load testing third-party services

If you use a third-party protocol, it is actually very simple: implementing four function interfaces is enough to support a non-Tars protocol.

These four interfaces are the initialization interface, the packet-boundary interface (a long-lived TCP connection delivers a stream, so the input function is where you determine where a response packet ends), the encoding interface, and the decoding interface.

The request-encoding interface has one very important parameter, the ID. The system generates an ID for each request, and if the protocol supports it, the response should carry that ID back.

In our Tars example, we fill in the HEAD with an ID. When the RPC response comes back with the same ID, TarsBenchmark calculates the request latency and success rate from the response. If the third-party service does not support returning the sequence number, return false from isSupportSeq(). The tool then falls back to unordered mode and cannot multiplex the connection.
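To show the shape of the four interfaces, here is a Python sketch for a toy length-prefixed protocol. Only isSupportSeq() is named in the talk. The other method names, their signatures, and the wire layout (4-byte length, 4-byte sequence number, payload) are assumptions made for illustration, and the real interfaces are C++.

```python
import struct
from abc import ABC, abstractmethod


class Protocol(ABC):
    @abstractmethod
    def initialize(self, params):
        """Read protocol-specific settings."""

    @abstractmethod
    def input(self, buffer):
        """Return the length of the first complete packet, or 0 if more bytes are needed."""

    @abstractmethod
    def encode(self, seq):
        """Build one request that carries the sequence number."""

    @abstractmethod
    def decode(self, packet):
        """Return (seq, payload) from one complete response packet."""

    def is_support_seq(self):
        return True  # False would force unordered, non-multiplexed mode


class LengthPrefixed(Protocol):
    """Toy protocol: 4-byte big-endian total length, 4-byte sequence, payload."""

    def initialize(self, params):
        self.payload = params.get("payload", "ping").encode("utf-8")

    def input(self, buffer):
        if len(buffer) < 4:
            return 0
        (total,) = struct.unpack(">I", buffer[:4])
        return total if len(buffer) >= total else 0

    def encode(self, seq):
        body = struct.pack(">I", seq) + self.payload
        return struct.pack(">I", 4 + len(body)) + body

    def decode(self, packet):
        (seq,) = struct.unpack(">I", packet[4:8])
        return seq, packet[8:]
```

The `input` method is what lets the tool cut a TCP byte stream into packets before handing each one to `decode`.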

That’s how TarsBenchmark is designed.

Using TarsBenchmark

1. Code directory structure

Source path:

Github.com/TarsCloud/T…


Let's take a look at TarsBenchmark's code structure. It is divided into four parts: tools, services, common modules, and resource files. Here we focus on the common module, which consists of three parts: protocol, network, and monitoring.

  • Protocol support: two protocols are supported by default, HTTP and Tars.

  • Monitoring: common load-testing data is held temporarily by the monitor, and the lock-free queue communication is also implemented here.

  • Network: event-driven, non-blocking Socket mode design.

2. Code compilation

Compilation has two main dependencies: TarsCpp, and TarsWeb if you need cloud load testing.

The compilation steps are also very simple. First clone the code, then create a temporary build directory and compile with cmake. This produces three programs: TB, NodeServer, and AdminServer. In most cases only the TB tool is needed, and it can be run directly on a single machine. When you need cloud load testing, run the install script to install everything into TarsWeb in one step.

There are a few main parameters here, one of which is the web host. We recommend deploying AdminServer together with the Tars base services, because it does not consume much CPU. NodeServer is different: we recommend deploying it separately, and it can scale horizontally.

3. Load testing with the single-machine tool

For single-machine tools, the following parameters can be adjusted based on actual conditions:

  • -s: the maximum QPS limit. If -s is not specified, the tool tries to probe the server's maximum performance.

  • -D and -P: the IP address and port number of the target server. Multiple IP addresses can be specified for the target server, separated by commas (,).

  • -n: the number of load-testing processes. -t specifies TCP or UDP, and the default is TCP.

  • -p: the communication protocol of the interface. For a private protocol, change it to the name of that protocol.

During a load test, statistics such as the success rate, the average latency, and the P90 and P99 latency distribution are printed periodically, so you are never left guessing about the current state of the test. The duration of the test can be specified with the -i parameter.
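The periodic statistics can be illustrated with a small Python sketch. It uses the nearest-rank definition of a percentile, which is one common choice and not necessarily the one TarsBenchmark uses. The function names and the shape of the returned report are hypothetical.

```python
def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list (pct is an integer)."""
    rank = max(1, (pct * len(sorted_values) + 99) // 100)  # integer ceiling
    return sorted_values[rank - 1]


def summarize(latencies_ms, failures=0):
    """Summarize one reporting interval; latencies_ms must be non-empty."""
    ordered = sorted(latencies_ms)
    total = len(ordered) + failures
    return {
        "success_rate": len(ordered) / total,
        "avg_ms": sum(ordered) / len(ordered),
        "p90_ms": percentile(ordered, 90),
        "p99_ms": percentile(ordered, 99),
    }


report = summarize(list(range(1, 101)), failures=25)
print(report)
```

With latencies of 1 to 100 ms and 25 failed requests, the sketch reports a success rate of 0.8, an average of 50.5 ms, a P90 of 90 ms, and a P99 of 99 ms.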

4. Distributed load testing in the cloud

Cloud load testing is very easy to use. The interface debugging page in TarsWeb has a load-testing entry where you can see the test cases, which are generated automatically when you add them. To run a test, you only need to specify the target IP and the QPS. Tests can be started anytime and anywhere, and you can watch the service's performance while the test is running.

TarsBenchmark results

1. Development history

We open-sourced it inside Tencent in 2016. At first it was just a tool and supported few protocols, mainly Tars, which was known internally as the TAF protocol.

Since 2018, the range of supported protocols has grown considerably. Even very old services can be supported by this tool, and it is still widely used internally. It also supports cloud-based web load testing, which is used by hundreds of people every week. You do not need to log in to an IDC machine, and you can start a test anytime and anywhere.

In April this year, we contributed it to the open-source community under TarsCloud for the benefit of community partners. Your comments are welcome.

2. Comparison with other tools

Feedback so far has been very good. Compared with other open-source tools, TarsBenchmark is written in C++ and has two models: the single-machine tool uses a multi-process model, while cloud load testing uses a thread-pool model. It supports distributed load testing, and its single-machine performance also compares well with AB and Wrk.

Most importantly, TarsBenchmark can send packets at a uniform rate, because it multiplexes connections and does not wait for responses before sending the next request. It also supports random content generation and online load testing, which are important features of TarsBenchmark.

That's all for today's sharing. You are welcome to contribute to the ecosystem. In addition, if you are interested in positions on our team, you can send your resume directly to my email: [email protected].

Q&A

Q: What is the difference between Wrk and TarsBenchmark? How should I choose between them?

**A:** This is a good question. As we said, Wrk's network model is broadly the same as TarsBenchmark's: both combine parallel workers with event-driven networking. However, Wrk is a load-testing tool purely for the HTTP protocol. HTTP can be understood as an unordered mode. It cannot multiplex a connection, which constrains performance, and it cannot send at a uniform rate.

On the other hand, because Wrk is purely an HTTP tool, it does not support proprietary protocols, and using it for them is a struggle. TarsBenchmark does a better job on protocol extension.

Q: How much code does TarsBenchmark have now?

**A:** It is still very lightweight, with a few thousand lines of effective code.

As I recall, it is between 3,000 and 4,000 lines, so it is a project with very little code and a gentle learning curve. It is not too complicated. As long as you have some basic knowledge of C++, you can pick it up quickly.

Q: What is the difference between TarsBenchmark and TarsJMeter?

**A:** As mentioned during the warm-up, JMeter uses a thread model to simulate users, with one thread executing one user's requests and responses. The thread-scheduling overhead is very large. Imagine ten thousand threads on one machine, all depending on the operating system's scheduler. A model like JMeter's will therefore not perform well.

Of course, some people have built extensions on JMeter, because JMeter's greatest strength is its very good extensibility. It also supports distributed load testing, which is a relatively big advantage.

In addition, its reporting is plugin-based, which is quite advanced. For example, results can be reported to other UI tools for presentation. This is a great advantage of JMeter.

As for TarsBenchmark, in summary, it performs better on a single machine.

Q: Why not use Golang?

**A:** There is some background to this. We first built this product in 2015 and 2016, when Tars was mainly C++.

In our comparison, C++ also performed slightly better than Golang's network model, mainly because encoding and decoding were a little more efficient.

Q: What interfaces are supported by Tars? Is HTTP supported?

**A:** Yes, it is supported. With the latest version of the Tars framework you can write your own HTTP interface, and TarsBenchmark itself supports HTTP benchmarking.

Q: Can it load-test Kafka?

**A:** Kafka comes with its own benchmarking tools, and I personally recommend using a component's own benchmarking tools when they meet your needs. Of course, if you are interested in TarsBenchmark, you can implement the four functions I described earlier and load-test Kafka that way.

Q: Why is it so efficient? Can it be improved?

**A:** As I introduced earlier, there are four main reasons. First, it is multi-process, which makes full use of the CPU. Second, it uses an event-driven network model. Third, connections are multiplexed, so they do not sit idle. Fourth, communication is non-blocking.

Is there room for improvement? We have found that decoding complex Tars interfaces still has some efficiency problems. So if there is one thing to improve, it is decoding. We want to apply zero-copy ideas, keeping buffer copies between the operating system and the load-testing tool to a minimum, and to use more efficient codecs when converting to the Tars binary protocol. We are working on further improvements now.

Q: I have seen related products and solutions on Tencent Cloud. Will TarsBenchmark be added to Tencent Cloud in the future?

**A:** Currently it is mainly co-built and provided as open source, and there is no plan to put it on the cloud. If you adopt the Tars ecosystem, it is available for free to a large number of users, including enterprises and individuals.

Q: Is it C++11?

**A:** That is correct, it is written to the C++11 standard. In addition, I have been asked by TarsCloud to mention that if you are interested in Tars, you can follow its official website. The source code address is in the slides shown earlier. The last slide is a directory of TarsCloud's capabilities and their parameters, where you can find suitable material to study or experiment with.

Q: Can big data products be load-tested?

**A:** By big data products, do you mean load testing something like IDP? Big data products of this kind have their own communication protocols. If you know the rules of the protocol, or understand how it is designed, then load testing it is also possible.