This article was first published on the WeChat official account “code farmer Mr. Wu”. Follow it to get more technical articles as soon as they come out ~

Traffic replication is often used for testing in a pre-production (staging) environment: online traffic is copied to a pre-production service to test how new features and services hold up under load. It faithfully reproduces online traffic, so complex business scenarios can be tested against real requests without side effects on the production service.

For complex traffic replication scenarios and requirements, you can build a complete replication architecture; the ByteCopy project developed by ByteDance is a good reference. For simpler needs, open source tools are usually enough. There are many open source traffic replication tools; commonly used ones include goreplay, tcpreplay, and tcpcopy.

This article explores how to set up the tcpcopy and goreplay schemes. Enough preamble, let's get started.

Implementing the tcpcopy scheme

About tcpcopy

tcpcopy was developed by Wang Bin of the NetEase technology department and open-sourced in September 2011. The latest tcpcopy architecture is as follows (from the author's blog: blog.csdn.net/wangbin579/…):

tcpcopy consists of two main components: the tcpcopy client and intercept. The client copies traffic and forwards it, while intercept captures the response traffic and handles the TCP connection on tcpcopy's behalf.

Setting up tcpcopy

The example environment is as follows; the rest of this section walks through building the whole architecture:

  • 192.168.33.11: production server, where the tcpcopy client is deployed
  • 192.168.33.12: auxiliary server, where intercept is deployed
  • 192.168.33.13: test server

Each component can be installed by downloading its source package from GitHub, then compiling and installing it:

    # tcpcopy client, 192.168.33.11
    wget https://github.com/session-replay-tools/tcpcopy/archive/1.0.0.tar.gz
    tar xvf 1.0.0.tar.gz
    cd tcpcopy-1.0.0
    ./configure --prefix=/opt/tcpcopy
    make
    make install

    # intercept, auxiliary server 192.168.33.12
    # libpcap is the library used to capture packets
    yum -y install libpcap-devel
    # ubuntu
    # apt install -y libpcap-dev
    wget https://github.com/session-replay-tools/intercept/archive/1.0.0.tar.gz
    tar xvf 1.0.0.tar.gz
    cd intercept-1.0.0
    ./configure --prefix=/opt/tcpcopy/
    make
    make install

After installing intercept, start intercept by running the following command:

    /opt/tcpcopy/sbin/intercept -i enp0s8 -F 'tcp and src port 8000' -d
    # -i, the network adapter to listen on, here enp0s8
    # -F, the filter; the syntax is the same as in pcap-based capture tools such as tcpdump
    # -d, run as a daemon
    # Other parameters can be viewed with -h.

Once intercept is running, start the tcpcopy client. tcpcopy depends on intercept, so make sure intercept has started successfully first.

    /opt/tcpcopy/sbin/tcpcopy -x 8000-192.168.33.13:8000 -s 192.168.33.12 -c 192.168.1.x -n 2 -d
    # -x, copy the traffic of local port 8000 to port 8000 of the machine 192.168.33.13
    # -s, address of the auxiliary server running intercept
    # -c, rewrite the source address of the copied packets to an address in this segment
    #     (an explicit IP also works); the fake segment disguises the packets so that
    #     intercept can hijack the return route
    # -n, traffic multiplier
    # -d, run as a daemon

Finally, add the interception route on the test server:

    # test server, 192.168.33.13
    route add -net 192.168.1.0 netmask 255.255.255.0 gw 192.168.33.12

This route sends packets destined for network segment 192.168.1.0 through the gateway 192.168.33.12, so that the auxiliary server can intercept the packets returned by the test server.
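
To make the effect of that route concrete, here is a minimal Python sketch of the decision the test server makes for each reply. It only models the single static route from this example; the function name next_hop and the "default route" label are made up for illustration and are not part of tcpcopy.

```python
import ipaddress

# Values taken from the example environment in this article.
FAKE_SEGMENT = ipaddress.ip_network("192.168.1.0/24")  # netmask 255.255.255.0
INTERCEPT_GATEWAY = ipaddress.ip_address("192.168.33.12")


def next_hop(reply_destination: str) -> str:
    """Return where the test server sends a reply for the given destination."""
    destination = ipaddress.ip_address(reply_destination)
    if destination in FAKE_SEGMENT:
        # Matches the static route: hand the packet to the intercept machine.
        return str(INTERCEPT_GATEWAY)
    # Everything else follows the normal routing table (not modeled here).
    return "default route"


print(next_hop("192.168.1.1"))   # reply to a copied request
print(next_hop("192.168.33.1"))  # reply to a real client
```

Replies addressed to the fake segment are the only ones diverted, which is why ordinary traffic to the test server is unaffected by the extra route.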

That completes the deployment of the whole tcpcopy architecture.

Packet flow analysis

Let’s capture packets and see how they flow in the process.

On both the tcpcopy client 192.168.33.11 and the test server 192.168.33.13, start a service on port 8000 with python -m SimpleHTTPServer 8000. Then send a request from my host 192.168.33.1 and capture packets on all three machines.
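
The SimpleHTTPServer module only exists in Python 2. If the machines only have Python 3, the same kind of throwaway test service can be started from the http.server module. The sketch below is one possible way to do that; the helper names start_test_server and fetch_status are hypothetical, and it binds to 127.0.0.1 on an ephemeral port by default so it can be tried safely.

```python
import http.client
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


def start_test_server(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Serve the current directory over HTTP in a background thread."""
    server = HTTPServer((host, port), SimpleHTTPRequestHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(server: HTTPServer) -> int:
    """Send one GET / to the server and return the HTTP status code."""
    host, port = server.server_address[:2]
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", "/")
        return connection.getresponse().status
    finally:
        connection.close()
```

For the setup in this article you would call start_test_server("0.0.0.0", 8000) on each machine instead of relying on the defaults.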

Packet capture on the tcpcopy client 192.168.33.11:

The red block shows the normal exchange between my machine (192.168.33.1) and the tcpcopy client machine (192.168.33.11): the three-way handshake, the HTTP request, and finally the disconnection.

The blue block is the traffic copied by tcpcopy. To let intercept capture the returned packets, tcpcopy has replaced the source IP address of each packet with an address in the fake network segment we specified (192.168.1.0). When the test server replies, the route on the test server directs the reply to the auxiliary intercept server, so production traffic is not affected. This is also why the replicated traffic shows no reply packets for the three-way handshake or the HTTP request.

Test server 192.168.33.13:

On the test server, the packets look like normal traffic: a three-way handshake, the HTTP request, and then the disconnection. The source IP address talking to the test server 192.168.33.13 has been replaced by tcpcopy with the fake IP address 192.168.1.1.

Intercept 192.168.33.12:

In the capture on the auxiliary server, block 1 is the reply packet of the three-way handshake for the replicated traffic, and block 2 is the reply to the HTTP request; capturing these is the interception function of intercept. After blocks 1 and 2, the auxiliary server (192.168.33.12) and the tcpcopy server (192.168.33.11) exchange data. This is the TCP processing function of intercept: it returns the necessary information to tcpcopy so that the TCP connection between tcpcopy and the test machine can complete.

The packet captures above give us a packet flow that matches the architecture diagram. To summarize:

  • Production traffic is requested as usual, and the production service responds normally.
  • The tcpcopy service copies the traffic on the production machine, changes the source IP address of each packet to an address in the fake network segment we specified (with the -c parameter), and forwards the traffic to the test server.
  • The test server receives the traffic, but the source IP address of the packets lies in the fake network segment. When it replies, the traffic is diverted to the auxiliary server by the route configured in advance.
  • The auxiliary server receives the packets from the test server but does not forward them. Instead, it unpacks them and returns only the necessary information to tcpcopy, which completes the TCP interaction between tcpcopy and the test server.
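
The four steps above can be modeled in a few lines of Python. This is a simplified illustration, not how tcpcopy is implemented: the Packet class, the function names, and the contents of the summary returned by intercept_summary are all assumptions made for the sketch, and the fake source address 192.168.1.1 is the one seen in the capture.

```python
from dataclasses import dataclass, replace

FAKE_SOURCE_IP = "192.168.1.1"   # an address inside the fake segment given by -c
TEST_SERVER_IP = "192.168.33.13"


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes = b""


def copy_for_test_server(original: Packet) -> Packet:
    """Step 2: duplicate a production packet, fake its source IP, retarget it."""
    return replace(original, src_ip=FAKE_SOURCE_IP, dst_ip=TEST_SERVER_IP)


def reply_from_test_server(request: Packet, payload: bytes) -> Packet:
    """Step 3: the test server answers whatever source address it saw."""
    return Packet(request.dst_ip, request.dst_port,
                  request.src_ip, request.src_port, payload)


def intercept_summary(reply: Packet) -> dict:
    """Step 4: keep connection metadata only; the response body is dropped."""
    return {"client": (reply.dst_ip, reply.dst_port),
            "payload_length": len(reply.payload)}


original = Packet("192.168.33.1", 50000, "192.168.33.11", 8000, b"GET / HTTP/1.1\r\n\r\n")
copied = copy_for_test_server(original)
reply = reply_from_test_server(copied, b"HTTP/1.0 200 OK\r\n\r\n")
print(copied)
print(intercept_summary(reply))
```

Note that only the source IP changes in the copy; the client's source port is kept, which matters for the comparison with gor later in the article.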

According to the official documentation, there are a few other things to note:

  • The auxiliary server must not forward packets, so turn off the kernel parameter ip_forward.
  • During testing, filter the upstream traffic and isolate the test data sources, so that production data is not operated on more than once.
  • tcpcopy also supports offline replay. For details, refer to the documentation.
  • The auxiliary machine needs to be in the same network segment as the test machine, so that it can act as the gateway of the fake segment. This restriction can be removed by adding a proxy: for example, if you put Nginx in front of the test servers as a relay and add the fake route on the Nginx server, the test machines only need to be registered with Nginx and need no other configuration.
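
For the first note, a quick way to confirm that forwarding is off on the auxiliary server is to read the kernel setting directly. The sketch below assumes a Linux machine, where the value is exposed at /proc/sys/net/ipv4/ip_forward; the function name ip_forward_enabled is made up for this example.

```python
from pathlib import Path

# Standard Linux location of the IPv4 forwarding switch.
IP_FORWARD_PATH = Path("/proc/sys/net/ipv4/ip_forward")


def ip_forward_enabled(path: Path = IP_FORWARD_PATH) -> bool:
    """Return True when the file contains "1" (forwarding on), else False."""
    return path.read_text().strip() == "1"
```

On the auxiliary server, ip_forward_enabled() should return False before intercept is started.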

Implementing the goreplay scheme

About goreplay

goreplay is another popular open source traffic replication tool. Its architecture is simpler than tcpcopy's, with only one component, gor, as follows:

You only need to start one gor process on the production server; it does all the work, including listening, filtering, and forwarding. Its design follows the Unix philosophy: everything is made of pipes, and the various inputs multiplex their data to the outputs.
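
The pipe idea can be pictured with a small Python sketch: one input produces captured messages, and every message is handed to each configured output. This only illustrates the concept and is not gor's actual implementation; run_pipeline and the sample messages are hypothetical.

```python
from typing import Callable, Iterable, List

Output = Callable[[bytes], None]


def run_pipeline(source: Iterable[bytes], outputs: List[Output]) -> int:
    """Feed every message from one input to all outputs; return the count."""
    count = 0
    for message in source:
        for output in outputs:
            output(message)
        count += 1
    return count


captured = [b"GET /a HTTP/1.1", b"GET /b HTTP/1.1"]   # stands in for an input
recorded: List[bytes] = []                            # stands in for a file output
run_pipeline(captured, [recorded.append, lambda m: print(m.decode())])
```

Adding another destination is just a matter of appending one more output to the list, which mirrors how several --output-* options can be combined on one gor command line.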

Inputs and outputs are commonly referred to as plug-ins; the common ones are listed below.

Available inputs:

  • --input-raw: captures HTTP traffic; you specify the IP address or interface and the application port.
  • --input-file: reads a file previously written by --output-file; used for offline traffic replay.
  • --input-tcp: used by a gor aggregator instance when you forward traffic from multiple forwarder gor instances to it.

Available outputs:

  • --output-http: replays HTTP traffic to a given endpoint; accepts a base URL.
  • --output-file: records incoming traffic to a file.
  • --output-tcp: forwards incoming data to another gor instance; pair it with --input-tcp on that instance.
  • --output-stdout: for debugging; prints all data to stdout.

You can rate-limit, filter, and rewrite the traffic. You can also use middleware to implement custom processing logic, such as filtering out private data or handling authentication.

Other common parameters:

  • --output-http "staging.com|10%": limits the output to 10% of the traffic (without the "%", the number is an absolute limit in requests per second).
  • --http-allow-method: filters by request method.
  • --http-allow-url: a URL whitelist; requests that do not match are dropped.
  • --http-disallow-url: the opposite, a URL blacklist; matching requests are dropped and all others are captured.
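
The way these options combine can be sketched in Python as follows. This is an illustration of the filtering rules described above, not gor's code: should_forward and within_percentage are hypothetical names, URL patterns are treated as regular expressions, and the percentage limiter is a deterministic stand-in chosen so the result is easy to check.

```python
import re
from typing import List, Optional


def should_forward(
    method: str,
    path: str,
    allow_methods: Optional[List[str]] = None,  # upper-case names, e.g. ["GET"]
    allow_url: Optional[str] = None,            # whitelist pattern
    disallow_url: Optional[str] = None,         # blacklist pattern
) -> bool:
    """Return True if a request passes every configured filter."""
    if allow_methods and method.upper() not in allow_methods:
        return False
    if allow_url and not re.search(allow_url, path):
        return False
    if disallow_url and re.search(disallow_url, path):
        return False
    return True


def within_percentage(request_index: int, percent: int) -> bool:
    """Let through `percent` requests out of every 100, by position."""
    return request_index % 100 < percent


print(should_forward("GET", "/api/users", allow_methods=["GET"], allow_url=r"^/api"))
print(should_forward("POST", "/api/users", allow_methods=["GET"]))
```

A request is forwarded only if it passes all filters and also falls inside the rate limit, so the two functions would be combined with a logical and.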

This article does not go into middleware and only covers the common features. If you need middleware, refer to the middleware documentation.

Setting up goreplay

goreplay is written in Go. You can either use the binaries prebuilt for each system or compile it yourself. Here we use the prebuilt binary.

    wget https://github.com/buger/goreplay/releases/download/v1.3.0_RC1/gor_1.3_RC1_x64.tar.gz
    tar zxvf gor_1.3_RC1_x64.tar.gz
    # the archive extracts to a single binary file: gor

Next, start gor directly to replicate and forward the traffic.

    sudo ./gor --input-raw :8000 --output-http="http://192.168.33.13:8001"

This copies the traffic of local port 8000 to the remote HTTP service http://192.168.33.13:8001. (Traffic is duplicated when the same port is used. This is a gor bug that can still be reproduced in version 1.3; see issue292.)

The traffic goreplay forwards is not raw TCP packets; gor reassembles the HTTP requests and sends them to the test server. The exchange with the test server is therefore handled by a new gor thread, independent of the listening thread, so there is no need to intercept the return traffic.
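
To show what "reassembling the HTTP request" means, here is a minimal Python sketch that parses the raw bytes of a captured request and rebuilds it for a different target. It is a simplified illustration and not gor's implementation: parse_request and rebuild_for are hypothetical names, header names are lower-cased for simplicity, and rewriting the Host header to the new target is an assumption of this sketch.

```python
from typing import Dict, Tuple


def parse_request(raw: bytes) -> Tuple[str, str, Dict[str, str], bytes]:
    """Split raw HTTP request bytes into method, target, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, headers, body


def rebuild_for(raw: bytes, new_host: str) -> bytes:
    """Build a fresh request for new_host from a captured request."""
    method, target, headers, body = parse_request(raw)
    headers["host"] = new_host  # assumption: point the request at the test server
    header_lines = "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    request_head = f"{method} {target} HTTP/1.1\r\n{header_lines}\r\n"
    return request_head.encode("iso-8859-1") + body


captured = b"GET /index.html HTTP/1.1\r\nHost: 192.168.33.11:8000\r\nAccept: */*\r\n\r\n"
print(rebuild_for(captured, "192.168.33.13:8001").decode())
```

Because the rebuilt request is sent over a brand-new connection, the reply comes back on that connection and never touches the client's original one.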

Packet flow analysis

Let’s take a look at how the packets copied by gor flow:

The red block indicates normal traffic, and the blue block indicates replication traffic.

Seeing this, you might wonder: why doesn’t gor need to intercept the return traffic?

When tcpcopy connects the production machine to the test machine, it changes the source IP address of the TCP packets but keeps the port the client used for its request: it replicates traffic at the TCP packet level. gor, on the other hand, does not strictly copy packets; it reconstructs the HTTP request and connects to the test machine from a new port. When the test machine replies, the packets do go back to the production machine, but they arrive on a different port from the one the client uses, so production traffic is not affected.

Conclusion

At this point, we have covered the basic concepts and uses of traffic replication, along with two open source tools, tcpcopy and goreplay. Each tool has its own strengths and weaknesses, so let’s summarize them.

  • The tcpcopy deployment architecture is relatively complex, while goreplay is simple and needs only one process.
  • tcpcopy supports a variety of protocols, while goreplay supports only HTTP.
  • Both tcpcopy and goreplay support online replication as well as offline recording and playback.

For simple HTTP replication, goreplay does the job; for more complex scenarios, tcpcopy is recommended. For even more complex requirements or higher traffic volumes, the only option is to build a custom solution.

Well, that’s the end of this article. Feel free to leave a comment about the traffic replication approach you think works best.

I’m DeanWu, a person trying to be a real SRE.
