Learn how Whistle works and how to implement a simple packet capture debugging tool.

GitHub address: github.com/avwo/whistl…

Whistle is a cross-platform web packet capture and debugging proxy (an HTTP proxy) based on Node.js. Its main features are:

  1. Real-time packet capture: supports capturing common web requests such as HTTP, HTTPS, HTTP/2, WebSocket, and TCP.
  2. Modifying requests and responses: unlike common packet capture tools, which rely on breakpoints, Whistle uses configuration rules similar to the system hosts file.
  3. Extensibility: Whistle can be extended in two ways, by writing plug-ins in Node or by pulling it into a project as a standalone NPM package.

This article walks you through the Whistle functionality step by step, starting with the basics:

  1. What is an HTTP proxy
  2. Implement a simple HTTP proxy
  3. Complete HTTP Proxy Architecture (Whistle)
  4. Specific implementation principle
  5. Reference materials

1. What is an HTTP proxy

A proxy is a relay service between a client and a server. Compare the two cases:

  1. Without a proxy: the client and the server establish a direct connection and then exchange data.
  2. With a proxy: the client does not connect to the server directly. Instead, it connects to the proxy and sends it the address of the target server, and the proxy then establishes the connection to the server on the client's behalf. If the proxy service is an HTTP server, it is called an HTTP proxy.

Let’s look at how the client passes the destination server address to the HTTP proxy and how the HTTP proxy establishes a connection to the destination server.

2. Implement a simple HTTP proxy

Let’s start with the simplest HTTP proxy implemented in Node.js:

    const http = require('http');
    const { connect } = require('net');

    /****************** utility methods ******************/
    // Split "host:port" into its parts, falling back to defaultPort (or 80)
    const getHostPort = (host, defaultPort) => {
      let port = defaultPort || 80;
      const index = host.indexOf(':');
      if (index !== -1) {
        port = host.substring(index + 1);
        host = host.substring(0, index);
      }
      return { host, port };
    };

    // Build the options for the request to the target server
    // (simplified; see Whistle for the complete implementation)
    const getOptions = (req, defaultPort) => {
      const { host, port } = getHostPort(req.headers.host, defaultPort);
      return {
        hostname: host, // request domain name: resolved to the server IP via DNS and used for the Host request header
        port, // server port
        path: req.url || '/',
        method: req.method,
        headers: req.headers,
        rejectUnauthorized: false, // ignored automatically for plain HTTP requests
      };
    };

    // Simplified error handling (see Whistle for the complete implementation):
    // when one side fails or closes, destroy both sides
    const handleClose = (req, res) => {
      const destroy = (err) => {
        req.destroy();
        res && res.destroy();
      };
      res && res.on('error', destroy);
      req.on('error', destroy);
      req.once('close', destroy);
    };

    /****************** service code ******************/
    const server = http.createServer();

    // Ordinary HTTP requests
    server.on('request', (req, res) => {
      // Establish a connection with the server and forward the request
      const client = http.request(getOptions(req), (svrRes) => {
        res.writeHead(svrRes.statusCode, svrRes.headers);
        svrRes.pipe(res);
      });
      req.pipe(client);
      handleClose(res, client);
    });

    // Tunnel proxy: HTTPS, HTTP/2, WebSocket, TCP, etc.
    server.on('connect', (req, socket) => {
      // Establish a connection with the server, then pipe data in both directions
      const client = connect(getHostPort(req.url), () => {
        socket.write('HTTP/1.1 200 Connection Established\r\n\r\n');
        socket.pipe(client).pipe(socket);
      });
      handleClose(socket, client);
    });

    server.listen(8080);

The code above implements an HTTP proxy that forwards requests. As the code shows, an HTTP proxy is an ordinary HTTP server that listens for two events, request and connect. The client passes the address of the target server through these two events:

  1. request: ordinary HTTP requests pass the address of the target server through this event.
  2. connect: non-HTTP requests, such as HTTPS, HTTP/2, WebSocket, and TCP, pass the address of the target server through this event. A proxy request that triggers this event is also called a tunnel proxy request.
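
To make the two events more concrete, the sketch below shows roughly what the head of each kind of proxy request looks like on the wire and how the target address can be read from it. The helper name getTargetAddress and the sample hosts are made up for illustration; this is a simplified sketch, not code taken from Whistle.

```javascript
// Hypothetical helper: extract the target "host:port" from the head of a
// proxy request. An ordinary HTTP proxy request carries a Host header;
// a tunnel request uses the CONNECT method with "host:port" as its target.
const getTargetAddress = (rawHead) => {
  const lines = rawHead.split('\r\n');
  const [method, target] = lines[0].split(' ');
  if (method === 'CONNECT') {
    // Tunnel proxy: the request target is already "host:port"
    return { event: 'connect', address: target };
  }
  // Ordinary HTTP: use the Host header, defaulting to port 80
  const hostLine = lines.find((line) => /^host:/i.test(line)) || '';
  const host = hostLine.replace(/^host:\s*/i, '');
  return { event: 'request', address: host.includes(':') ? host : `${host}:80` };
};

const httpHead = 'GET http://example.com/index.html HTTP/1.1\r\nHost: example.com\r\n\r\n';
const tunnelHead = 'CONNECT example.com:443 HTTP/1.1\r\nHost: example.com:443\r\n\r\n';

console.log(getTargetAddress(httpHead));   // { event: 'request', address: 'example.com:80' }
console.log(getTargetAddress(tunnelHead)); // { event: 'connect', address: 'example.com:443' }
```

In a real Node server the same information is already parsed for you and arrives as req.url and req.headers.host in the corresponding event handler.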

Inside the event handler, the address of the target server (host:port) can be obtained from req.url or req.headers.host. The proxy then connects to that address and returns the result to the client as an HTTP response. Besides forwarding requests, a complete HTTP proxy should at least be able to:

  1. Show captured packets in real time;
  2. Parse HTTPS requests;
  3. Modify request and response content;
  4. Be extended with additional functionality.

Using Whistle as an example, we’ll see how to implement a complete HTTP proxy using Node.js.

3. Complete HTTP proxy architecture (Whistle)

Whistle is mainly divided into five modules:

  1. Request access module
  2. Tunnel proxy module
  3. HTTP request processing module
  4. Rule management module
  5. Plug-in management module

4. Specific implementation principle

Let’s take a look at how these five modules are implemented.

4.1 Request access module

All requests first pass through the request access module. Whistle supports four access modes:

  1. Direct HTTP and HTTPS requests: requests are directed to Whistle by configuring hosts or DNS.
  2. HTTP proxy: Whistle's default access mode, set up by configuring the system proxy or by configuring an HTTP proxy through a browser plug-in.
  3. HTTPS proxy: an HTTP proxy whose proxy requests are encrypted, that is, the proxy is an HTTPS server. A certificate can be specified, and the decrypted requests are handed to the HTTP proxy.
  4. SOCKS5 proxy: uses the NPM package socksv5 to turn a SOCKS5 request into an ordinary TCP request, and then converts the TCP request into a tunnel proxy request.

The basic principle is that every request is converted into either a tunnel proxy request or an HTTP request of the HTTP proxy, and tunnel proxy requests are then parsed into HTTP requests.

How to convert an ordinary TCP request into a tunnel proxy request? For details, see black-proxy
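
One way to picture this conversion: before any application data is sent, the connecting side writes a CONNECT head to the proxy and waits for a 2xx reply, after which the socket behaves like a direct TCP connection to the target. The sketch below illustrates that idea under stated assumptions; buildConnectHead and connectViaProxy are hypothetical names, and the code is not taken from Whistle or black-proxy.

```javascript
const net = require('net');

// Hypothetical helper: the head that turns a raw TCP connection into a
// tunnel proxy (CONNECT) request for the given target.
const buildConnectHead = (host, port) =>
  `CONNECT ${host}:${port} HTTP/1.1\r\nHost: ${host}:${port}\r\n\r\n`;

// Hypothetical helper: open a TCP socket to the proxy, ask it for a tunnel,
// and hand the socket to the caller once the proxy answers with a 2xx status.
const connectViaProxy = (proxy, target, callback) => {
  let settled = false;
  const finish = (err, socket) => {
    if (settled) return;
    settled = true;
    callback(err, socket);
  };
  const socket = net.connect(proxy.port, proxy.host, () => {
    socket.write(buildConnectHead(target.host, target.port));
  });
  socket.once('data', (chunk) => {
    // Simplification: assumes the whole reply head arrives in the first chunk
    if (/^HTTP\/1\.\d 2\d\d/.test(chunk.toString())) {
      finish(null, socket); // caller should attach its own error handlers
    } else {
      socket.destroy();
      finish(new Error('Proxy refused the tunnel'));
    }
  });
  socket.once('error', finish);
  return socket;
};

console.log(JSON.stringify(buildConnectHead('example.com', 443)));
```

With the simple proxy from section 2 listening on port 8080, a call such as connectViaProxy({ host: '127.0.0.1', port: 8080 }, { host: 'example.com', port: 443 }, callback) would trigger its connect event.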

Let’s look at how HTTP requests can be resolved from tunnel proxy requests.

4.2 Tunnel proxy module

Key points (HTTP requests can also go through a tunnel proxy):

  1. Match the global rules to decide whether the tunnel proxy request should be parsed. If not, it is treated as an ordinary TCP request.

  2. If it should be parsed, read the first frame of the request data with socket.once('data', handler).

  3. Convert the first frame to a string and test it against the regular expression /^(\w+)\s+(\S+)\s+HTTP\/1\.\d$/mi to decide whether it is an HTTP request. If it is, check whether it is a CONNECT request, that is, another tunnel proxy request (a tunnel can itself carry a tunnel proxy request). If so, the request is handed back to the tunnel proxy handler; otherwise it is handed to the HTTP request processing module.

  4. If it is not an HTTP request, it is treated as an HTTPS request, and a man-in-the-middle approach is used to convert the HTTPS request into an HTTP request.

  5. Whistle first obtains the certificate for the request, trying the following sources in order:

    • A matching plug-in (the rule sniCallback://plugin specifies the plug-in that loads the certificate);
    • Custom certificates loaded from the directory given by the startup parameter -z certDir, or from ~/.WhistleAppData/custom_certs;
    • If neither provides a certificate, Whistle automatically generates a default one.
  6. Once the certificate is obtained, Whistle starts an HTTPS server with it to convert the HTTPS request into an HTTP request, which is then handed to the HTTP request processing module.
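
The dispatch described in steps 2 to 4 can be sketched as a small function that inspects the first chunk read from the socket. The regular expression is the one from step 3; the helper name classifyFirstChunk and the returned type labels are hypothetical, and real traffic would need more care (for example, a request head split across several chunks).

```javascript
// Matches an HTTP/1.x request line such as "GET /path HTTP/1.1"
const HTTP_RE = /^(\w+)\s+(\S+)\s+HTTP\/1\.\d$/mi;

// Hypothetical helper: decide which module should handle a tunnelled
// connection, based on the first chunk read from the socket.
const classifyFirstChunk = (chunk) => {
  const match = HTTP_RE.exec(chunk.toString());
  if (!match) {
    // Not HTTP: per the steps above, treat it as HTTPS
    return { type: 'https' };
  }
  const [, method, url] = match;
  return method.toUpperCase() === 'CONNECT'
    ? { type: 'tunnel', method, url } // a tunnel request inside the tunnel
    : { type: 'http', method, url };  // plain HTTP inside the tunnel
};

const samples = [
  Buffer.from('GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n'),
  Buffer.from('CONNECT example.com:443 HTTP/1.1\r\n\r\n'),
  Buffer.from([0x16, 0x03, 0x01, 0x02, 0x00]), // starts like a TLS handshake record
];
samples.forEach((chunk) => console.log(classifyFirstChunk(chunk).type));
```

In a real proxy this function would be called from the socket.once('data', handler) callback, and the chunk would have to be replayed to whichever module takes over the connection.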

4.3 HTTP request processing module

HTTP request processing can be divided into two phases:

  1. Request stage:

    • Match the global rules;
    • If a matched rule uses a whistle.xxx protocol, execute the corresponding plug-in hook to obtain the plug-in rules and merge them with the matched global rules;
    • Execute the rules, record the state, and send the request to the specified service.
  2. Response stage:

    • Execute the matched plug-in hook to obtain the plug-in rules and merge them with the matched global rules;
    • Execute the rules, record the state, and return the response to the client.
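
The "match, ask the plug-in, merge" sequence of the request stage can be sketched as follows. Everything here is hypothetical and exists only to mirror the steps above: the rule shape { pattern, operation }, the hook signature, the substring matching, and the merge order are assumptions of this sketch, not Whistle's actual data structures.

```javascript
// A whistle.xxx operation hands the request to the plug-in named "xxx"
const PLUGIN_PROTOCOL_RE = /^whistle\.([\w-]+):\/\//;

const collectRules = (url, globalRules, pluginHooks) => {
  // Step 1: match the global rules (substring match as a stand-in)
  const matched = globalRules.filter((rule) => url.includes(rule.pattern));

  // Step 2: for each whistle.xxx rule, call the plug-in hook, which may
  // return additional (private) rules for this request
  const pluginRules = matched.flatMap((rule) => {
    const found = PLUGIN_PROTOCOL_RE.exec(rule.operation);
    const hook = found && pluginHooks[found[1]];
    return hook ? hook(url) : [];
  });

  // Step 3: merge both sets (the order is an assumption of this sketch)
  return [...pluginRules, ...matched];
};

const globalRules = [
  { pattern: 'example.com', operation: 'whistle.demo://on' },
  { pattern: 'other.com', operation: 'statusCode://404' },
];
const pluginHooks = {
  demo: () => [{ pattern: 'example.com/api', operation: 'statusCode://500' }],
};

console.log(collectRules('http://example.com/api/user', globalRules, pluginHooks));
```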

4.4 Rule Management

Unlike traditional packet capture and debugging proxies, which modify request and response data through breakpoints, Whistle modifies them through configuration rules. The advantages of this approach are that it is simple to operate and that the operations can be stored persistently and shared.

Whistle’s rule management has two main functions:

  1. Parsing rules
  2. Matching rules

Parsing rules

Whistle has two types of rules:

  1. Global rules (common rules): rules that every request tries to match. They consist of:

    • Rules configured under Rules in the interface;

    • The rules.txt configuration file in a plug-in's root directory;

      Documents: github.com/whistle-plu…

    • Remote rules imported via @url in the interface Rules or in a plug-in's rules.txt (@url occupies a line by itself; Whistle refreshes remote rules periodically).

  2. Plug-in rules (private rules): rules that are matched only by requests handed to the plug-in, that is, requests that matched a whistle.xxx protocol in the global rules. They consist of:

    • Rules returned dynamically by plug-in hooks such as reqRulesServer;

    • Static rules configured in _rules.txt in the plug-in root directory.

    Documents: wproxy.org/whistle/plu…

Matching rules

The complete structure of a Whistle rule is described in the documentation:

Documents: wproxy.org/whistle/mod…
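
To give a feel for the two tasks of rule management, parsing and matching, here is a minimal sketch. It assumes the basic hosts-like shape of one rule per line, written as a pattern followed by protocol://value, and it uses plain prefix matching; Whistle's real rule syntax and matching are much richer. The function names parseRules and matchRules and the sample rule lines are illustrative only.

```javascript
// Hypothetical parser: one rule per line, lines starting with "#" are comments
const parseRules = (text) =>
  text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && !line.startsWith('#'))
    .map((line) => {
      const [pattern, operation = ''] = line.split(/\s+/);
      const index = operation.indexOf('://');
      // Assumption: an operation without "protocol://" is a hosts-style address
      return index === -1
        ? { pattern, protocol: 'host', value: operation }
        : { pattern, protocol: operation.slice(0, index), value: operation.slice(index + 3) };
    });

// Hypothetical matcher: strip the URL scheme and compare by prefix
const matchRules = (rules, url) => {
  const bare = url.replace(/^[\w.+-]+:\/\//, '');
  return rules.filter((rule) => bare.startsWith(rule.pattern));
};

const rules = parseRules(`
# illustrative rules only
www.example.com/api statusCode://500
www.example.com file:///tmp/www
static.example.com 127.0.0.1:8080
`);

console.log(matchRules(rules, 'https://www.example.com/api/user'));
```

Because rules are plain text, they can be saved, versioned, and shared, which is the advantage over breakpoint-based tools mentioned at the start of this section.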

4.5 Plug-in Management

Whistle plug-ins are very capable. They have the full power of Node and can also operate on all of Whistle's rules. They can be used to:

  1. Authenticate requests
  2. Provide a UI
  3. Act as a request server (respond directly, or forward the request and modify the request and response)
  4. Collect request statistics (report/log data, etc.)
  5. Set rules (dynamic, static, global, and private rules)
  6. Obtain captured packet data
  7. Encode and decode request and response data streams (Pipe Stream feature)
  8. Extend the right-click menu of the interface (e.g., for sharing captured packet data)
  9. Save and synchronize Rules & Values data
  10. Customize the certificates used for HTTPS requests

For example:

  1. whistle.script: sets rules dynamically through custom scripts
  2. whistle.vase: provides flexible and powerful mock capabilities
  3. whistle.inspect: quickly injects page debugging tools such as vConsole and Eruda
  4. whistle.sni-callback: a custom certificate plug-in

For other examples of plug-ins, see: github.com/whistle-plu…

How does Whistle implement its plug-in mechanism? It follows three design principles:

  1. Completeness:

    Every function point is extensible, including request authentication, certificate generation, packet capture, rule setting, and request processing.

  2. Stability:

    Internal exceptions in a plug-in do not affect other functions. Each Whistle plug-in runs in its own process and interacts with Whistle over HTTP.

    Whistle uses the NPM package pfork to start plug-in processes. Communication between the processes is implemented directly with Node's http module.

  3. Ease of use:

    Plug-ins are easy to develop and to use.

    • Development: simple structure (an NPM package) plus the scaffolding tool lack.
    • Use: install the NPM package; a plug-in is used the same way as a built-in protocol and comes with a built-in interactive interface.

For more details on plug-ins, see: wproxy.org/whistle/plu…

Besides plug-in extensions, Whistle can also be used in a project as a standalone module. Beyond local development, it can serve as the basis for multi-person development and collaboration tools such as the following, whose implementation on top of Whistle will be introduced later:

  1. Nohost: a multi-user, multi-environment remote packet capture and debugging tool built on Whistle

    Nohost: github.com/Tencent/noh…

  2. TDE: a distributed remote packet capture and debugging tool built on Whistle and Nohost

    TDE is currently used only inside Tencent and will gradually be open-sourced.

5. Reference materials

  1. GitHub repository: github.com/avwo/whistl…
  2. Official plugin repository: github.com/whistle-plu…
  3. Detailed documentation: wproxy.org/whistle/