Jaeger

Jaeger is a distributed tracing tool based on the OpenTracing specification. For details about OpenTracing, refer to the official OpenTracing website. Simply put, OpenTracing is a set of standards that defines the basic units and interfaces of distributed tracing.

The basic units are the trace and the span. A span represents a single operation, and a trace represents a complete call chain. For example, a call to a back-end API can be understood as a trace. If that API calls a MySQL service, the call to MySQL can be regarded as a span.

The interfaces are the ones specified by OpenTracing, such as the Inject and Extract methods.

If a tracing tool implements these interfaces on top of OpenTracing's basic units, it can be considered to conform to the OpenTracing specification. Jaeger is such a tool.
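
To make the relationship between a trace and its spans concrete, here is a minimal sketch in plain Python. The Span class and the start_trace and start_child helpers are hypothetical names made up for illustration; they are not Jaeger's or OpenTracing's actual classes.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One operation inside a trace (hypothetical model, not Jaeger's classes)."""
    trace_id: str
    operation: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    start: float = field(default_factory=time.time)
    end: Optional[float] = None

    def finish(self) -> None:
        # Record the end time; a real client would also hand the span to a reporter.
        self.end = time.time()


def start_trace(operation: str) -> Span:
    """Create the root span; a fresh trace ID marks the start of a new trace."""
    return Span(trace_id=uuid.uuid4().hex, operation=operation)


def start_child(parent: Span, operation: str) -> Span:
    """Create a span in the same trace that points back at its parent."""
    return Span(trace_id=parent.trace_id, operation=operation, parent_id=parent.span_id)


# The back-end API call is the trace; the MySQL call it makes is a child span.
root = start_trace("GET /orders")
query = start_child(root, "mysql.query")
query.finish()
root.finish()
```

Both spans share one trace ID, which is what lets a tracing back end stitch them into a single call chain.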

Client

Jaeger provides clients for various languages, such as Go, Python, C++, and Java. The client can be understood as a tool that generates traces and spans and sends them to the downstream receiving module. Generating spans and initializing a trace is easy: import the package provided by Jaeger and call a function to create a new trace. The client also records information such as the hostname and IP.

There are many online demos of Jaeger in various languages, so I won’t go into details.

How does the client send data to the downstream receiving module

1. First, when you call the span.finish method, the client wraps the finished span into a format that can be sent.

2. The span is placed in a queue. The implementation of the queue varies slightly from language to language.

3. The queue has a processing function that takes data from the queue and puts it into a batch of spans. When the batch is full or the time is up (this interval can be configured), the batch is sent downstream, which can be either the Agent or the Collector. Several transport options are available, such as UDP, HTTP, and gRPC, and it is easy to extend to other delivery methods, such as reusing log collection agents, Prometheus agents, and so on.
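
The three steps above can be sketched as a small background reporter. This is a simplified illustration under my own assumptions, not the code of any Jaeger client: BatchReporter, batch_size, flush_interval, and the send callback are all hypothetical names, and spans are plain dicts.

```python
import queue
import threading
import time
from typing import Callable, List


class BatchReporter:
    """Queue finished spans and flush them in batches (hypothetical sketch)."""

    def __init__(self, send: Callable[[List[dict]], None],
                 batch_size: int = 3, flush_interval: float = 0.2) -> None:
        self._send = send
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._queue: "queue.Queue[dict]" = queue.Queue()
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def report(self, span: dict) -> None:
        """Called when a span finishes; it only enqueues, so the caller is not blocked."""
        self._queue.put(span)

    def _run(self) -> None:
        batch: List[dict] = []
        deadline = time.monotonic() + self._flush_interval
        # Keep draining until close() was called and the queue is empty.
        while not (self._stop.is_set() and self._queue.empty()):
            timeout = max(0.0, deadline - time.monotonic())
            try:
                batch.append(self._queue.get(timeout=min(timeout, 0.05)))
            except queue.Empty:
                pass
            # Flush when the batch is full or the flush interval has elapsed.
            if len(batch) >= self._batch_size or time.monotonic() >= deadline:
                if batch:
                    self._send(batch)
                    batch = []
                deadline = time.monotonic() + self._flush_interval
        if batch:
            self._send(batch)

    def close(self) -> None:
        """Stop the worker after everything still queued has been sent."""
        self._stop.set()
        self._worker.join()


# Usage: `sent` stands in for the Agent or Collector.
sent: List[List[dict]] = []
reporter = BatchReporter(send=sent.append, batch_size=3, flush_interval=0.2)
for i in range(7):
    reporter.report({"operation": f"op-{i}"})
reporter.close()
```

Swapping the send callback is the extension point: the same queue and batching logic can feed UDP, HTTP, gRPC, or some other delivery path.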

Agent

As its name implies, the Agent acts as a proxy: it receives the data generated by the client and forwards it to the real collector, the Collector, much as a client would. The client sends data to the Agent over UDP.

If the application runs on physical machines, the Agent needs to be deployed. If the application runs in the cloud, the client needs to send data directly to the Collector.

If deployed, the Agent also receives sampling information from the Collector.
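
The receive-and-forward role of the Agent can be sketched roughly as follows. This is only an illustration: the real Agent receives Thrift-encoded spans, whereas this sketch assumes JSON payloads to stay short, and handle_datagram, serve, and the forward callback are hypothetical names.

```python
import json
import socket
from typing import Callable, List


def handle_datagram(data: bytes, forward: Callable[[List[dict]], None]) -> int:
    """Decode one datagram sent by a client and hand its spans to the Collector."""
    spans = json.loads(data.decode("utf-8"))  # assumption: JSON list of span dicts
    forward(spans)
    return len(spans)


def serve(host: str, port: int, forward: Callable[[List[dict]], None]) -> None:
    """Listen for UDP datagrams and forward each batch (runs until interrupted)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, _addr = sock.recvfrom(65535)
            handle_datagram(data, forward)


# Usage without opening a socket: `forwarded` stands in for the Collector.
forwarded: List[dict] = []
payload = json.dumps([{"operation": "GET /orders"},
                      {"operation": "mysql.query"}]).encode("utf-8")
count = handle_datagram(payload, forwarded.extend)
```

Because UDP is connectionless, the client never waits for the Agent; a lost datagram simply means a lost batch of spans.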

Collector

The Collector writes data to Kafka. The logic is simple: start a server, expose various ports and protocols, accept data, and write it to Kafka.

Ingester

The Ingester reads data from Kafka and writes it to storage. Jaeger chose Elasticsearch (ES) as its index store because it is easy to query (inverted index) and has good performance. ES is the storage engine in ELK (a full pipeline for collection, storage, and visualization, widely used in the industry).

By default, the Ingester writes to ES indexes whose names end with a date, which makes it easier for the front end to retrieve data by time.
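
To show why date-suffixed indexes help time-based retrieval, here is a small sketch that maps a span's start time to an index name and lists the indexes covering a time range. The "jaeger-span" prefix and the daily "YYYY-MM-DD" suffix are assumptions about the default naming, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def span_index_name(start_time_us: int, prefix: str = "jaeger-span") -> str:
    """Return the daily index a span would be written to, based on its start time."""
    day = datetime.fromtimestamp(start_time_us / 1_000_000, tz=timezone.utc)
    return f"{prefix}-{day:%Y-%m-%d}"


def indexes_for_range(start_us: int, end_us: int, prefix: str = "jaeger-span") -> List[str]:
    """List the daily indexes a query over [start_us, end_us] has to search."""
    first = datetime.fromtimestamp(start_us / 1_000_000, tz=timezone.utc).date()
    last = datetime.fromtimestamp(end_us / 1_000_000, tz=timezone.utc).date()
    days = (last - first).days
    return [f"{prefix}-{first + timedelta(days=i):%Y-%m-%d}" for i in range(days + 1)]


# 1_600_000_000 seconds after the epoch is 2020-09-13 12:26:40 UTC.
start_us = 1_600_000_000 * 1_000_000
name = span_index_name(start_us)                  # "jaeger-span-2020-09-13"
names = indexes_for_range(start_us, start_us + 86_400 * 1_000_000)
```

A query for the last hour then touches one or two indexes instead of the whole data set.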

Query

Query is a back-end module of Jaeger that provides the interfaces used by the front-end search. Paths such as /api/service and /api/search are the interfaces it provides for front-end queries.

The sampling strategy

In a large-scale system, where Jaeger is used for troubleshooting or for monitoring and alerting, only part of the data is needed. Storing all of it may put heavy pressure on storage, and sending all of it may cause performance problems in the service itself, so Jaeger provides several sampling strategies to choose from.

  • Probabilistic sampling
  • Adaptive sampling
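
As an illustration of the first strategy, here is a minimal sketch of a probabilistic sampler that decides from the trace ID, so that every span of a trace gets the same decision. The class and its names are hypothetical, and treating trace IDs as unsigned 64-bit integers is an assumption made for the sketch.

```python
import random


class ProbabilisticSampler:
    """Keep roughly `rate` of all traces, deciding once per trace ID (a sketch)."""

    MAX_ID = 2 ** 64  # assumption: trace IDs are unsigned 64-bit integers

    def __init__(self, rate: float) -> None:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        self.rate = rate
        self._boundary = int(rate * self.MAX_ID)

    def is_sampled(self, trace_id: int) -> bool:
        # The same trace ID always gives the same answer, so all spans of a
        # trace are either kept or dropped together.
        return trace_id < self._boundary


# Usage: sample about 10% of 10,000 random trace IDs.
rng = random.Random(42)
trace_ids = [rng.getrandbits(64) for _ in range(10_000)]
sampler = ProbabilisticSampler(0.1)
kept = sum(sampler.is_sampled(t) for t in trace_ids)
```

With a rate of 0.1, kept should land near 1,000; the exact count depends on the random IDs.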

Inject and Extract

Inject and Extract are the key to Jaeger's cross-process tracing. Their implementation is also very simple: they can be regarded as serialization and deserialization functions. Serialization and deserialization need a data carrier, and in Jaeger the HTTP header serves as that carrier. If two services call each other over HTTP, the caller serializes the current SpanContext into the header and passes it to the downstream service. When the downstream service receives the request, it parses the header, recovers the information, and creates a span. The newly created span belongs to the same trace, which can be looked up in the front-end query interface.

That is all for now. I wrote these notes because my work requires using and modifying Jaeger. If there is enough interest, I will publish a detailed code analysis of each Jaeger module.
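
The serialization and deserialization described above can be sketched like this. It uses the uber-trace-id header and the trace-id:span-id:parent-id:flags hex layout of Jaeger's propagation format, but the SpanContext class and the inject and extract functions are my own simplified stand-ins, not the client library's API.

```python
from dataclasses import dataclass
from typing import Dict, Optional

HEADER = "uber-trace-id"  # header name used by Jaeger's propagation format


@dataclass(frozen=True)
class SpanContext:
    """Simplified stand-in for the context that crosses process boundaries."""
    trace_id: int
    span_id: int
    parent_id: int
    flags: int  # bit 0 set means the trace is sampled


def inject(ctx: SpanContext, headers: Dict[str, str]) -> None:
    """Serialize the span context into the outgoing request headers."""
    headers[HEADER] = f"{ctx.trace_id:x}:{ctx.span_id:x}:{ctx.parent_id:x}:{ctx.flags:x}"


def extract(headers: Dict[str, str]) -> Optional[SpanContext]:
    """Parse the span context from incoming headers; None if absent or malformed."""
    value = headers.get(HEADER)
    if value is None:
        return None
    parts = value.split(":")
    if len(parts) != 4:
        return None
    try:
        trace_id, span_id, parent_id, flags = (int(p, 16) for p in parts)
    except ValueError:
        return None
    return SpanContext(trace_id, span_id, parent_id, flags)


# Caller side: put the current context into the request headers.
outgoing: Dict[str, str] = {}
caller_ctx = SpanContext(trace_id=0xABCDEF, span_id=0x1, parent_id=0x0, flags=1)
inject(caller_ctx, outgoing)

# Callee side: recover the context and start a child span in the same trace.
received = extract(outgoing)
child_ctx = SpanContext(received.trace_id, span_id=0x2,
                        parent_id=received.span_id, flags=received.flags)
```

The sketch looks the header up by its exact lowercase name; a real server would match header names case-insensitively.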