Author | Wang Xining, Senior Technical Expert at Alibaba
Join the message interaction at the end of this article on the “Alibaba Cloud Native” public account for a chance to win a copy of the book!
This article is excerpted from the book “Istio Service Mesh Technology Analysis and Practice” by Wang Xining, a senior technical expert at Alibaba Cloud. Starting from basic concepts, it explains what a service mesh is and what Istio is, and then systematically and comprehensively introduces the Istio service mesh in light of three service mesh trends for 2020. Participate in the interaction at the end of the article, and a free copy of “Istio Service Mesh Technology Analysis and Practice” could be yours!
Istio is an open source service mesh that provides the basic operational and management elements needed by a distributed microservice architecture. As organizations increasingly adopt cloud platforms, developers must design with microservices for portability, and operators must manage large distributed applications spanning hybrid and multi-cloud deployments. Istio takes a consistent approach to securing, connecting, and monitoring microservices, reducing the complexity of managing microservice deployments.
Architecturally, the Istio service mesh is logically divided into two parts: a control plane and a data plane. In the control plane, Pilot manages and configures the proxies to route traffic, and configures Mixer to enforce policies and collect telemetry data. The data plane consists of a set of intelligent proxies deployed as sidecars that mediate and control all network traffic between microservices, and that communicate with Mixer.
As a proxy, Envoy is well suited to service mesh scenarios, but to extract its maximum value it needs to work well with the underlying infrastructure and components. Envoy forms the data plane of the service mesh; Istio provides the supporting components that make up the control plane.
On the one hand, we have seen that Envoy can configure a set of service proxies using static configuration files, or use a set of discovery services to discover listeners, endpoints, and clusters at runtime. Istio implements these xDS APIs for the Envoy proxies in Pilot.
On the other hand, Envoy's service discovery relies on some kind of service registry to discover service endpoints. Istio Pilot implements this API while abstracting Envoy away from any particular service-registry implementation. When Istio is deployed on Kubernetes, the Kubernetes service registry is what Istio uses for service discovery; other registries, such as HashiCorp's Consul, can also be used. The Envoy data plane is entirely unaffected by these implementation details.
In addition, an Envoy proxy can emit many metrics and much telemetry data; where this telemetry is sent depends on the Envoy configuration. Istio provides Mixer, a telemetry receiver, as part of its control plane, and Envoy proxies can send this data to Mixer. Envoy can also send distributed trace data to an OpenTracing-compatible engine (following the OpenTracing API); Istio can be configured to have Envoy send its trace data to such an engine.
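Where tracing ends up is itself just Envoy configuration. Below is a minimal sketch of an Envoy v2 bootstrap fragment that points trace output at a Zipkin-compatible collector; the zipkin cluster name and address are assumptions for illustration:

```yaml
# Sketch: send Envoy trace spans to a Zipkin-compatible collector.
# The cluster name and address are hypothetical.
tracing:
  http:
    name: envoy.zipkin
    config:
      collector_cluster: zipkin          # must match a configured cluster name
      collector_endpoint: /api/v1/spans  # Zipkin v1 span ingestion path
static_resources:
  clusters:
  - name: zipkin
    connect_timeout: 1s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    hosts: [{ socket_address: { address: zipkin, port_value: 9411 }}]
```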
Analyzing the Istio control plane
Istio's control plane and Envoy's data plane together form a compelling service mesh implementation. Both have thriving, vibrant communities and are geared toward next-generation service architectures. Istio is platform-independent and can run in a variety of environments, including across clouds, on-premises, Kubernetes, Mesos, and more. You can deploy Istio on Kubernetes, or on Nomad with Consul. Istio currently supports services deployed on Kubernetes, services registered with Consul, and services running on virtual machines.
The control plane includes Pilot, Mixer, Citadel, and Galley; see the Istio architecture diagram.
1. Pilot
Istio's Pilot component is used to manage traffic: it controls the flow of traffic and API calls between services and provides visibility into that traffic so that problems can be found before they occur. This makes calls more reliable, the network more robust, and the application rock-solid even in the face of adverse conditions. With Istio Pilot, you can configure service-level properties such as circuit breakers, timeouts, and retries, and set up common continuous-deployment tasks such as canary releases, A/B testing, and staged rollouts with percentage-based traffic splits. Pilot provides service discovery for the Envoy proxies, along with traffic-management capabilities for intelligent routing and resilience features such as timeouts, retries, and circuit breakers. Pilot translates the high-level routing rules that control traffic behavior into Envoy-specific configuration and propagates it to the Envoys at runtime. In addition, Istio provides powerful out-of-the-box failure-recovery capabilities, including timeouts; retry mechanisms with timeout budgets and variable jitter; limits on concurrent connections and requests to upstream services; periodic active health checks on each member of a load-balancing pool; and passive health checks.
Pilot abstracts platform-specific service discovery mechanisms and synthesizes them into a standard format that any sidecar conforming to the data plane API can consume. This loose coupling allows Istio to run in a variety of environments (such as Kubernetes, Consul, and Nomad) while maintaining the same operator interface for traffic management.
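To make these primitives concrete, here is a hedged sketch using Istio's v1alpha3 traffic-management API; the ratings service and its v1 subset are assumptions, not examples from the book. The VirtualService applies a timeout and retry policy, and the DestinationRule adds connection-pool limits and outlier detection, the building blocks of circuit breaking:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings            # hypothetical service
spec:
  hosts:
  - ratings
  http:
  - route:
    - destination:
        host: ratings
        subset: v1
    timeout: 10s           # overall request timeout
    retries:
      attempts: 3
      perTryTimeout: 2s
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: ratings
spec:
  host: ratings
  subsets:
  - name: v1
    labels:
      version: v1
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 100   # circuit-breaking threshold
    outlierDetection:
      consecutiveErrors: 5             # eject after 5 consecutive errors
      interval: 30s
      baseEjectionTime: 30s
```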
2. Mixer
Istio's Mixer component provides policy control and telemetry collection, isolating the rest of Istio from the implementation details of individual infrastructure back ends. Mixer is a platform-independent component that enforces access control and usage policies across the service mesh and collects telemetry data from Envoy proxies and other services. The proxy extracts request-level attributes and sends them to Mixer for evaluation.
Mixer includes a flexible plug-in model that enables it to interface with a variety of host environments and infrastructure back ends, abstracting the Envoy proxies and Istio-managed services away from these details. With Mixer, you can exercise fine-grained control over all interactions between the mesh and your infrastructure back ends.
Unlike the sidecar proxies, which must conserve memory, Mixer runs independently, so it can use fairly large caches and output buffers, acting as a highly scalable and highly available second-level cache for the sidecars.
Mixer is designed to provide high availability in every instance. Its local caches and buffers reduce latency and also help shield against infrastructure back-end failures, even when a back end stops responding.
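To give a feel for the model, here is a hedged sketch in the config.istio.io/v1alpha2 style used by Istio 1.0's Mixer: an instance maps request attributes to a metric, a handler adapts it to Prometheus, and a rule binds the two. Names such as requestcount are hypothetical:

```yaml
# Sketch of Istio 1.0's Mixer configuration model; names are hypothetical.
apiVersion: config.istio.io/v1alpha2
kind: metric                 # instance: maps attributes to a metric value
metadata:
  name: requestcount
  namespace: istio-system
spec:
  value: "1"
  dimensions:
    source: source.workload.name | "unknown"
    destination: destination.workload.name | "unknown"
---
apiVersion: config.istio.io/v1alpha2
kind: prometheus             # handler: adapts instances to a back end
metadata:
  name: requesthandler
  namespace: istio-system
spec:
  metrics:
  - name: request_count
    instance_name: requestcount.metric.istio-system
    kind: COUNTER
    label_names: [source, destination]
---
apiVersion: config.istio.io/v1alpha2
kind: rule                   # rule: binds instances to handlers
metadata:
  name: promrule
  namespace: istio-system
spec:
  actions:
  - handler: requesthandler.prometheus
    instances: [requestcount.metric]
```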
3. Citadel
Istio Citadel's security features provide strong identity, powerful policy, transparent TLS encryption, and authentication, authorization, and audit (AAA) tools to protect services and data. Envoy can terminate or initiate TLS traffic to services in the mesh; for this, Citadel needs to support creating, signing, and rotating certificates. Istio Citadel provides per-application certificates that can be used to establish mutual TLS and protect traffic between services.
With Istio Citadel, you can ensure that services holding sensitive data are accessible only to strictly authenticated and authorized clients. Citadel provides strong service-to-service and end-user authentication with built-in identity and credential management. It can be used to upgrade unencrypted traffic within the service mesh and gives operators the ability to enforce policy based on service identity rather than network controls. Istio's configuration policy configures authentication on the server side but does not enforce it on the client side, while letting you specify the authentication requirements of a service. Istio's key management system automatically generates, distributes, rotates, and revokes keys and certificates.
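As an illustration, here is a minimal sketch in Istio 1.0's syntax of enabling mutual TLS for one namespace: an authentication Policy configures the server side, while a DestinationRule tells clients to present their Citadel-issued certificates. The namespace foo is hypothetical:

```yaml
apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: default
  namespace: foo          # hypothetical namespace
spec:
  peers:
  - mtls: {}              # require mutual TLS on the server side
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: default
  namespace: foo
spec:
  host: "*.foo.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL  # clients present the Citadel-issued certificate
```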
Istio RBAC provides namespace-level, service-level, and method-level access control for services in the Istio mesh, including easy-to-use role-based semantics, service-to-service and end-user-to-service authorization, and flexible support for custom properties in roles and role bindings.
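The following sketch uses the rbac.istio.io/v1alpha1 resources from Istio 1.0; the service and service-account names are hypothetical, and RBAC is assumed to have been enabled mesh-wide via an RbacConfig resource. A ServiceRole grants read access to a service, and a ServiceRoleBinding assigns that role to a particular identity:

```yaml
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRole
metadata:
  name: ratings-viewer          # hypothetical role
  namespace: default
spec:
  rules:
  - services: ["ratings.default.svc.cluster.local"]
    methods: ["GET"]            # method-level access control
---
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-ratings-viewer
  namespace: default
spec:
  subjects:
  - user: "cluster.local/ns/default/sa/bookinfo-reviews"  # hypothetical identity
  roleRef:
    kind: ServiceRole
    name: ratings-viewer
```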
Istio can enhance the security of microservices and their communication, both service-to-service and end-user-to-service, without requiring changes to service code. It provides each service with a strong role-based identity that works across clusters and clouds.
4. Galley
Galley validates user-written Istio API configuration. Over time, Galley will take over as Istio's top-level component responsible for obtaining configuration, processing it, and distributing it to the other components. It insulates the rest of Istio from the details of obtaining user configuration from the underlying platform, such as Kubernetes.
All in all, with Pilot, Istio helps you simplify traffic management as your deployments scale. With Mixer, robust and easy-to-use monitoring lets problems be detected and fixed quickly and efficiently. And with Citadel, the security burden is lightened, letting developers focus on other critical tasks.
Several key goals in Istio's architectural design are critical to the system's ability to handle traffic at scale with high performance.
- **Maximize transparency:** To adopt Istio, operators and developers should be able to derive real value from it at little cost. To this end, Istio automatically injects itself into all network paths between services. Istio uses Envoy proxies to capture traffic and, where possible, automatically programs the network layer to route traffic through those proxies without requiring significant changes, if any, to the deployed application code. In Kubernetes, Envoy proxies are injected into pods and traffic is captured through iptables rules. Once the sidecar is injected into the pod and the routing rules are in place, Istio can mediate all traffic. This principle also applies to performance: when Istio is deployed, operators should find that the added resource overhead of providing these capabilities is minimal. All components and APIs must be designed with performance and scale in mind.
- **Scalability:** As operators and developers come to rely more and more on the capabilities Istio provides, the system will inevitably grow with their needs. As new capabilities continue to be added, what is most needed is the ability to extend the policy system, integrate other sources of policy and control, and propagate signals about mesh behavior to other systems for analysis. The policy runtime supports a standard extension mechanism for plugging in other services. It also allows the vocabulary to be extended so that policies can be enforced based on new signals generated by the mesh.
- **Portability:** The ecosystems that use Istio differ in many ways. Istio must be able to run in any cloud or on-premises environment with minimal effort. Migrating Istio-based services to a new environment should be straightforward, and it should also be possible to deploy a service to multiple environments simultaneously using Istio, for example on a hybrid cloud for redundancy and disaster recovery.
- **Policy consistency:** Policies are applied to API calls between services to control mesh behavior. But it is equally important to apply policies to resources that are not expressed at the API level. For example, it is more useful to apply a quota to the number of CPUs consumed by a machine-learning training task than to the call that starts the job. Istio therefore maintains the policy system as a distinct service with its own API, rather than baking it into the proxy, which allows services to integrate directly with it as needed.
Analyzing the Istio data plane
When the concept of a service mesh was introduced, we mentioned the notion of a service proxy and how a mesh built from such proxies can mediate and control all network traffic between microservices. Istio uses Envoy as its default, out-of-the-box service proxy: an Envoy runs alongside every application instance participating in the service mesh, though not in the same container process, and together these proxies form the mesh's data plane. Whenever an application wants to communicate with another service, it does so through its Envoy service proxy. Envoy proxies are thus a key part of the data plane and of the overall service mesh architecture.
1. The Envoy proxy
Envoy was originally developed by Lyft to solve some of the complex networking problems that arise when building distributed systems. It was open-sourced in September 2016 and joined the Cloud Native Computing Foundation (CNCF) a year later. Envoy is implemented in C++, giving it high performance and, more importantly, making it very stable and reliable under high load. In Envoy's design philosophy, the network should be transparent to applications, and when problems occur in the network or an application, it should be easy to determine the root cause. Based on this philosophy, Envoy is designed as a Layer 7 proxy and communication bus for service-oriented architectures.
To better understand Envoy, we need to understand a few basic terms related to it:
- **Out-of-process architecture:** Envoy is an independent process that forms a transparent communication mesh between Envoys; each application sends messages to, or receives messages from, localhost without awareness of the network topology.
- **Single-process, multi-threaded model:** Envoy uses a single-process, multi-threaded architecture. A main thread handles miscellaneous coordination tasks, while worker threads perform listening, filtering, and forwarding.
- **Downstream:** A host that connects to Envoy, sends requests, and receives responses is called a downstream host; in other words, the downstream host is the sender of the request.
- **Upstream:** In contrast to downstream, the host that receives the request is called the upstream host.
- **Listeners:** Listeners are named network addresses (ports, Unix domain sockets, and so on) to which downstream hosts can connect. Envoy exposes one or more listeners for downstream hosts to connect to. Each listener is independently configured with network-level (that is, L3/L4) filters. When a listener accepts a new connection, the configured local filter stack is instantiated and begins processing subsequent events. Broadly speaking, the listener architecture is used to perform most of the different proxy tasks, such as rate limiting, TLS client authentication, HTTP connection management, MongoDB sniffing, raw TCP proxying, and so on.
- **Clusters:** A cluster is a group of logically identical upstream hosts to which Envoy connects.
- **xDS protocol:** xDS denotes the family of discovery service protocols in Envoy, including the Cluster Discovery Service (CDS), Listener Discovery Service (LDS), Route Discovery Service (RDS), Endpoint Discovery Service (EDS), and Secret Discovery Service (SDS).
Envoy proxies provide a number of capabilities for inter-service communication: for example, exposing one or more listeners to which downstream hosts can connect, exposing applications externally through ports, and defining routing rules that process the traffic arriving at listeners and direct it to target clusters. The roles these discovery services play in Istio are examined further in later chapters.
Now that you know Envoy's terminology, you probably want to know, as quickly as possible: what exactly does Envoy do?
First, Envoy is a proxy: it acts as an intermediary in the network architecture and can add extra capabilities for managing traffic, such as security, privacy protection, or policy. In inter-service invocation scenarios, a proxy can hide the topology of the service back end from clients, simplifying interactions, and can protect back-end services from overload. For example, a back-end service may actually be a set of identical running instances, each able to handle a certain amount of load.
Second, a cluster in Envoy is essentially a set of logically identical upstream hosts that Envoy connects to. So how does a client know which instance or IP address to use when interacting with a back-end service? Envoy acts as the router: using service discovery (SDS, the Service Discovery Service, renamed EDS in the v2 API), Envoy finds all the members of a cluster, determines their health through active health checking, and then lets the load-balancing policy decide, based on that health status, which cluster member a request is routed to. Because the Envoy proxy handles load balancing across service instances, clients do not need to know any details of the actual deployment.
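As a hedged illustration of these cluster mechanics, the Envoy v2 snippet below defines an upstream cluster with DNS-based membership, a load-balancing policy, and an active HTTP health check; the cluster name and endpoint address are assumptions:

```yaml
clusters:
- name: backend_service        # hypothetical upstream cluster
  connect_timeout: 1s
  type: STRICT_DNS             # resolve members via DNS
  lb_policy: LEAST_REQUEST     # load-balancing policy for member selection
  hosts:
  - socket_address: { address: backend.default.svc, port_value: 8080 }
  health_checks:
  - timeout: 1s
    interval: 5s
    unhealthy_threshold: 3     # mark a host down after 3 failed checks
    healthy_threshold: 2       # mark a host up after 2 passing checks
    http_health_check:
      path: /healthz           # hypothetical health endpoint
```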
2. Envoy startup configuration
Envoy currently offers two API versions, v1 and v2, with the v2 API available since Envoy 1.5.0. To let users migrate smoothly to the v2 API, Envoy provides a startup flag, --v2-config-only, which explicitly specifies that Envoy should interpret its configuration using the v2 API only. Fortunately, the v2 API is a superset of v1 and remains compatible with it. The Istio 1.0 release explicitly specifies v2 API support. Looking at the startup command of a container that runs Envoy as the sidecar proxy, you can see startup parameters similar to the following, including the --v2-config-only flag:
```bash
$ /usr/local/bin/envoy -c /etc/istio/proxy/envoy-rev0.json \
    --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 \
    --service-cluster ratings \
    --service-node sidecar~172.33.14.2~ratings-v1-8558d4458d-ld8x9.default~default.svc.cluster.local \
    --max-obj-name-len 189 --allow-unknown-fields -l warn --v2-config-only
```
The -c parameter specifies the path of the v2 bootstrap configuration file. The format here is JSON; other formats such as YAML and proto3 are also supported. The file is first parsed as a v2 bootstrap configuration; if that parsing fails, the --v2-config-only option determines whether it may instead be parsed as a v1 JSON configuration file. The other parameters are explained below to help readers understand the configuration used when the Envoy proxy starts:
- --restart-epoch indicates the hot-restart epoch, which defaults to 0 for the first startup and should be incremented after each hot restart.
- --service-cluster defines the name of the local service cluster in which this Envoy runs.
- --service-node defines the name of the local service node on which this Envoy runs.
- --drain-time-s indicates the time (in seconds) over which Envoy drains connections during a hot restart; the default is 600 seconds (10 minutes). The drain time should normally be less than the parent-process shutdown time set with the --parent-shutdown-time-s option.
- --parent-shutdown-time-s indicates the time (in seconds) that Envoy waits before shutting down the parent process during a hot restart.
- --max-obj-name-len specifies the maximum length, in bytes, of the name field in a cluster, route configuration (route_config), or listener. This option is typically used where cluster names are generated automatically and would otherwise exceed the internal limit of 60 characters. The default value is 60.
Envoy startup configuration comes in two forms, static configuration and dynamic configuration. Specifically:
- Static configuration puts all the information in a configuration file that is loaded directly at startup.
- Dynamic configuration requires an Envoy management server that serves the discovery (xDS) APIs Envoy needs. Istio implements the v2 xDS APIs to adjust configuration dynamically through these discovery services.
3. Envoy static and dynamic configuration
Envoy is an intelligent proxy driven by a configuration file in JSON or YAML format. Users already familiar with Envoy or its configuration will know that there have been different versions of the Envoy configuration. The original v1 version was the primitive way to configure Envoy at startup; it has been deprecated in favor of the v2 configuration. Envoy's reference documentation (www.envoyproxy.io/docs) clearly distinguishes v1 from v2. This article focuses only on the v2 configuration, as it is the latest version and the one Istio uses.
The Envoy v2 configuration API is built on top of gRPC, and an important feature of the v2 API is that it can leverage streaming to reduce the time Envoy proxies need to converge on configuration. In effect, this eliminates the downside of a polling API: the server pushes updates to the Envoy proxies instead of the proxies polling periodically.
Envoy's architecture makes different configuration-management approaches possible; the approach adopted in a deployment depends on the implementer's requirements. Simple deployments can use a fully static configuration, while more complex deployments can incrementally add more sophisticated dynamic configuration. The main cases are as follows:
- All static: In a fully static configuration, the implementer provides a set of listeners and filter chains, clusters, and optional HTTP routing configuration. Dynamic host discovery is only possible through DNS-based service discovery, and configuration reloads must go through the built-in hot-restart mechanism.
- SDS/EDS only: On top of an otherwise static configuration, this mechanism lets Envoy discover the members of an upstream cluster (a sketch follows this list).
- SDS/EDS and CDS: With this mechanism, Envoy can also discover the upstream clusters in use.
- SDS/EDS, CDS, and RDS: RDS additionally discovers, at runtime, the entire routing configuration for the HTTP connection manager filter.
- SDS/EDS, CDS, RDS, and LDS: LDS additionally discovers entire listeners at runtime, including their complete filter stacks, such as HTTP filters with embedded RDS references.
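As referenced in the SDS/EDS item above, the following hedged sketch shows the minimal shape of an EDS-backed cluster in the v2 API: the upstream cluster's members come from the discovery service, while the management cluster it talks to is defined statically. Names and the address are assumptions:

```yaml
clusters:
- name: some_service           # hypothetical upstream cluster
  connect_timeout: 0.25s
  type: EDS                    # members come from the discovery service
  eds_cluster_config:
    eds_config:
      api_config_source:
        api_type: GRPC
        grpc_services:
        - envoy_grpc:
            cluster_name: xds_cluster  # statically defined management cluster
  lb_policy: ROUND_ROBIN
- name: xds_cluster
  connect_timeout: 0.25s
  type: STATIC
  lb_policy: ROUND_ROBIN
  http2_protocol_options: {}   # the xDS gRPC API requires HTTP/2
  hosts: [{ socket_address: { address: 127.0.0.1, port_value: 5678 }}]
```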
Static configuration
We can specify listeners, routing rules, and clusters using an Envoy configuration file. The following is a very simple Envoy configuration:
```yaml
static_resources:
  listeners:
  - name: httpbin-demo
    address:
      socket_address: { address: 0.0.0.0, port_value: 15001 }
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          stat_prefix: egress_http
          route_config:
            name: httpbin_local_route
            virtual_hosts:
            - name: httpbin_local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route:
                  auto_host_rewrite: true
                  cluster: httpbin_service
          http_filters:
          - name: envoy.router
  clusters:
  - name: httpbin_service
    connect_timeout: 5s
    type: LOGICAL_DNS
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts: [{ socket_address: { address: httpbin, port_value: 8000 }}]
```
In this simple Envoy configuration file, we declare a listener that opens a socket on port 15001 and attach a filter chain to it. The http_connection_manager filter uses the routing directives in the configuration (the simple routing directive seen here is a wildcard that matches all virtual hosts) and routes all traffic to the httpbin_service cluster. The last part of the configuration defines the connection properties of the httpbin_service cluster: we specify LOGICAL_DNS as the endpoint discovery type and ROUND_ROBIN as the load-balancing algorithm for communicating with the upstream httpbin service.
This is a simple configuration file that creates a listener for incoming traffic and routes all of it to the httpbin_service cluster, also specifying the load-balancing algorithm and the connection timeout to use.
You will notice that much of the configuration is explicitly specified: which listeners exist, what the routing rules are, which clusters we can route to, and so on. This is an example of a fully static configuration file.
For more information about these parameters, please refer to the Envoy documentation (www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/service_discovery#logical-dns). As mentioned earlier, Envoy can also configure its various settings dynamically. Below is an overview of Envoy's dynamic configuration and how Envoy uses the xDS APIs to achieve it.
Dynamic configuration
Envoy can leverage a set of APIs to perform configuration updates without any downtime or restarts. It needs only a simple bootstrap configuration file that points it at the correct discovery service APIs; the rest of the configuration is discovered dynamically. The APIs Envoy uses for dynamic configuration are commonly referred to as the xDS services and include the following:
- **Listener Discovery Service (LDS):** A mechanism that lets Envoy query entire listeners, dynamically adding, modifying, or removing known listeners through this API. Each listener must have a unique name; if none is provided, Envoy creates a UUID.
- **Route Discovery Service (RDS):** Envoy's mechanism for dynamically retrieving routing configuration, including HTTP header modifications, virtual hosts, and the individual routing rules contained in each virtual host. Each HTTP connection manager can independently obtain its own routing configuration through this API. RDS configuration is part of the listener discovery service (LDS), effectively a subset of LDS, determining where static versus dynamic configuration is used and which routes apply.
- **Cluster Discovery Service (CDS):** An optional API Envoy calls to dynamically retrieve the clusters it manages. Envoy reconciles cluster management based on the API responses, adding, modifying, or removing known clusters as needed. Clusters statically defined in the Envoy configuration cannot be modified or removed through the CDS API.
- **Endpoint Discovery Service (EDS):** A mechanism, based on gRPC or REST/JSON APIs, that lets Envoy fetch cluster members; it is a subset of CDS. Cluster members are called endpoints in Envoy terminology. For each cluster, Envoy retrieves its endpoints from the discovery service. EDS is the preferred service discovery mechanism.
- **Secret Discovery Service (SDS):** An API for distributing certificates. Its most important benefit is simplified certificate management. Without it, in a Kubernetes deployment certificates must be created as secrets and mounted into the Envoy proxy containers; if a certificate expires, the secret must be updated and the proxy container redeployed. With SDS, an SDS server pushes certificates to all Envoy instances, and if a certificate expires the server simply pushes the new certificate, which Envoy uses immediately without a redeploy.
- **Aggregated Discovery Service (ADS):** A single serialized stream of all the changes from the APIs above; you can use this one API to obtain all changes in order. ADS is not an xDS in the strict sense; rather, it provides an aggregation capability that delivers multiple synchronized xDS subscriptions over a single stream.
A configuration may use one of these services or a combination of several of them, not necessarily all of them. One thing to note is that Envoy's xDS APIs are built on the assumption that configuration will eventually converge. For example, Envoy might receive an RDS update whose new route directs traffic to a cluster that has not yet been added via CDS; routing errors may occur until the CDS update arrives. Envoy introduced the Aggregated Discovery Service (ADS) to address this problem, and Istio implements ADS and uses it to sequence proxy configuration changes.
For example, an Envoy proxy that dynamically discovers listeners can use the following configuration:
```yaml
dynamic_resources:
  lds_config:
    api_config_source:
      api_type: GRPC
      grpc_services:
      - envoy_grpc:
          cluster_name: xds_cluster
static_resources:
  clusters:
  - name: xds_cluster
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}
    hosts: [{ socket_address: { address: 127.0.0.3, port_value: 5678 }}]
```
With the above configuration, we no longer need to explicitly configure each listener in the configuration file. We tell Envoy to use the LDS API to discover the correct listener configuration at runtime. However, we still need to explicitly configure one cluster: the cluster where the LDS API itself lives, namely the xds_cluster defined in this example.
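Returning to the convergence problem described earlier, a hedged sketch of a bootstrap that delivers LDS and CDS over a single ADS stream might look like the following; the xds_cluster address is an assumption:

```yaml
dynamic_resources:
  ads_config:
    api_type: GRPC
    grpc_services:
    - envoy_grpc:
        cluster_name: xds_cluster
  cds_config: { ads: {} }      # deliver CDS over the ADS stream
  lds_config: { ads: {} }      # deliver LDS over the ADS stream
static_resources:
  clusters:
  - name: xds_cluster
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}   # ADS is a gRPC API and requires HTTP/2
    hosts: [{ socket_address: { address: 127.0.0.3, port_value: 5678 }}]
```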
Building on a static configuration in this way, each discovery service incrementally supplies its part of the overall configuration.
This article is excerpted from “Istio Service Mesh Technology Analysis and Practice”, written by Alibaba Cloud senior technical expert Wang Xining. The book explains Istio's fundamentals and development practice in detail, with a large number of selected cases and downloadable reference code to get you started quickly with Istio development. Gartner believes that by 2020 the service mesh will be standard technology in all leading container management systems. The book is suitable for any reader interested in microservices and cloud native, and we recommend reading it in depth.