Original article: kublr.com/blog/implem…

This article reprinted from: www.servicemesher.com/blog/hands-…

For more Service Mesh articles, visit www.servicemesher.com

This article is part 1 of this tutorial.

If you haven’t heard of Service Mesh before, don’t worry. Although it is still relatively new in terms of available documentation, public discussion, and GitHub activity, and, like container-based and microservice architectures before it, not yet widely adopted, it will have a profound impact on software architecture. This article will help you understand the basics of Service Mesh and show, through a tutorial, how to implement it and benefit from this infrastructure.

The two main goals of a Service Mesh are to gain insight into previously invisible layers of service communication and to gain complete control over the communication logic between all microservices: dynamic service discovery, load balancing, timeouts, failover, retries, circuit breaking, distributed tracing of call chains, and enforcement of security policies. For more details, see Istio traffic auditing and distributed tracing.

Kubernetes already has a rudimentary “Service Mesh” out of the box. Its “Service” resource provides service discovery and load balancing of requests across the target pods. A “Service” works by managing iptables rules on each host in the cluster, which allows only round-robin load balancing, with no retry or failover logic and none of the other capabilities we might want a modern Service Mesh to handle. However, deploying a full-featured Service Mesh system (Linkerd, Istio, or Conduit) into the cluster gives you the following possibilities:

  • Allow services to speak plain HTTP at the application layer instead of HTTPS: the Service Mesh proxy wraps outgoing traffic in HTTPS on the sending side and terminates TLS on the receiving side, so application components can use plain HTTP, gRPC, or other protocols without worrying about encryption in transit. The proxies handle encryption for the application.
  • Enforce security policies: the proxies know which services may access which other services and endpoints, and reject unauthorized traffic.
  • Circuit breaking: back off from an overloaded service or endpoint showing high latency, preventing more requests from being sent to it.
  • Latency-aware load balancing: instead of round-robin load balancing (which ignores each target’s latency), use smarter load balancing based on the response time of each backend target. This is a very important feature of a modern service mesh.
  • Load balancing by queue depth: route the current request to the least busy target. The Service Mesh knows exactly which requests have been sent and which are in flight or completed, so it sends each new incoming request to the backend with the smallest queue.
  • Request routing: route requests carrying a specific HTTP header to a specific set of backends behind the load balancer. This enables easy canary deployment tests and other creative use cases, and is one of the most powerful features a Service Mesh provides.
  • Health checks, retry budgets, and eviction of misbehaving backends.
  • Metrics and tracing: report request volume, latency metrics, success rates, and error rates for each target.
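To make capabilities such as timeouts and retries concrete, here is a hedged sketch in Istio’s networking.istio.io/v1alpha3 API (the same API used later in this tutorial). The “reviews” service name and the specific values are hypothetical, chosen only for illustration:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews                # hypothetical service name, for illustration only
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
    timeout: 2s                # fail any request that takes longer than 2 seconds
    retries:
      attempts: 3              # retry a failed request up to 3 times...
      perTryTimeout: 500ms     # ...allowing each attempt half a second
```

None of this logic lives in the application code; the Envoy sidecars apply it to every request.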

There are two ways to deploy a Service Mesh:

As a host-shared proxy, a DaemonSet in Kubernetes terminology. If there are many containers on the same host, this kind of deployment uses fewer resources and can leverage connection pooling to improve throughput. However, a failure in one proxy will take down the whole fleet of containers on that host, rather than a single service (as would be the case with a sidecar proxy).

As a sidecar container, the proxy is injected into each pod definition and runs alongside the main service. With a more “heavyweight” proxy like Linkerd, this deployment adds about 200MB of memory per pod; with the newer Conduit, it is only about 10MB per pod. Conduit doesn’t have all of Linkerd’s features yet, so we haven’t seen a final comparison of the two. In general, “one sidecar per pod” is a good choice, because it confines a proxy failure to a single pod without affecting other pods on the same host.

Why was the Service Mesh architecture created? Let’s look at two diagrams of different types of application architectures to illustrate requirements.

The first example is an older MVC-based web service that runs as an all-in-one monolithic application. It may serve millions of requests per day, but it has no complex functionality, and the communication of its underlying services is straightforward: Nginx balances all traffic across Apache instances, and Apache fetches data from the database or file store and returns the requested page. This architecture does not benefit much from a service mesh. Because the monolith makes no service-to-service calls, all functions are coupled together, and developers write no code to handle routing and communication between services. In a monolithic application, all the core components run on the same machine, do not communicate over the network, and expose no REST API or gRPC. All of the “business logic” lives in one application, deployed as a whole on each Apache web server.

The second example is an application based on a modern microservices architecture with a lot of processes and behind-the-scenes logic. It does many things, like learning visitor patterns and preferences to personalize their experience on the site, notifying users of updates on their favorite topics, and more. You can imagine the many complex interactions that occur between all these microservices, spread over thousands of containers and hundreds of nodes. Please note that our illustration is very simplified; we have omitted many details of the real architecture of a large cloud-native application.

In this example application, each of our microservices contains some code to handle communication with the others: retry policies, timeouts, exception handling in case of network failure, and so on. We also see a polyglot environment in which different teams develop their own service components using Scala, Golang, Node.js, or Python, and all components communicate with each other through REST APIs or gRPC. Each team spends time and effort implementing communication logic in its own component, in its own language of choice, so the teams cannot share each other’s libraries and functions, which could otherwise save time by pulling one unified solution into every component of the application as a dependency. On top of that, there is querying a service discovery mechanism (such as Consul or ZooKeeper), reading configuration passed to the application from outside, and reporting latency and response metrics to Prometheus/InfluxDB. This includes response times for caches (Redis or memcached), which usually sit on other nodes or as an entirely separate cluster and can become overloaded and introduce high latency. Alongside exploding logs and approaching deadlines, all of this is part of the service code and needs to be maintained. Developers do not want to spend time on the operational parts of the code, such as adding distributed tracing and monitoring metrics (nobody is a fan of that kind of troubleshooting and analysis) or handling possible network failures with fallbacks and retry budgets.

In this environment, a Service Mesh saves development time and allows centralized control of communication in a unified manner. So how do we turn this communication layer into a unified “Service Mesh”? We take the code for inter-service communication, routing, service discovery, latency metrics, request tracing, and the like, pull it out of the services entirely, and run a separate process alongside each microservice that handles this common logic, and more, on its behalf. Fortunately, these tools already exist: companies like Twitter, Lyft, and Netflix have open-sourced their own tools, and other contributors have built on these libraries. So far we have Linkerd, Conduit, Istio, and Envoy to choose from. Istio is a control plane built on top of Envoy; both Envoy and Linkerd can serve as the proxy in its data plane. The control plane lets cluster operators set specific settings centrally and then distribute them across the data-plane proxies to reconfigure them.

Linkerd and Conduit were developed by Buoyant, a group of engineers who used to work at Twitter. Linkerd is currently one of the most widely used Service Meshes, and Conduit is a lightweight sidecar built from the ground up specifically for Kubernetes, which is very fast and well suited to the Kubernetes environment. At the time of writing, Conduit is still in active development.

Let’s take a look at the change from application-dependent communication logic to the “Service Mesh” architecture.

Most notably, all agents can be configured and updated in the same place, and we can configure specific rules across thousands of agents through their control plane (or through configuration files in some repository, depending on the tool chosen and deployment method). So routing, load balancing, metric collection, security policy enforcement, circuit breakers, data transmission encryption, all follow a strict set of rules applied by the cluster administrator.
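As one hedged illustration of such centrally applied rules, a DestinationRule can attach a circuit-breaking traffic policy to a service; the resource name and the thresholds below are assumptions for illustration, not part of this tutorial’s deployment files:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: website-circuit-breaker    # illustrative name, not used later in this tutorial
spec:
  host: website
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 100   # queue at most 100 pending HTTP requests
    outlierDetection:
      consecutiveErrors: 5             # eject an endpoint after 5 consecutive errors
      interval: 10s                    # re-evaluate endpoints every 10 seconds
      baseEjectionTime: 30s            # keep an ejected endpoint out for at least 30s
```

Once applied by the cluster administrator, every sidecar proxy calling this service enforces the same policy, with no application code changes.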

Is Service Mesh right for you?

At first glance, this new concept of separating the microservice communication mechanism into its own architectural layer raises a question: is it worth configuring and maintaining a whole fleet of complex proxies? To answer it, you need to estimate the size and complexity of your application. If you only have a few microservices and data store endpoints (for example, one ElasticSearch cluster for logs, one Prometheus cluster for metrics, and a database with two or three main application services), a service mesh may be unnecessary for your environment. However, if your application components are distributed over hundreds or thousands of nodes and comprise 20+ microservices, you will benefit greatly from using a Service Mesh.

Even in smaller environments, if you want to separate retry and circuit-breaking behavior from the application itself (for example, the code that manages connections and backs off to avoid retries overloading other services or the database), you can use a service mesh to lift this network-logic maintenance burden from your application developers. They can then focus on business logic rather than getting bogged down in managing and tuning the communication between all the microservices.

Once the service mesh is configured, the operations team can tune it centrally, minimizing the effort spent on application component communication.

Istio is a perfect example of a Service Mesh that centralizes all of these features. It has several control-plane components that manage all of the “data plane” proxies (these proxies can be Envoy or Linkerd; by default they are Envoy, which is what we will use in this tutorial, as Linkerd integration is still a work in progress).

The following is a diagram of the Istio architecture on the official website:

Istio-auth has since been renamed Citadel.

You can read more in the official documentation, but for the purposes of this tutorial, here is a summary of Istio components and their capabilities:

Control plane

  • Pilot: Provides routing rules and service discovery information to an Envoy proxy.
  • Mixer: Collects telemetry from each Envoy proxy and enforces access control policies.
  • Citadel: Provides service-to-service and user-to-service authentication and can upgrade unencrypted traffic to TLS. Access to audit information (a work in progress) will be available soon.

The data plane

  • Envoy: A feature-rich proxy managed by the control plane components. It intercepts traffic to and from the service and applies the required routing and access policies according to the rules set in the control plane.

The tutorial

In the following tutorial, we will use Istio to demonstrate one of its most powerful features: routing on request content. As mentioned earlier, it routes requests that carry selected HTTP headers to specific targets, something only a layer-7 proxy can do. No layer-4 load balancer or proxy can implement this functionality.
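As a sketch of what header-based routing looks like in Istio’s v1alpha3 API, the following hypothetical VirtualService sends requests carrying a particular header to one backend version and everything else to another. The resource itself is not deployed in this tutorial, and the header name x-version is an assumption chosen for illustration:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: website-by-header        # illustrative resource, not part of this tutorial's files
spec:
  hosts:
  - "*"
  gateways:
  - website-gateway
  http:
  - match:
    - headers:
        x-version:               # hypothetical header name chosen for this sketch
          exact: v2
    route:
    - destination:
        host: website
        subset: version-2        # requests carrying "x-version: v2" go to v2
  - route:
    - destination:
        host: website
        subset: version-1        # everyone else stays on v1
```

Match rules are evaluated in order, so the catch-all route must come last.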

For this tutorial, we assume that you have a running Kubernetes cluster (tip: you can follow these instructions to start a new cluster in a few minutes, or set up a local cluster in a few simple steps using “Kublr-in-a-Box”). A small cluster with 1 master node and 2 worker nodes should suffice.

Tutorial Stage 1: Install the Istio control plane

Install the control plane into the Kubernetes cluster by following the official tutorial. The installation steps depend on your local environment (Windows, Linux, or macOS), so we cannot reproduce the setup with a single set of commands. We use two CLI tools, istioctl and kubectl, to manage Kubernetes and Istio. Follow the brief summary below to install them (and fall back to the official step-by-step instructions if it doesn’t work):

  1. Set up the Kubernetes cluster (use the method listed above, or use your existing test/development cluster)

  2. Download Kubectl and configure it to the environment (use it to manage your Kubernetes environment)

  3. Download istioctl and add it to your PATH (use it to inject an Envoy proxy into each pod and to set routing and policies). Here are brief installation instructions:

(1) On macOS or Linux, run:

curl -L https://git.io/getLatestIstio | sh -

(2) On Windows, download istio.zip, unzip it, and add the extracted path to your environment variables

(3) Change into the extracted directory and run:

kubectl apply -f install/kubernetes/istio-demo.yaml

Another way to set up your Kubernetes cluster environment is to use Kublr, a simple way to bring up a Kubernetes cluster on a cloud provider (Alibaba Cloud, Tencent Cloud, AWS, Azure, GCP, or via Quick Start).

Copy the %USERPROFILE%/.kube/config file to your host directory (~/.kube/config) and go to the following page:

Log in to the Kubernetes Dashboard using the administrator account and password from the configuration file. You should see the dashboard. Click on the default namespace in the sidebar:

Istio components will be installed into their own namespace. Change into the Istio download directory and run the following command:

kubectl apply -f install/kubernetes/istio-demo.yaml

You will see the list of components created; see the official documentation for details, or open the YAML file to inspect them, where each resource is documented. We can then browse the namespace and see that everything has been created successfully:

Click istio-system while components are being created to check for errors or issues; it should look something like this:

You can see that there are about 50 events; scroll through the screen to see the “success” statuses and note any errors. If there is an error, you can submit an issue on GitHub.

We need to find the istio-ingress service endpoint to understand where traffic is sent. Go back to the sidebar of the Kubernetes Dashboard and switch to the istio-system namespace. If it is not visible right after creation, refresh the browser and try again. Click “Services” and find the external endpoint, as shown below:

In our example, this is an AWS Elastic Load Balancer, but you might see an IP address instead, depending on your cluster settings. We will use this endpoint address to access our demo web service.

Tutorial Stage 2: Demonstrate Web services using Envoy Sidecar deployment

This is the most fun part of the tutorial. Let’s examine the routing capabilities of the Service Mesh. First we will deploy our demo web service in three versions, each rendering a different color. Copy the following into a file named my-websites.yaml.

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: web-v1
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: website
        version: website-version-1
    spec:
      containers:
      - name: website-version-1
        image: aquamarine/kublr-tutorial-images:v1
        resources:
          requests:
            cpu: 0.1
            memory: 200Mi
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: web-v2
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: website
        version: website-version-2
    spec:
      containers:
      - name: website-version-2
        image: aquamarine/kublr-tutorial-images:v2
        resources:
          requests:
            cpu: 0.1
            memory: 200Mi
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: web-v3
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: website
        version: website-version-3
    spec:
      containers:
      - name: website-version-3
        image: aquamarine/kublr-tutorial-images:v3
        resources:
          requests:
            cpu: 0.1
            memory: 200Mi
---
apiVersion: v1
kind: Service
metadata:
  name: website
spec:
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  selector:
    app: website

Note the presence of the label “app” (which is used for request tracing), and that the value of “spec.ports.name” in the Service must be named correctly (http, http2, grpc, redis, mongo) when your pods run with Envoy proxies; otherwise Envoy will proxy those services as plain TCP, and you will not be able to use L7 routing for them. Also, a pod should serve only a single service in the cluster. As you can see from the file, there are three versions of this service (v1/v2/v3), with one Deployment per version.

Now let’s add the Envoy proxy configuration for these pods. The “istioctl kube-inject” command generates a new YAML file, augmented with the Envoy proxy components, for kubectl to deploy. Run the command:

 istioctl kube-inject -f my-websites.yaml -o my-websites-with-proxy.yaml

The output file contains additional configuration; you can inspect the my-websites-with-proxy.yaml file. The command used the predefined ConfigMap istio-sidecar-injector (created during the Istio installation) and added the required sidecar configuration and parameters to our Deployment definitions. When we deploy the new file my-websites-with-proxy.yaml, each pod will run two containers: one for our demo application and one for the Envoy proxy. Run the following command to deploy our service with the sidecars:

kubectl create -f my-websites-with-proxy.yaml

If it works as expected, you should see this output:

deployment "web-v1" created
deployment "web-v2" created
deployment "web-v3" created
service "website" created

Let's inspect the pods to see that the Envoy sidecar is present:

kubectl get pods

We can see that each pod has two containers, a website container and a proxy Sidecar:

We can view the Envoy run log by executing the following command:

kubectl logs <your pod name> istio-proxy

You’ll see a lot of output, and the last few lines look something like this:

add/update cluster outbound|80|version-1|website.default.svc.cluster.local starting warming
add/update cluster outbound|80|version-2|website.default.svc.cluster.local starting warming
add/update cluster outbound|80|version-3|website.default.svc.cluster.local starting warming
warming cluster outbound|80|version-3|website.default.svc.cluster.local complete
warming cluster outbound|80|version-2|website.default.svc.cluster.local complete
warming cluster outbound|80|version-1|website.default.svc.cluster.local complete

This means the sidecar is up and running in the pod.

Now we need to deploy the minimal Istio configuration resources required to route traffic to our Service and pods. Save the following to a file named website-routing.yaml.

---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: website-gateway
spec:
  selector:
    # Which pods we want to expose as Istio router
    # This label points to the default one installed from file istio-demo.yaml
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    # Here we specify which Kubernetes service names
    # we want to serve through this Gateway
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: website-virtual-service
spec:
  hosts:
  - "*"
  gateways:
  - website-gateway
  http:
  - route:
    - destination:
        host: website
        subset: version-1
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: website
spec:
  host: website
  subsets:
  - name: version-1
    labels:
      version: website-version-1
  - name: version-2
    labels:
      version: website-version-2
  - name: version-3
    labels:
      version: website-version-3

This file defines a Gateway, a VirtualService, and a DestinationRule. These are custom Istio resources that manage and configure the ingress behavior of the istio-ingressgateway pods. We will describe them in more depth in the next tutorial, which will explain the technical details of Istio configuration. For now, deploy these resources to make our sample site accessible:

kubectl create -f website-routing.yaml

The next step is to visit our demo website. We deployed three versions, each displaying different page text and colors, but for now we can only access V1 through Istio ingress. Let’s access our service to make sure the Web service is deployed.

View the external endpoint by running the following command:

kubectl get services istio-ingressgateway -n istio-system

Or find it by browsing to the istio-ingressgateway service, as shown below (we also saw it at the beginning of this tutorial):

Click it to open the external endpoint. You might see multiple links, because one points to the load balancer’s HTTPS port and another to its HTTP port. If so, just use the HTTP link, since we didn’t set up TLS for this tutorial, and you should see the v1 page of the demo site:

The Istio VirtualService we deployed for our demo explicitly tells Envoy to route all traffic for the website Kubernetes Service to version v1 (without Envoy routing, Kubernetes would round-robin requests across all three versions of the pods). You can change which version of the site you see by changing the following part of the VirtualService configuration and redeploying it:

  http:
  - route:
    - destination:
        host: website
        subset: version-1

The subset field selects which DestinationRule subset to route to. We will dig deeper into these resources in the next tutorial.
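For example, redeploying the VirtualService with the subset switched to version-2 routes all traffic to the second version of the site (apply it with kubectl apply -f to take effect):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: website-virtual-service
spec:
  hosts:
  - "*"
  gateways:
  - website-gateway
  http:
  - route:
    - destination:
        host: website
        subset: version-2    # switched from version-1 to show v2 of the site
```

The subsets themselves (version-1/2/3) are the ones defined by the DestinationRule in website-routing.yaml.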

Typically, a new version of an application needs to be tested with a small share of traffic (a canary deployment). The vanilla Kubernetes approach is to create a second Deployment using a new Docker image and the same pod label, so the Service routes traffic to both. This is not as flexible as the Istio solution: you cannot easily redirect exactly 10% of your traffic to the new Deployment (to achieve a precise 10%, you have to maintain the pod replica ratio between the two Deployments according to the desired percentage, such as 9 “v1 pods” and 1 “v2 pod”, or 18 “v1 pods” and 2 “v2 pods”), and you cannot use HTTP headers to route requests to a particular version.
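With Istio, by contrast, a weighted canary is a small change to the VirtualService, independent of replica counts. The sketch below assumes the website service and subsets defined earlier in this tutorial; the 90/10 split is illustrative:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: website-virtual-service
spec:
  hosts:
  - "*"
  gateways:
  - website-gateway
  http:
  - route:
    - destination:
        host: website
        subset: version-1
      weight: 90             # 90% of requests keep hitting v1
    - destination:
        host: website
        subset: version-2
      weight: 10             # 10% canary traffic to v2, regardless of pod counts
```

Adjusting the weights and reapplying the resource shifts traffic gradually, with no Deployment changes at all.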

In our next article, practicing canary deployments with Istio, we will route requests carrying custom HTTP headers to the desired version of the service. This gives us complete control over the traffic, and we will analyze distributed tracing results in the Zipkin dashboard.

The deployment of Istio relies heavily on Kubernetes. To better understand how Kubernetes works, I recommend the Geek Time course “An In-Depth Analysis of Kubernetes” by Zhang Lei.

ServiceMesher community information

Wechat group: Contact me to join the group

Community official website: www.servicemesher.com

Slack: servicemesher.slack.com (invitation required to join)

Twitter: twitter.com/servicemesh…

GitHub:github.com/servicemesh…

For more Service Mesh content, follow the WeChat official account ServiceMesher.