Under the banner of digital transformation, one of the big changes in IT is the decomposition of large monolithic applications into microservice architectures: small, discrete units of functionality that run in containers. Because each container packages all of a service's code and dependencies in isolation, it can easily be migrated from one server to another.

Containerized architectures like this are easy to scale and run in the cloud, and each microservice can be iterated on and rolled out quickly. However, as applications grow larger and run multiple instances of each service simultaneously, communication between microservices becomes more complex. This is the problem the Service Mesh addresses: an emerging architectural pattern that aims to connect these microservices while reducing administrative and programming overhead.

What is a Service Mesh?


The most widely accepted definition of a Service Mesh is that it is a way of controlling how the different parts of an application share data with one another. That description is broad enough to cover every aspect of a Service Mesh, but on its own it sounds much like the middleware most developers know from client-server applications.

What makes a Service Mesh distinctive is that it is built for the unique nature of distributed microservice environments. A large-scale application built on microservices may run many instances of each service across local and cloud servers. All of these moving parts make it difficult for an individual microservice to find the other services it needs to communicate with. A Service Mesh handles discovery and connection automatically and almost instantly, so neither developers nor individual microservices have to do the matching themselves.

We can think of a Service Mesh as the Layer 7 equivalent of software-defined networking (SDN). Just as SDN creates an abstraction layer so that network administrators don't have to deal with physical network connections, a Service Mesh decouples the applications you interact with from the underlying infrastructure they communicate over.

As developers began to grapple with the problems of truly large distributed architectures, the concept of the Service Mesh emerged. The first project in this area was Linkerd, which started as an offshoot of an internal project at Twitter. Istio is another popular Service Mesh project, which originated with Google, IBM, and Lyft and is now supported by many enterprises.

Service Mesh load balancing


One of the key functions of a Service Mesh is load balancing. We often think of load balancing as a network function: to prevent any one server or network link from being overwhelmed by traffic, you route packets accordingly. A Service Mesh does a similar thing at the application level.

Essentially, one of the jobs of a Service Mesh is to track which instances of the various microservices distributed across the infrastructure are the "healthiest." It may probe them to see how they are performing, or track which instances respond slowly to service requests and direct subsequent requests to other instances. The Service Mesh does a similar job for network routing: if it finds that messages are taking too long to arrive, it compensates by taking other routes. Such slowdowns may stem from problems with the underlying hardware, or simply from a service being overloaded with requests or unable to handle them. Either way, the Service Mesh finds another instance of the same service and routes traffic to it in place of the slow-responding instance, making efficient use of the resources of the whole application.
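As a concrete illustration, here is roughly how such a policy can be expressed in Istio (one of the projects discussed below). This is a minimal sketch: the service name and thresholds are hypothetical, and other meshes express the same idea differently.

```yaml
# Hypothetical Istio DestinationRule: favor less-busy instances and
# temporarily eject instances that keep failing.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: orders
spec:
  host: orders.default.svc.cluster.local  # the service being called
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST        # prefer instances with fewer requests in flight
    outlierDetection:
      consecutive5xxErrors: 5      # eject an instance after 5 consecutive errors
      interval: 30s                # how often instances are evaluated
      baseEjectionTime: 60s        # how long an ejected instance sits out
      maxEjectionPercent: 50       # never eject more than half the pool
```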

Service mesh vs Kubernetes


If you're at all familiar with container-based architecture, you might be wondering where Kubernetes, the popular open source container orchestration platform, fits into this picture. After all, isn't Kubernetes already managing how your containers communicate with each other? You can think of the Kubernetes "Service" resource as a very basic Service Mesh, because it provides service discovery and round-robin balancing of requests. But a full Service Mesh provides much richer capabilities, such as managing security policies and encryption, routing requests away from slow-responding instances, and the application-level load balancing described above.
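For reference, here is what that very basic building block looks like; a minimal sketch with a hypothetical service name:

```yaml
# A plain Kubernetes Service: DNS-based discovery plus simple
# load balancing across all Pods matching the selector.
apiVersion: v1
kind: Service
metadata:
  name: orders            # other workloads can now reach http://orders
spec:
  selector:
    app: orders           # traffic is spread across Pods with this label
  ports:
    - port: 80            # the port clients call
      targetPort: 8080    # the port the container actually listens on
```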

Keep in mind that most service meshes do require an orchestration system like Kubernetes. A Service Mesh is an extension of an orchestration platform, not a replacement for it.

Service Mesh vs API gateway


Each microservice provides an API that other services use to communicate with it. This raises the question of how a Service Mesh differs from more traditional forms of API management, such as API gateways. An API gateway sits between a set of microservices and the "outside" world, routing service requests as needed so that the requester can complete its request without knowing it is working with a microservices-based application. A Service Mesh, by contrast, mediates requests inside a microservice application, among components that are fully aware of their environment.

Put another way, a Service Mesh optimizes east-west traffic within the cluster, while an API gateway optimizes north-south traffic into and out of the cluster. But the Service Mesh is still in its early stages and still evolving, and many service meshes, including Linkerd and Istio, now provide north-south functionality as well.
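As one sketch of that north-south functionality, Istio models the cluster edge with a Gateway resource; the resource name and hostname below are hypothetical:

```yaml
# Hypothetical Istio Gateway: admits HTTP traffic from outside the
# cluster (north-south) so the mesh can route it to internal services.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: public-gateway
spec:
  selector:
    istio: ingressgateway     # binds to Istio's ingress gateway pods
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "shop.example.com"  # external hostname served at the edge
```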

Service mesh architecture


The concept of a Service Mesh is relatively new, and a number of different approaches have been taken to the problem it solves, namely managing microservice communication. At present, three possible locations for the communication layer that a Service Mesh creates have emerged:

  • A library imported by each microservice
  • A node agent that provides services to all containers on a particular node
  • A sidecar container that runs alongside the application container


The sidecar-based pattern is currently the most popular, so much so that it has become almost synonymous with the Service Mesh itself. That is not strictly accurate, but the sidecar has attracted the most attention, and we'll examine this architecture in more detail below.

Sidecar


What does it mean for a sidecar container to run alongside your application container? In this type of Service Mesh, every microservice container is paired with a corresponding proxy container. Everything the service needs for inter-service communication is abstracted out of the microservice and put into the sidecar.

This may seem complicated, since it effectively doubles the number of containers in your application. But the design pattern is key to simplifying distributed applications: by putting all of the networking and communication code into a separate container, you make it part of the infrastructure and free developers from implementing it as part of the application.

Essentially, what you're left with is a microservice focused on business logic. The microservice doesn't need to know how to communicate with all the other services in its environment; it only needs to know how to talk to its sidecar, and the sidecar does the rest.
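In Kubernetes terms, the pattern looks roughly like this. It is a hand-written sketch: in practice the mesh usually injects the proxy container automatically, and the image names here are only illustrative.

```yaml
# One Pod, two containers: the microservice plus its sidecar proxy.
# Traffic in and out of the app container is relayed through the proxy.
apiVersion: v1
kind: Pod
metadata:
  name: orders
spec:
  containers:
    - name: app                        # business logic only
      image: example/orders:1.0        # hypothetical application image
      ports:
        - containerPort: 8080
    - name: proxy                      # the sidecar, e.g. Envoy
      image: envoyproxy/envoy:v1.27.0  # handles all inter-service communication
      ports:
        - containerPort: 15001
```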


Notable Service Mesh projects: Linkerd, Envoy, Istio, Consul


So, with all that said, which service meshes are actually available? At the moment there are no fully off-the-shelf commercial products in this area. Most service meshes are open source projects that take some installation and operational work to put into practice. The best-known projects are:

  • Linkerd: Released in 2016, it is the oldest of these projects. Linkerd was spun out of a library developed at Twitter. Conduit, another lightweight player in this area, was merged into the Linkerd project and forms the basis of Linkerd 2.0.


  • Envoy: Created by Lyft, Envoy is a high-performance proxy that fills the "data plane" role; it must be paired with a control plane to provide a full service mesh.


  • Istio: Developed by Google, IBM, and Lyft, Istio can add features like load balancing and authentication to microservices without modifying their source code. It controls all traffic by directing proxies such as Envoy. In addition, Istio provides fault tolerance, canary deployments, A/B testing, monitoring, and support for custom components and integrations. Support for Istio began in Rancher 2.3 Preview 2: users can start Istio directly from the UI and enable automatic sidecar injection for each namespace (see the sketch after this list). Rancher also comes with a built-in, Kiali-enabled dashboard that simplifies Istio installation and configuration. All of this makes deploying and managing Istio simple and fast.


  • HashiCorp Consul: Starting with Consul 1.2, a feature called Connect adds service encryption and identity-based authorization to HashiCorp's distributed system for service discovery and configuration.
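As mentioned in the Istio entry above, enabling per-namespace automatic sidecar injection comes down to a single label on the namespace; a minimal sketch with a hypothetical namespace name:

```yaml
# Labeling a namespace so Istio's admission webhook automatically
# injects the Envoy sidecar into every Pod created in it.
apiVersion: v1
kind: Namespace
metadata:
  name: demo                   # hypothetical namespace
  labels:
    istio-injection: enabled   # Istio injects sidecars into Pods here
```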


Which Service Mesh is right for you? A comprehensive comparison is beyond the scope of this article, but all of these products have been proven in large, demanding environments. Currently, Linkerd and Istio have the richest feature sets, but things are still evolving rapidly, and it's too early to call a winner.