Anjul Sahu


Service Mesh - Linkerd vs Istio

Posted at — Jun 22, 2020

This blog post was updated on 09-March-2021.

The CNCF annual survey of 2020 makes it clear that many people are highly interested in using a service mesh in their projects, and many already run one in production. Nearly 69% are evaluating Istio, and 64% are evaluating Linkerd. Both projects are cutting edge and very competitive, which makes it tough to choose between them. In this blog post, we will learn about the Istio and Linkerd architectures and their moving parts, and compare the Linkerd and Istio offerings to help you make an informed decision.

CNCF Service Mesh Survey

What is a Service Mesh?

Over the past few years, microservices architecture has become an increasingly popular style for designing software applications. In this architecture, we break down the application into independently deployable services. The services are usually lightweight, polyglot in nature, and often managed by various functional teams. This architecture style works well up to a point: once the number of services grows large, they become difficult to manage and are no longer simple. This leads to challenges in managing various aspects like security, network traffic control, and observability. A service mesh helps address these challenges.

The term service mesh is used to describe the network of microservices that make up such applications and the interactions between them. As the number of services grows in size and complexity, the network becomes harder to scale and manage. A service mesh typically offers service discovery, load balancing, failure recovery, metrics, and monitoring. It often also has to meet more complex operational requirements, like A/B testing, canary rollouts, rate limiting, access control, and end-to-end authentication. A service mesh provides an easy way to create a network of services with load balancing, service-to-service authentication, monitoring, and more, with few or no changes to service code.

Let’s go through the architecture of Istio and Linkerd. Note that both projects are evolving fast and this article is based on Istio version 1.9.1 and Linkerd version 2.10.

What is Istio?

Istio is an open-source platform that provides a complete service mesh solution, with a uniform way to secure, connect, and monitor microservices. It is backed by industry leaders like IBM, Google, and Lyft. Istio is one of the most popular solutions, with advanced offerings suitable for enterprises of all sizes. It is a first-class citizen of Kubernetes and is designed as a modular, platform-independent system. For a quick demo of Istio, please refer to our previous post.

Let’s look at Istio Architecture

An Istio mesh is logically split into a data plane and a control plane.

  • The data plane is composed of Envoy proxies deployed as sidecars. These proxies mediate and control all network communication between microservices and also collect telemetry on all mesh traffic.
  • The control plane manages and configures the proxies to route traffic.

Architecture

What is Envoy?

Envoy is a high-performance proxy, written in C++ and originally developed at Lyft, that mediates all inbound and outbound traffic for all services in the service mesh. It is deployed as a sidecar proxy alongside the service.

Envoy SideCar

Envoy provides the following features:

  • Dynamic service discovery
  • Load balancing
  • TLS termination
  • HTTP/2 and gRPC proxies
  • Circuit breakers
  • Health checks
  • Staged rollouts with percentage-based traffic split
  • Fault injection
  • Rich metrics
  • Pluggable extensions model based on WebAssembly that allows for custom policy enforcement and telemetry generation for mesh traffic.
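To make the percentage-based traffic split in the list above more concrete, here is a minimal Python sketch of the idea. It only illustrates weighted selection; it is not how Envoy implements it, and the backend names and the 90/10 weights are hypothetical.

```python
import random
from collections import Counter


def pick_backend(weights, rng):
    """Pick one backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical canary rollout: roughly 90% of requests to v1, 10% to v2.
split = {"reviews-v1": 90, "reviews-v2": 10}

rng = random.Random(42)  # seeded only so the demo is repeatable
counts = Counter(pick_backend(split, rng) for _ in range(10_000))
print(counts)
```

Shifting the weights step by step (for example 90/10, then 50/50, then 0/100) is what turns this into a staged rollout.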

In newer versions of Istio, the sidecar proxy has taken over the responsibilities that Mixer used to have. In previous releases of Istio (<1.6), Mixer was used to collect telemetry information from the mesh.

Istiod provides service discovery, configuration, and certificate management. It includes Pilot, Citadel, and Galley.

Pilot

Pilot provides service discovery for the sidecar proxies, traffic management capabilities, and resiliency. It converts high-level routing rules that control traffic behavior into Envoy-specific configurations.

Citadel

Citadel enables strong service-to-service and end-user authentication with built-in identity and credential management. It can enable authorization and zero-trust security in the mesh.

Galley

Galley is Istio’s configuration validation, ingestion, processing, and distribution component.

Core Features

  • Traffic Management: Intelligent traffic routing rules, flow control, and management of service-level properties like circuit breakers, timeouts, and retries. It lets us easily set up A/B testing, canary rollouts, and staged rollouts with percentage-based traffic splits.
  • Security: Provides secure communication channels between services and manages authentication, authorization, and encryption at scale.
  • Observability: Robust tracing, monitoring, and logging features provide deep insights and visibility. This helps with efficient issue detection and resolution. Istio also has add-on infrastructure services that support the monitoring of microservices. Istio integrates with applications such as Prometheus, Grafana, Jaeger, and the service mesh dashboard Kiali.
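The circuit breakers mentioned under Traffic Management follow a general pattern that is easy to sketch. The Python below is a generic illustration of that pattern, not Istio's or Envoy's implementation; the class name, thresholds, and the injectable clock are assumptions made for the example.

```python
import time


class CircuitBreaker:
    """Generic circuit breaker: stop calling a failing service for a while."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # failures before the circuit opens
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def allow(self):
        """Return True if a request may be sent right now."""
        if self.opened_at is None:
            return True
        # Half-open: after the cool-down, let a trial request through.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        """A call succeeded: close the circuit and forget past failures."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        """A call failed: open (or re-open) the circuit once over the limit."""
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()
```

A caller would check `allow()` before each request and report the outcome with `record_success()` or `record_failure()`. In a mesh, the sidecar proxy applies this kind of policy so the application code does not have to.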

What is Linkerd?

Linkerd is an open-source, lightweight service mesh designed for Kubernetes by Buoyant. Linkerd was initially written in Scala, running on the JVM, and was then completely rewritten, with the proxy in Rust, to make it ultralight and performant. Similar to other service meshes, it provides runtime debugging, observability, reliability, and security without requiring code changes in your distributed application.

Let’s look at Linkerd architecture

Linkerd has three components: a UI, a data plane, and a control plane. It works by installing lightweight transparent proxies next to each service instance.

Linkerd Control Plane

Control Plane

The control plane is a set of services that provide the core functionality of the mesh. It aggregates telemetry data, provides a user-facing API, and provides control data to the data plane proxies. Below are the components of the control plane.

  • Controller: It consists of a public API container that provides an API for CLI and Dashboard.
  • Destination: Each proxy in the data plane queries this component to look up where to send requests. It holds the service profile information used for per-route metrics, retries, and timeouts.
  • Identity: It provides a Certificate Authority that accepts CSRs from proxies and returns certificates signed with the correct identity. It provides mTLS functionality.
  • Proxy Injector: It is an admission controller that looks for the annotation linkerd.io/inject: enabled and mutates the pod specification to add both an initContainer and a sidecar containing the proxy itself.
  • Service Profile Validator: It is also an admission controller that validates the new service profiles before they are saved.
  • Tap: It receives requests from the CLI or dashboard to watch requests and responses in real-time to provide observability in the applications.
  • Web: It provides a web dashboard.
  • Grafana: Linkerd provides out of the box dashboards through Grafana.
  • Prometheus: It collects and stores all Linkerd metrics by scraping each proxy’s /metrics endpoint on port 4191. The metrics are scraped every 10 seconds.
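The Proxy Injector's job from the list above can be sketched as a pure function over a pod manifest. This is a simplified illustration, not Linkerd's code: a real admission controller answers an AdmissionReview request with a patch, and the image names here are hypothetical placeholders. Only the annotation comes from the text.

```python
import copy

INJECT_ANNOTATION = "linkerd.io/inject"  # annotation named in the text


def inject_proxy(pod):
    """Return the pod with an init container and a proxy sidecar added.

    Pods without the `linkerd.io/inject: enabled` annotation are returned
    unchanged. The input dictionary is never modified in place.
    """
    annotations = pod.get("metadata", {}).get("annotations", {})
    if annotations.get(INJECT_ANNOTATION) != "enabled":
        return pod

    mutated = copy.deepcopy(pod)
    spec = mutated.setdefault("spec", {})
    # Image names below are placeholders, not real Linkerd images.
    spec.setdefault("initContainers", []).append(
        {"name": "linkerd-init", "image": "example.invalid/proxy-init:demo"}
    )
    spec.setdefault("containers", []).append(
        {"name": "linkerd-proxy", "image": "example.invalid/proxy:demo"}
    )
    return mutated


pod = {
    "metadata": {"name": "web", "annotations": {"linkerd.io/inject": "enabled"}},
    "spec": {"containers": [{"name": "app", "image": "example.invalid/app:demo"}]},
}
injected = inject_proxy(pod)
print([c["name"] for c in injected["spec"]["containers"]])
```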

Data Plane

The Linkerd data plane consists of lightweight proxies, which are deployed as sidecar containers alongside each instance of the service container. The proxy is injected when a pod carrying the specific annotation is created (see Proxy Injector above).

Proxy

The proxy has been very lightweight and performant since 2.x, when it was completely rewritten in Rust. These proxies intercept communication to and from each Pod to provide instrumentation and encryption (TLS) without any change to application code.

Proxy features:

  • Transparent, zero-config proxying for HTTP, HTTP/2, and arbitrary TCP protocols.
  • Automatic Prometheus metrics export for HTTP and TCP traffic.
  • Transparent, zero-config WebSocket proxying.
  • Automatic, latency-aware, layer-7 load balancing.
  • Automatic layer-4 load balancing for non-HTTP traffic.
  • Automatic TLS.
  • An on-demand diagnostic tap API.

The proxy supports service discovery via DNS and destination gRPC API.
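Latency-aware load balancing, listed among the proxy features above, can be illustrated with an exponentially weighted moving average (EWMA) of observed latencies. The Python below is a simplified sketch and not the Linkerd proxy's actual algorithm. The smoothing factor, the endpoint names, and the choice to compare two randomly sampled endpoints are assumptions made for the example.

```python
import random


class EwmaBalancer:
    """Toy latency-aware balancer: prefer endpoints with lower EWMA latency."""

    def __init__(self, endpoints, alpha=0.3, rng=None):
        if len(endpoints) < 2:
            raise ValueError("this sketch needs at least two endpoints")
        self.alpha = alpha                # weight given to the newest sample
        self.rng = rng or random.Random()
        self.latency = {e: 0.0 for e in endpoints}  # EWMA per endpoint, in ms

    def observe(self, endpoint, latency_ms):
        """Fold a new latency sample into the endpoint's moving average."""
        old = self.latency[endpoint]
        self.latency[endpoint] = self.alpha * latency_ms + (1 - self.alpha) * old

    def pick(self):
        """Sample two distinct endpoints and return the one with lower EWMA."""
        a, b = self.rng.sample(list(self.latency), 2)
        return a if self.latency[a] <= self.latency[b] else b


lb = EwmaBalancer(["pod-a", "pod-b"], alpha=0.5, rng=random.Random(0))
lb.observe("pod-a", 100.0)  # pod-a has been slow
lb.observe("pod-b", 10.0)   # pod-b has been fast
print(lb.pick())
```

Because recent samples carry more weight, an endpoint that slows down loses traffic quickly and wins it back once it recovers.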

Linkerd Init

To make this truly transparent, Linkerd uses linkerd-init containers, which run before every other container in any Kubernetes pod where the Linkerd sidecar is configured. This init container runs iptables to configure the flow of traffic.

There are two main rules that iptables uses:

  1. Any traffic that is sent to the Pod’s external IP address is forwarded to a specific port on the proxy (4143). By setting SO_ORIGINAL_DST on the socket, the proxy is able to forward the traffic to the original destination port that the application is listening on.
  2. Any traffic that originates from the Pod and is sent to an external IP address is forwarded to a specific port on the proxy (4140). Because SO_ORIGINAL_DST was set on the socket, the proxy is able to forward the traffic to the original recipient. This does not cause a traffic loop because the iptables rules explicitly skip the proxy’s UID.
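The two rules above can be modelled as a small decision function. This is only a Python model of the redirect logic described in the text, not real iptables configuration. The ports come from the text, while the proxy UID value and the function name are assumptions for illustration.

```python
import ipaddress

INBOUND_PROXY_PORT = 4143   # from the text: inbound traffic goes here
OUTBOUND_PROXY_PORT = 4140  # from the text: outbound traffic goes here
PROXY_UID = 2102            # assumed UID of the proxy process, for illustration


def redirect_port(direction, dst_ip, sender_uid=None):
    """Return the proxy port a packet is redirected to, or None if untouched."""
    if direction == "inbound":
        # Rule 1: traffic arriving at the pod is handed to the proxy.
        return INBOUND_PROXY_PORT
    if direction != "outbound":
        raise ValueError(f"unknown direction: {direction!r}")
    if sender_uid == PROXY_UID:
        # The proxy's own traffic skips the rules, which avoids a loop.
        return None
    if ipaddress.ip_address(dst_ip).is_loopback:
        # Only traffic to an external address is redirected.
        return None
    # Rule 2: application traffic leaving the pod is handed to the proxy.
    return OUTBOUND_PROXY_PORT


print(redirect_port("inbound", "10.0.0.1"))
print(redirect_port("outbound", "10.0.0.2", sender_uid=1000))
print(redirect_port("outbound", "10.0.0.2", sender_uid=PROXY_UID))
```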

The other two pieces of the puzzle are the CLI and the dashboard, which are used to interact with, manage, and observe the services.

Comparison: Linkerd vs Istio

Please keep in mind that both projects add new features often, so this comparison is subject to change.

Features | Istio | Linkerd
Ease of Installation | Istio has improved in this area recently and has become easier to try | Relatively easier to adopt due to its out-of-the-box configuration
Platform | Kubernetes, VMs | Kubernetes
Supported Protocols | gRPC, HTTP/2, HTTP/1.x, WebSockets, and all TCP traffic | gRPC, HTTP/2, HTTP/1.x, WebSockets, and all TCP traffic
Ingress Controller | Envoy, Istio gateway itself | Any (Linkerd does not provide ingress capability by itself)
Multi-Cluster Mesh and Expansion Support | Multi-cluster deployment is supported in the stable release with various configuration options; extending the mesh outside Kubernetes clusters is possible | Multi-cluster deployment is stable
Service Mesh Interface (SMI) Compatibility | Through third-party CRD | Native for traffic splitting and metrics, not for traffic access control
Monitoring Features | Feature-rich | Feature-rich
Tracing Support | Jaeger, Zipkin | All backends supporting OpenCensus
Routing Features | Various load balancing algorithms (Round-Robin, Random, Least Connection); supports percentage-based traffic splits; supports header- and path-based traffic splits | Supports the EWMA (exponentially weighted moving average) load balancing algorithm; supports percentage-based traffic splits through SMI
Resilience | Circuit breaking, retries and timeouts, fault injection, delay injection | Retries and timeouts, fault injection; delay injection is not possible; circuit breaking is not yet supported in v2 (see issue #2846)
Security | mTLS support for all protocols; external CA certificate/key is possible; supports authorization rules | mTLS supported for most TCP traffic (also see caveats); external CA/key is possible, but there is no support for authorization rules yet (issue #3342)
Performance | With recent releases, Istio's resource footprint is getting better and latency has improved | Linkerd is designed to be very light; according to some third-party benchmarks, it is lean and slightly faster than Istio
Enterprise Support | Available from various vendors such as AspenMesh, solo.io, and Tetrate | Full enterprise-class engineering, support, and training available from Buoyant, who developed the OSS version of Linkerd

Conclusion

Service meshes are becoming an essential building block in cloud-native solutions and in microservice architectures. A service mesh does the heavy lifting of traffic management, resiliency, and observability, and frees developers to focus on business logic. Istio and Linkerd are both mature and are used in production by various enterprises. Planning and analyzing your requirements is essential in picking which service mesh to use. Please spend sufficient time on the analysis phase, because it is complex to move from one mesh to another later in the game.

Comparing two technologies that do so much, and whose features are ever-evolving, is not fully possible in a single article. Also, when choosing a technology as complex and as critical as a service mesh, the context in which it will be used matters far more than the technology alone. Without that context, it is hard to say A is better than B, because the answer really is “it depends”. I loved the simplicity of Linkerd, both in getting started and in managing the service mesh later on. Also, Linkerd has been hardened over the years by users from enterprise companies. Some features in one mesh may seem attractive, but you should check whether the other has that feature planned for the near future, and make an informed decision based not just on a theoretical evaluation but on trying both out in a proof-of-concept sandbox. This proof of concept should focus on ease of use, feature match, and, more importantly, the operational aspects of the technology. It is relatively easy to introduce a technology; the hard, long effort is spent running and managing it through its lifecycle.

Also, if you get time, please read William Morgan’s Service Mesh Manifesto.

I hope this Linkerd vs Istio comparison helps you make an informed decision. Do let us know your thoughts. You can start a conversation with me on Twitter.

References