Service Proxies: The Layer Nobody Talks About Until Something Breaks
A proxy sits between services and manages the traffic between them. Get it right and you have observability, resilience, and traffic control for free. Get it wrong and you have an extra hop that breaks in ways you did not anticipate.
In the early days of microservices, teams built their own retry logic, timeout handling, and circuit breakers into every service. Every team reimplemented the same patterns in different languages, each with different bugs.
The service proxy is the answer to that problem: extract the common network concerns into a sidecar process that runs alongside every service, and implement those concerns once, correctly.
What a Proxy Does
At the most basic level, a proxy receives a request, forwards it to a destination, receives the response, and forwards it back. Nothing about that is interesting.
What is interesting is everything a proxy can do around that forwarding:
Retries: if the backend returns a 503, retry once. If it returns a 429, back off exponentially and retry. Implementing this correctly in application code requires careful handling of idempotency: you cannot blindly retry a non-idempotent operation. Envoy's retry policy lets you configure which HTTP response codes trigger retries and a retry budget (a cap on concurrent retries as a percentage of active requests).
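To make the idempotency caveat concrete, here is a minimal sketch in Go (not Envoy's actual algorithm): it retries only methods that RFC 9110 defines as idempotent, and uses a per-request attempt cap where a real retry budget would track retries across all active requests. Status codes and backoff values are illustrative.

```go
package proxy

import (
	"net/http"
	"time"
)

// retryableStatus mirrors a simple retry policy: retry on 503 and 429.
func retryableStatus(code int) bool {
	return code == http.StatusServiceUnavailable || code == http.StatusTooManyRequests
}

// isIdempotent is deliberately conservative: only methods that RFC 9110
// defines as idempotent are safe to retry without coordination.
func isIdempotent(method string) bool {
	switch method {
	case http.MethodGet, http.MethodHead, http.MethodPut,
		http.MethodDelete, http.MethodOptions:
		return true
	}
	return false
}

// doWithRetry forwards a body-less request with at most maxRetries
// retries and exponential backoff. (Retrying a request that has a body
// would also require req.GetBody to rewind it.)
func doWithRetry(client *http.Client, req *http.Request, maxRetries int) (*http.Response, error) {
	for attempt := 0; ; attempt++ {
		resp, err := client.Do(req)
		if err == nil && !retryableStatus(resp.StatusCode) {
			return resp, nil
		}
		if attempt >= maxRetries || !isIdempotent(req.Method) {
			return resp, err
		}
		if resp != nil {
			resp.Body.Close() // release the connection before retrying
		}
		time.Sleep(time.Duration(100<<attempt) * time.Millisecond) // 100ms, 200ms, 400ms, ...
	}
}
```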
Timeouts: every outbound call should have a deadline. Without a deadline, a slow backend accumulates blocked goroutines or threads in the caller, eventually exhausting the connection pool and cascading the failure upstream. Proxies can enforce timeouts globally without every team having to remember to set them.
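In Go, for instance, the whole exchange can be put under one hard deadline with a single client setting; the backend URL below is hypothetical:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Timeout is a deadline for the entire exchange: connecting,
	// sending the request, and reading the response body. A proxy
	// applies the equivalent per route, so callers cannot block
	// forever on a slow backend.
	client := &http.Client{Timeout: 2 * time.Second}

	resp, err := client.Get("http://backend.internal/slow") // hypothetical backend
	if err != nil {
		fmt.Println("request failed:", err) // a timeout surfaces here
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```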
Circuit breaking: if a backend's error rate exceeds a threshold, stop sending it traffic for a period. This prevents a degraded backend from receiving full traffic and potentially making things worse. Envoy implements circuit breaking per cluster with configurable thresholds on connection count, request count, and pending request count.
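The core state machine is small enough to sketch. The Go breaker below (a simplification, not Envoy's implementation) trips after a run of consecutive failures, rejects calls for a cooldown period, then lets traffic probe again; the thresholds are arbitrary examples:

```go
package proxy

import (
	"errors"
	"sync"
	"time"
)

// ErrOpen is returned while the breaker is rejecting traffic.
var ErrOpen = errors.New("circuit open")

// Breaker trips after maxFailures consecutive failures and fails fast
// for cooldown before letting requests through again.
type Breaker struct {
	mu          sync.Mutex
	failures    int
	maxFailures int
	openedAt    time.Time
	cooldown    time.Duration
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.failures >= b.maxFailures && time.Since(b.openedAt) < b.cooldown {
		b.mu.Unlock()
		return ErrOpen // fail fast: the degraded backend gets no traffic
	}
	b.mu.Unlock()

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.failures >= b.maxFailures {
			b.openedAt = time.Now() // (re)open the breaker
		}
	} else {
		b.failures = 0 // any success closes the breaker
	}
	return err
}
```

A production breaker layers more onto this: separate thresholds for connections, pending requests, and retries, and limited concurrency while probing a recovering backend.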
Observability: because every request passes through the proxy, the proxy is a natural place to emit metrics and traces. Envoy emits detailed per-route statistics to StatsD, Prometheus, or other sinks. In a cluster running Envoy sidecars on every pod, you get service-to-service latency, error rate, and throughput metrics for free, without any application code changes.
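The mechanism is mundane: every request already flows through one process, so that process can time it. A Go middleware sketch, with logging standing in for a real metrics sink:

```go
package proxy

import (
	"log"
	"net/http"
	"time"
)

// statusRecorder captures the status code written by the next handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// instrument wraps a proxy handler and records per-request latency and
// status: the raw material for the metrics a sidecar exports.
func instrument(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		start := time.Now()
		next.ServeHTTP(rec, req)
		// A real proxy would update counters and histograms in a
		// metrics sink (Prometheus, StatsD); logging keeps this short.
		log.Printf("route=%s status=%d latency=%s", req.URL.Path, rec.status, time.Since(start))
	})
}
```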
mTLS: mutual TLS authentication between services. Without it, any process on your network that knows service B's address can call it. With mTLS, service B only accepts connections from clients presenting a certificate signed by a trusted CA. Istio and Linkerd use their proxies to enforce mTLS transparently, rotating certificates automatically.
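Underneath, this is standard TLS machinery. A sketch of the server side in Go, with placeholder certificate paths; the one setting that turns ordinary TLS into mutual TLS is ClientAuth:

```go
package proxy

import (
	"crypto/tls"
	"crypto/x509"
	"os"
)

// mtlsConfig builds a server-side TLS config that only accepts clients
// presenting a certificate signed by the trusted CA, which is what a
// mesh sidecar enforces on the service's behalf.
func mtlsConfig(certFile, keyFile, caFile string) (*tls.Config, error) {
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, err
	}
	caPEM, err := os.ReadFile(caFile)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)
	return &tls.Config{
		Certificates: []tls.Certificate{cert},        // this service's identity
		ClientCAs:    pool,                           // CA trusted to sign peers
		ClientAuth:   tls.RequireAndVerifyClientCert, // reject anonymous callers
	}, nil
}
```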
The Sidecar Pattern
The dominant deployment pattern for service proxies is the sidecar: every application pod runs a proxy container alongside the application container. Traffic to and from the application is redirected through the proxy via iptables rules. The application knows nothing about the proxy.
This architecture is the foundation of service meshes. Istio's data plane is Envoy running as a sidecar on every pod. Linkerd uses its own purpose-built proxy. Both provide the same basic capabilities: mTLS, retries, timeouts, circuit breaking, observability.
The control plane (Istio Pilot, Linkerd's control plane) pushes configuration to the sidecar proxies. When you apply a VirtualService resource in Istio, the control plane translates it into Envoy's route configuration API (xDS) and pushes it to every relevant proxy. The application does not restart. The proxy reconfigures itself dynamically.
This dynamic reconfiguration is one of Envoy's most important features. The xDS (discovery service) protocol is a well-defined API for pushing routes, clusters, listeners, and endpoints to Envoy at runtime. You can implement a custom control plane that drives Envoy by implementing the xDS API. This is exactly what Istio does.
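The essential trick is visible at toy scale: keep the routing state behind an atomic pointer and let the control plane swap in a complete replacement while requests are in flight. The Router type below is invented for illustration; xDS does the same thing with far richer resource types:

```go
package proxy

import (
	"net/http"
	"strings"
	"sync/atomic"
)

// routeTable maps path prefixes to backend addresses.
type routeTable map[string]string

// Router keeps the live table behind an atomic pointer, so the data
// path never blocks on a lock and never restarts on a config change.
type Router struct {
	table atomic.Value
}

func NewRouter(initial routeTable) *Router {
	r := &Router{}
	r.table.Store(initial)
	return r
}

// Update is what a control plane calls with a complete new table;
// in-flight requests keep the table they already loaded.
func (r *Router) Update(t routeTable) { r.table.Store(t) }

// Backend resolves a request against the current table.
func (r *Router) Backend(req *http.Request) (string, bool) {
	t := r.table.Load().(routeTable)
	for prefix, addr := range t {
		if strings.HasPrefix(req.URL.Path, prefix) {
			return addr, true
		}
	}
	return "", false
}
```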
Gateway vs Sidecar
There are two primary deployment modes for proxies: as a gateway (edge proxy) and as a sidecar.
A gateway proxy sits at the edge of your cluster and handles traffic from external clients. It terminates TLS, routes requests to the appropriate backend service, handles authentication, and enforces rate limits. Nginx, Kong, Traefik, and AWS API Gateway are all edge proxies.
A sidecar proxy handles service-to-service traffic inside the cluster. The edge proxy sees all external traffic; the sidecar sees all internal traffic. Together they give you complete visibility and control over all traffic in your system.
Most mature systems have both. External (north-south) traffic enters through the gateway. Inside the cluster, service meshes with sidecars handle the east-west traffic.
Writing Your Own Proxy
The Proxies track asks you to implement a reverse proxy with configurable routing. You handle incoming requests, parse the routing configuration, select the appropriate backend, forward the request, and return the response.
The implementation details matter. How do you handle a backend that returns a streaming response? How do you propagate headers (especially tracing headers like X-Trace-ID) without losing them? How do you handle WebSocket upgrades?
These questions come up when you try to build anything real with a proxy, and they are the questions that Nginx's and Envoy's architects had to answer. Implementing a proxy from scratch forces you to confront them concretely rather than abstractly.
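As a starting point, Go's standard library answers several of these questions for you; in the sketch below (routes and ports are made up), httputil.ReverseProxy streams response bodies rather than buffering them whole, passes request headers through apart from hop-by-hop ones, and has handled WebSocket upgrades since Go 1.12:

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func main() {
	// Hypothetical routing config: path prefix -> backend. Parsed into
	// ReverseProxy instances once at startup.
	routes := map[string]*httputil.ReverseProxy{}
	for prefix, backend := range map[string]string{
		"/api/":    "http://localhost:9001",
		"/static/": "http://localhost:9002",
	} {
		target, err := url.Parse(backend)
		if err != nil {
			log.Fatal(err)
		}
		routes[prefix] = httputil.NewSingleHostReverseProxy(target)
	}

	http.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
		// Map iteration order is not deterministic; fine here because
		// the prefixes do not overlap. A real router would do
		// longest-prefix matching.
		for prefix, proxy := range routes {
			if strings.HasPrefix(req.URL.Path, prefix) {
				proxy.ServeHTTP(w, req)
				return
			}
		}
		http.NotFound(w, req)
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```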
Ready to build it? The Proxies track builds a working reverse proxy with routing rules and health checking. You will handle request forwarding, header propagation, timeout enforcement, and basic retry logic. The same architecture underlies Nginx, Envoy, and every service mesh sidecar.
Build it yourself
Reading about distributed systems is useful. Building them is how you actually learn.
Start the Proxies track