ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/tracer

The Tracer

Advanced
Operations | 10 tasks

When something breaks at 3 AM in a system with 100 services, how do you find the cause in under 5 minutes? Build distributed tracing, metrics collection, time-series storage, and alerting systems.

Subtracks & Tasks

Concepts Covered

distributed tracing, trace context, W3C traceparent, span, trace tree, span lifecycle, span kind, span events, span links, duration, trace collector, span aggregation, trace sampling, late spans, trace queries, bottleneck detection, critical path, error rate, service map, anomaly detection, auto-instrumentation, manual instrumentation, log-trace correlation, service mesh tracing, counter, gauge, histogram, labels, percentile, alert rules, threshold evaluation, alert routing, alert grouping, auto-resolution, aggregation, rollup, sum, average, time buckets, dashboard, panels, template variables, auto-refresh, time range, PagerDuty, Slack, on-call rotation, escalation policy, incident lifecycle

Prerequisites

Complete the previous tracks before starting this one. Concepts build progressively throughout the curriculum.