TASK
Implementation
A trace collector receives spans from many services, groups them by trace ID, and stores the assembled traces. It also applies sampling to reduce storage volume and gracefully handles spans that arrive after their trace has already been completed.
Implement a node that manages span collection and trace assembly:
// Spans from two services assembled into one trace
{ "type": "span", "trace_id": "t1", "span_id": "s1", "service": "A" }
{ "type": "span", "trace_id": "t1", "span_id": "s2", "service": "B",
"parent_span_id": "s1" }
-> { "type": "trace_complete", "trace_id": "t1", "span_count": 2 }
// 1% sampling: reject most traces
{ "type": "span", "trace_id": "t2", "service": "fast" }
(sampling_rate: 0.01)
-> { "type": "span_accepted",
"sampled": false, "reason": "Trace not sampled (1% rate)" }
// Query traces by service
{ "type": "query_traces", "msg_id": 1,
"service": "service-a", "time_range": "1h" }
-> { "type": "query_results", "in_reply_to": 1,
"traces": 125, "avg_duration_ms": 45 }
Late spans for a completed trace return "action": "update_trace" rather than being dropped.
Sample Test Cases
Aggregate spans into traces
Timeout: 5000ms
Input
{"src":"service_a","dest":"collector","body":{"type":"span","trace_id":"t1","span_id":"s1","service":"A"}}
{"src":"service_b","dest":"collector","body":{"type":"span","trace_id":"t1","span_id":"s2","service":"B","parent_span_id":"s1"}}
Expected Output
{"type": "trace_complete", "trace_id": "t1", "span_count": 2}
Trace sampling
Timeout: 5000ms
Input
{
  "src": "service",
  "dest": "collector",
  "body": {
    "type": "span",
    "trace_id": "t2",
    "span_id": "s3",
    "service": "fast"
  },
  "sampling_rate": 0.01
}
Expected Output
{"type": "span_accepted", "sampled": false, "reason": "Trace not sampled (1% rate)"}
Hints
Hint 1
Group spans by trace_id; emit trace_complete when a trace has received all its spans
Hint 2
Sampling: hash(trace_id) % 100 < (sampling_rate * 100) to decide consistently per trace
Hint 3
query_traces filters by service and returns the number of matching traces and their average duration
Hint 4
Late spans for an already-completed trace should update it rather than be dropped
Hint 5
span_count increments with each new span for the same trace_id
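The sampling rule in Hint 2 could be sketched as follows. This is one possible approach, not the reference solution: `is_sampled` and `rejection` are hypothetical names, and SHA-256 is swapped in for `hash()` because Python salts the built-in `hash()` of strings per process, so it would not give the same decision across restarts.

```python
import hashlib


def is_sampled(trace_id: str, sampling_rate: float) -> bool:
    """Decide per trace, not per span, so every span of a trace agrees."""
    # Stable digest (assumption: any fixed hash works; SHA-256 chosen here).
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < sampling_rate * 100


def rejection(sampling_rate: float) -> dict:
    """Build the reply for a span whose trace was not sampled."""
    return {
        "type": "span_accepted",
        "sampled": False,
        "reason": f"Trace not sampled ({sampling_rate * 100:g}% rate)",
    }
```

At a rate of 0.01 only bucket 0 is kept, so roughly one trace ID in a hundred is sampled. Which specific IDs land in that bucket depends on the hash function chosen.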
OVERVIEW
Key Concepts
trace collector, span aggregation, trace sampling, late spans, trace queries
main.py
python
#!/usr/bin/env python3
import sys
import json


def main():
    # Your implementation here
    for line in sys.stdin:
        msg = json.loads(line)
        print(json.dumps(msg), flush=True)


if __name__ == "__main__":
    main()