ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/identifier/tasks/task-2-2-random-salt
TASK

Implementation

Your basic ID generator might produce duplicates if called multiple times within the same millisecond. Add a random component or sequence counter to ensure uniqueness even under high load.

Options for enhanced uniqueness:

  1. Add a sequence counter that resets each millisecond
  2. Include random bytes in each ID
  3. Use a structure similar to Twitter Snowflake IDs

Your IDs must be unique across all nodes and all time, even if generate is called millions of times per second.

Sample Test Cases

Generate single IDTimeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"generate","msg_id":2}}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
{"src":"n1","dest":"c1","body":{"type":"generate_ok","in_reply_to":2,"msg_id":1,"id":"n1-0"}}

Hints

Hint 1
Add a random component to each generated ID
Hint 2
Use a sequence counter for IDs generated in the same millisecond
Hint 3
Consider combining timestamp, node_id, and random value
OVERVIEW

Theoretical Hub

The Same-Millisecond Problem

High-throughput systems can generate thousands of IDs per millisecond. Timestamp alone is not granular enough. You need an additional component.

Option 1: Sequence Counter

class IDGenerator:
    last_timestamp = 0
    sequence = 0
    
    def generate(self):
        timestamp = current_time_ms()
        
        if timestamp == self.last_timestamp:
            self.sequence += 1
        else:
            self.sequence = 0
            self.last_timestamp = timestamp
        
        return f"{node_id}-{timestamp}-{sequence}"

Option 2: Random Salt

def generate():
    timestamp = current_time_ms()
    salt = random_bytes(4)  # 4 billion combinations
    return f"{node_id}-{timestamp}-{salt}"

Snowflake IDs

Twitter invented Snowflake IDs for exactly this problem. A 64-bit Snowflake ID contains:

  • 41 bits - timestamp (69 years)
  • 10 bits - machine ID (1024 machines)
  • 12 bits - sequence number (4096 IDs per millisecond)

This allows 4096 unique IDs per millisecond per machine.

Key Concepts

randomnesscollision preventionUUID
main.py
python
Add Random Salt to Prevent Collisions - The Identifier | Build Distributed Systems