Network Time Protocol (NTP)

Derive the NTP offset and round-trip delay formulas from four timestamps and identify the symmetric-delay assumption they rely on.
Explain why synchronization error accumulates across NTP strata and what limits how accurately a stratum-3 server can be synchronized.
Describe what a leap second is and why it can cause problems for software that assumes time always increases monotonically.
Explain why NTP clients query multiple servers and use statistical filtering rather than trusting a single time source.

To coordinate actions, computers must agree on what time it is. The Network Time Protocol (NTP) lets them synchronize their clocks over a network, typically to within milliseconds or tens of milliseconds depending on the network between them. NTP dates from 1985, and although the protocol has been revised several times, its basic timestamp exchange has survived largely unchanged because it works so well.

The challenge of clock synchronization is that network communication takes time. If a server sends you "the current time is 12:00:00", by the time you receive that message, it's no longer 12:00:00, and the amount of delay depends on the state of the network. NTP solves this with an algorithm that measures both the clock offset (how far off your clock is) and the network delay. It uses four timestamps:

label	name	purpose
t1	the client send time	when the client sends the request
t2	the server receive time	when the server receives the request
t3	the server transmit time	when the server sends the response
t4	the client receive time	when the client receives the response

From these four timestamps, NTP calculates:

offset = ((t2 - t1) + (t3 - t4)) / 2
delay = (t4 - t1) - (t3 - t2)

The offset tells you how to adjust your clock, while the delay tells you how reliable this measurement is (lower delays are more accurate).
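As a quick check of the two formulas, the sketch below plugs in one exchange by hand. The numbers are illustrative assumptions, not measurements: a client whose clock runs 2.5 s fast, a one-way network delay of 0.1 s in each direction, and 0.001 s of server processing. The function name ntp_offset_and_delay is made up for this example.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """Return (offset, delay) for one NTP exchange.

    t1 and t4 are read from the client's clock, t2 and t3 from the server's.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Assumed scenario: the exchange starts at true time 5.000 and the client is 2.5 s fast.
t1 = 7.500  # client clock when the request leaves (true time 5.000)
t2 = 5.100  # server clock when the request arrives
t3 = 5.101  # server clock when the response leaves
t4 = 7.701  # client clock when the response arrives (true time 5.201)

offset, delay = ntp_offset_and_delay(t1, t2, t3, t4)
print(f"offset = {offset:+.3f} s, delay = {delay:.3f} s")
```

The offset comes out at about -2.5 s, meaning the client's clock is ahead and must be moved back, and the delay at about 0.2 s, which is the two one-way trips with the server's processing time removed.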

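These formulas rest on an assumption that is worth stating explicitly: the network delay is symmetric, i.e., the request takes as long to reach the server as the response takes to return. To see why, let theta be the true offset (server time minus client time) and let d_out and d_in be the one-way delays. Then t2 - t1 = d_out + theta and t4 - t3 = d_in - theta, because each difference compares a reading from one clock with a reading from the other. Adding the two gives delay = d_out + d_in regardless of theta, and subtracting them gives offset = theta + (d_out - d_in) / 2. When d_out = d_in the second term vanishes and the offset formula returns the true clock difference. The server's processing time (t3 - t2) does not need to be negligible: it is subtracted out of the delay and never enters the offset.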
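To see what the symmetry assumption buys, here is a minimal sketch that builds the four timestamps from a known true offset and chosen one-way delays, then compares the estimate with the truth. The helper simulate_exchange and all of its parameters are hypothetical; theta is taken to be server time minus client time.

```python
def simulate_exchange(
    theta: float, d_out: float, d_in: float, processing: float, start: float = 100.0
) -> tuple[float, float]:
    """Build t1..t4 for one exchange and return (estimated offset, delay).

    theta is the true offset (server clock minus client clock); d_out and
    d_in are the one-way delays to and from the server.
    """
    t1 = start                # client clock at send
    t2 = t1 + d_out + theta   # server clock at receive
    t3 = t2 + processing      # server clock at transmit
    t4 = t3 - theta + d_in    # client clock at receive
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


theta = -2.5  # assumed: the client is 2.5 s ahead of the server

# Symmetric path, with deliberately large server processing time.
sym_offset, sym_delay = simulate_exchange(theta, d_out=0.02, d_in=0.02, processing=0.5)

# Asymmetric path: the request takes 20 ms longer than the response.
asym_offset, asym_delay = simulate_exchange(theta, d_out=0.03, d_in=0.01, processing=0.5)

print(f"symmetric:  error = {sym_offset - theta:+.4f} s, delay = {sym_delay:.4f} s")
print(f"asymmetric: error = {asym_offset - theta:+.4f} s, delay = {asym_delay:.4f} s")
```

The symmetric case recovers theta to within floating-point rounding even though the server took half a second to reply. The asymmetric case is off by (d_out - d_in) / 2 = 0.01 s while reporting the same round-trip delay, so the client cannot detect the error from its own measurements.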

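In practice, network paths are often asymmetric: packets may take different routes in each direction, or queue longer one way than the other. The resulting offset error is half the difference between the two one-way delays, so it can never exceed half the round-trip delay. On a LAN with sub-millisecond RTT the error is tiny; on a satellite link with an asymmetric path it can be tens of milliseconds. NTP clients mitigate this by collecting multiple samples and discarding outliers, which reduces the effect of variable queueing delay. A constant asymmetry cannot be removed this way, because in the four timestamps it looks exactly like a clock offset. Hardware timestamping removes the part of the error introduced inside the hosts, but not asymmetry in the path itself.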
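One common way to discard poor samples is to keep the measurement with the smallest round-trip delay, on the reasoning that a packet that crossed the network quickly had little room for queueing in either direction. The sketch below is a simplified illustration of that idea, not the filter from any particular NTP implementation. The jitter model (independent exponential queueing delay in each direction) and all names are assumptions.

```python
import random


def measure(theta: float, base: float, rng: random.Random) -> tuple[float, float]:
    """Simulate one exchange and return (estimated offset, delay)."""
    d_out = base + rng.expovariate(1 / 0.02)  # outbound delay with queueing jitter
    d_in = base + rng.expovariate(1 / 0.02)   # inbound delay with queueing jitter
    t1 = 0.0
    t2 = t1 + d_out + theta
    t3 = t2 + 0.001
    t4 = t3 - theta + d_in
    return ((t2 - t1) + (t3 - t4)) / 2, (t4 - t1) - (t3 - t2)


rng = random.Random(42)  # fixed seed so the run is repeatable
theta = 0.75             # assumed true offset (server minus client)
base = 0.05              # assumed fixed one-way delay, equal in both directions
samples = [measure(theta, base, rng) for _ in range(8)]

# Keep the sample that spent the least time in the network.
best_offset, best_delay = min(samples, key=lambda s: s[1])
worst_error = max(abs(offset - theta) for offset, _ in samples)

print(f"lowest-delay sample: error = {abs(best_offset - theta):.4f} s, delay = {best_delay:.4f} s")
print(f"worst sample:        error = {worst_error:.4f} s")
```

In this model the error of any sample is at most half of its delay beyond the fixed path delay, so the lowest-delay sample has the tightest error bound. The filter does nothing about a constant asymmetry in the base path, which shifts every sample equally.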

NTP organizes time servers into levels called strata. Stratum 0 consists of reference clocks such as atomic clocks and GPS receivers. Stratum 1 servers are directly connected to a stratum 0 device and act as primary time servers. Stratum 2 servers synchronize with stratum 1 servers, and so on. This hierarchy prevents circular dependencies and allows the system to scale. End-user computers typically sync with stratum 2 or 3 servers.
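The stratum numbering can be sketched as a walk up the chain of upstream servers. The example below is illustrative only: the host names and the upstream mapping are invented, and real servers learn their stratum from the packets they receive rather than from a global table. Treating a loop as stratum 16 is a simplification made for this sketch.

```python
from typing import Optional

UNSYNCHRONIZED = 16  # stratum value NTP uses for "not synchronized"


def stratum_of(host: str, upstream: dict[str, Optional[str]]) -> int:
    """Return the stratum of host given a host -> upstream mapping.

    A host whose upstream is None is treated as attached to a reference
    clock (stratum 1). Loops and chains deeper than stratum 15 are
    reported as unsynchronized.
    """
    hops = 1
    seen = {host}
    current = upstream[host]
    while current is not None:
        if current in seen or hops >= UNSYNCHRONIZED - 1:
            return UNSYNCHRONIZED
        seen.add(current)
        hops += 1
        current = upstream[current]
    return hops


upstream = {
    "primary": None,        # attached to a reference clock
    "secondary": "primary",
    "desktop": "secondary",
    "loop_a": "loop_b",     # two hosts that only follow each other
    "loop_b": "loop_a",
}
for host in upstream:
    print(host, stratum_of(host, upstream))
```

Here the desktop ends up at stratum 3, while the two hosts that only follow each other never reach a reference clock and are reported as unsynchronized.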

Implementation

Our simulation starts with the NTP message structure:

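from dataclasses import dataclass
from typing import Optional


@dataclass
class NTPMessage:
    """A simplified NTP message packet."""

    # Timestamps for one exchange; None means "not recorded yet",
    # so a timestamp of exactly 0.0 is still treated as valid
    t1: Optional[float] = None  # Client send time (client clock)
    t2: Optional[float] = None  # Server receive time (server clock)
    t3: Optional[float] = None  # Server transmit time (server clock)
    t4: Optional[float] = None  # Client receive time (client clock)

    # Stratum level (distance from reference clock)
    stratum: int = 0

    def _is_complete(self) -> bool:
        """True once all four timestamps have been recorded."""
        return None not in (self.t1, self.t2, self.t3, self.t4)

    def calculate_offset(self) -> float:
        """Calculate clock offset (server time minus client time).

        offset = ((t2 - t1) + (t3 - t4)) / 2
        """
        if self._is_complete():
            return ((self.t2 - self.t1) + (self.t3 - self.t4)) / 2.0
        return 0.0

    def calculate_delay(self) -> float:
        """Calculate round-trip delay, excluding server processing time.

        delay = (t4 - t1) - (t3 - t2)
        """
        if self._is_complete():
            return (self.t4 - self.t1) - (self.t3 - self.t2)
        return 0.0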

The message holds the four timestamps and includes methods to calculate offset and delay using the NTP formulas.

The NTP server receives requests, records timestamps t2 and t3, and sends responses back to clients. In our simulation, the server's clock is accurate: it uses self.now, which is the simulation's true time. The stratum field indicates this server's level in the time hierarchy:

class NTPServer(Process):
    """An NTP time server that responds to client requests."""

    def init(
        self,
        name: str,
        stratum: int,
        request_queue: Queue,
        network_delay: float = 0.1,
    ):
        self.name = name
        self.stratum = stratum
        self.request_queue = request_queue
        self.network_delay = network_delay
        self.requests_served = 0

    async def run(self):
        """Process incoming NTP requests."""
        while True:
            # Wait for a request
            client_queue, message = await self.request_queue.get()

            # Record server receive time (t2)
            message.t2 = self.now

            # Simulate processing time
            await self.timeout(0.001)

            # Record server transmit time (t3)
            message.t3 = self.now
            message.stratum = self.stratum

            print(
                f"[{self.now:.3f}] {self.name} (stratum {self.stratum}): "
                f"Responding to request (t2={message.t2:.3f}, t3={message.t3:.3f})"
            )

            # Send response back to client with network delay
            await self.timeout(self.network_delay)
            await client_queue.put(message)

            self.requests_served += 1

The NTP client is more complex because it must adjust its own clock. The constructor stores the server queue, sync interval, simulated network delay, and an initial clock offset that represents how far off the client starts:

class NTPClient(Process):
    """An NTP client that synchronizes its clock with a server."""

    def init(
        self,
        name: str,
        server_queue: Queue,
        sync_interval: float,
        network_delay: float = 0.1,
        initial_offset: float = 0.0,
    ):
        self.name = name
        self.server_queue = server_queue
        self.sync_interval = sync_interval
        self.network_delay = network_delay

        # Client's local clock offset from true time
        self.clock_offset = initial_offset
        self.response_queue = Queue(self._env)

        # Statistics
        self.syncs_performed = 0
        self.offset_history = []

    def get_local_time(self) -> float:
        """Get current time according to client's local clock."""
        return self.now + self.clock_offset

The run method simply waits for each sync interval before calling _sync_with_server:

    async def run(self):
        """Periodically sync with NTP server."""
        while True:
            # Wait for sync interval
            await self.timeout(self.sync_interval)

            # Perform NTP sync
            await self._sync_with_server()

_sync_with_server executes one full NTP exchange. It records the send time t1, waits for the server's response containing t2 and t3, records the receive time t4, and then applies the calculated offset:

    async def _sync_with_server(self):
        """Execute one NTP synchronization cycle."""
        # Create request message with client send time (t1)
        message = NTPMessage(t1=self.get_local_time())

        print(
            f"[{self.now:.3f}] {self.name}: Sending sync request "
            f"(local_time={message.t1:.3f}, offset={self.clock_offset:.3f})"
        )

        # Send request with network delay
        await self.timeout(self.network_delay)
        await self.server_queue.put((self.response_queue, message))

        # Wait for response
        response = await self.response_queue.get()

        # Record client receive time (t4)
        response.t4 = self.get_local_time()

        # Calculate offset and delay
        offset = response.calculate_offset()
        delay = response.calculate_delay()

        print(
            f"[{self.now:.3f}] {self.name}: Received response "
            f"(offset={offset:.3f}, delay={delay:.3f})"
        )

        # Adjust clock: offset is server time minus client time, so add it
        self.clock_offset += offset
        self.syncs_performed += 1
        self.offset_history.append(abs(offset))

        print(
            f"[{self.now:.3f}] {self.name}: Clock adjusted, "
            f"new offset from true time: {self.clock_offset:.3f}"
        )

The client maintains a clock_offset representing how far its local clock differs from true time. When it syncs, it calculates the offset using the NTP algorithm and adjusts its clock accordingly. Notice that get_local_time() returns the client's view of time, which may differ from simulation time until synchronization occurs.
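The sign of the adjustment is easy to get backwards. The offset computed from the four timestamps is the server's time minus the client's, so a client that is fast measures a negative offset and must add it to its clock. The sketch below replays that arithmetic without the simulation framework; measured_offset, run, and the delay values are assumptions for illustration.

```python
def measured_offset(clock_error: float, one_way: float = 0.1, processing: float = 0.001) -> float:
    """Offset a client computes when its clock is clock_error ahead of true time."""
    true_send = 5.0
    t1 = true_send + clock_error     # client clock at send
    t2 = true_send + one_way         # server clock at receive (server is accurate)
    t3 = t2 + processing             # server clock at transmit
    t4 = t3 + one_way + clock_error  # client clock at receive
    return ((t2 - t1) + (t3 - t4)) / 2


def run(sign: int, clock_error: float, rounds: int) -> list[float]:
    """Apply the correction with the given sign; record the error after each sync."""
    history = []
    for _ in range(rounds):
        clock_error += sign * measured_offset(clock_error)
        history.append(clock_error)
    return history


added = run(+1, clock_error=2.5, rounds=4)
subtracted = run(-1, clock_error=2.5, rounds=4)
print("adding the offset:     ", [f"{e:.3f}" for e in added])
print("subtracting the offset:", [f"{e:.3f}" for e in subtracted])
```

Adding the offset removes the error in the first sync. Subtracting it doubles the error on every round, so 2.5 becomes 5, 10, 20, and then 40.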

Running a Simulation

Let's see clock synchronization in action:

def main():
    """Simulate NTP clock synchronization."""
    env = Environment()

    # Create server queue
    server_queue = Queue(env)

    # Create NTP server (stratum 1 - connected to reference clock)
    server = NTPServer(env, "time.example.com", stratum=1, request_queue=server_queue)

    # Create clients with different initial clock offsets
    client1 = NTPClient(
        env,
        "client1.local",
        server_queue,
        sync_interval=5.0,
        initial_offset=2.5,  # 2.5 seconds fast
    )

    client2 = NTPClient(
        env,
        "client2.local",
        server_queue,
        sync_interval=5.0,
        initial_offset=-1.8,  # 1.8 seconds slow
    )

    client3 = NTPClient(
        env,
        "client3.local",
        server_queue,
        sync_interval=7.0,
        initial_offset=0.5,  # 0.5 seconds fast
    )

    # Run simulation
    env.run(until=25)

    # Print statistics
    print("\n=== NTP Synchronization Statistics ===")
    print(f"Server requests served: {server.requests_served}")

    for client in [client1, client2, client3]:
        print(f"\n{client.name}:")
        print(f"  Syncs performed: {client.syncs_performed}")
        print(f"  Final clock offset: {client.clock_offset:.6f}s")
        if client.offset_history:
            print(
                f"  Average correction: {sum(client.offset_history) / len(client.offset_history):.6f}s"
            )

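The clients start with different clock offsets, some fast and some slow. For each exchange the log shows the request, the server's reply with t2 and t3, the offset and delay the client measured, and the client's remaining error after it adjusts its clock. With a correct adjustment that remaining error falls close to zero after a sync or two; if it grows from one sync to the next, the correction is being applied with the wrong sign.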

[5.000] client1.local: Sending sync request (local_time=7.500, offset=2.500)
[5.000] client2.local: Sending sync request (local_time=3.200, offset=-1.800)
[5.101] time.example.com (stratum 1): Responding to request (t2=5.100, t3=5.101)
[5.201] client1.local: Received response (offset=-2.500, delay=0.200)
[5.201] client1.local: Clock adjusted, new offset from true time: 5.000
[5.202] time.example.com (stratum 1): Responding to request (t2=5.201, t3=5.202)
[5.302] client2.local: Received response (offset=1.850, delay=0.301)
[5.302] client2.local: Clock adjusted, new offset from true time: -3.651
[7.000] client3.local: Sending sync request (local_time=7.500, offset=0.500)
[7.101] time.example.com (stratum 1): Responding to request (t2=7.100, t3=7.101)
[7.201] client3.local: Received response (offset=-0.500, delay=0.200)
[7.201] client3.local: Clock adjusted, new offset from true time: 1.000
[10.201] client1.local: Sending sync request (local_time=15.201, offset=5.000)
[10.302] client2.local: Sending sync request (local_time=6.651, offset=-3.651)
[10.302] time.example.com (stratum 1): Responding to request (t2=10.301, t3=10.302)
[10.402] client1.local: Received response (offset=-5.000, delay=0.200)
[10.402] client1.local: Clock adjusted, new offset from true time: 10.000
[10.403] time.example.com (stratum 1): Responding to request (t2=10.402, t3=10.403)
[10.503] client2.local: Received response (offset=3.651, delay=0.200)
[10.503] client2.local: Clock adjusted, new offset from true time: -7.301
[14.201] client3.local: Sending sync request (local_time=15.201, offset=1.000)
[14.302] time.example.com (stratum 1): Responding to request (t2=14.301, t3=14.302)
[14.402] client3.local: Received response (offset=-1.000, delay=0.200)
[14.402] client3.local: Clock adjusted, new offset from true time: 2.000
[15.402] client1.local: Sending sync request (local_time=25.402, offset=10.000)
[15.503] client2.local: Sending sync request (local_time=8.202, offset=-7.301)
[15.503] time.example.com (stratum 1): Responding to request (t2=15.502, t3=15.503)
[15.603] client1.local: Received response (offset=-10.000, delay=0.200)
[15.603] client1.local: Clock adjusted, new offset from true time: 20.000
[15.604] time.example.com (stratum 1): Responding to request (t2=15.603, t3=15.604)
[15.704] client2.local: Received response (offset=7.301, delay=0.200)
[15.704] client2.local: Clock adjusted, new offset from true time: -14.602
[20.603] client1.local: Sending sync request (local_time=40.603, offset=20.000)
[20.704] client2.local: Sending sync request (local_time=6.102, offset=-14.602)
[20.704] time.example.com (stratum 1): Responding to request (t2=20.703, t3=20.704)
[20.804] client1.local: Received response (offset=-20.000, delay=0.200)
[20.804] client1.local: Clock adjusted, new offset from true time: 40.000
[20.805] time.example.com (stratum 1): Responding to request (t2=20.804, t3=20.805)
[20.905] client2.local: Received response (offset=14.602, delay=0.200)
[20.905] client2.local: Clock adjusted, new offset from true time: -29.204
[21.402] client3.local: Sending sync request (local_time=23.402, offset=2.000)
[21.503] time.example.com (stratum 1): Responding to request (t2=21.502, t3=21.503)
[21.603] client3.local: Received response (offset=-2.000, delay=0.200)
[21.603] client3.local: Clock adjusted, new offset from true time: 4.000

=== NTP Synchronization Statistics ===
Server requests served: 11

client1.local:
  Syncs performed: 4
  Final clock offset: 40.000000s
  Average correction: 9.375000s

client2.local:
  Syncs performed: 4
  Final clock offset: -29.204000s
  Average correction: 6.851000s

client3.local:
  Syncs performed: 3
  Final clock offset: 4.000000s
  Average correction: 1.166667s

Stratum Hierarchy

In real NTP deployments, servers form a hierarchy. Let's simulate this with a server:

class StratumServerProcess(Process):
    """Server process for a stratum N+1 NTP server."""

    def init(
        self,
        name: str,
        local_queue: Queue,
        stratum: int,
        clock_state: dict,
        network_delay: float = 0.1,
    ):
        self.name = name
        self.local_queue = local_queue
        self.stratum = stratum
        self.clock_state = clock_state  # Shared with client process
        self.network_delay = network_delay

    def get_local_time(self) -> float:
        """Get current time according to local clock."""
        return self.now + self.clock_state["offset"]

    async def run(self):
        """Serve requests from downstream clients."""
        while True:
            client_queue, message = await self.local_queue.get()

            # Record timestamps
            message.t2 = self.get_local_time()
            await self.timeout(0.001)
            message.t3 = self.get_local_time()
            message.stratum = self.stratum

            # Send response
            await self.timeout(self.network_delay)
            await client_queue.put(message)

We also need a client for the stratum simulation:

class StratumClientProcess(Process):
    """Client process for a stratum N+1 NTP server."""

    def init(
        self,
        name: str,
        upstream_queue: Queue,
        stratum: int,
        clock_state: dict,
        sync_interval: float,
        network_delay: float = 0.1,
    ):
        self.name = name
        self.upstream_queue = upstream_queue
        self.stratum = stratum
        self.clock_state = clock_state  # Shared with server process
        self.sync_interval = sync_interval
        self.network_delay = network_delay
        self.response_queue = Queue(self._env)

    def get_local_time(self) -> float:
        """Get current time according to local clock."""
        return self.now + self.clock_state["offset"]

    async def run(self):
        """Sync with upstream server."""
        while True:
            await self.timeout(self.sync_interval)

            # Send request to upstream
            message = NTPMessage(t1=self.get_local_time())
            await self.timeout(self.network_delay)
            await self.upstream_queue.put((self.response_queue, message))

            # Wait for response
            response = await self.response_queue.get()
            response.t4 = self.get_local_time()

            # Adjust clock (updates shared state); offset is server time
            # minus client time, so it is added to the local clock
            offset = response.calculate_offset()
            self.clock_state["offset"] += offset

            print(
                f"[{self.now:.3f}] {self.name} (stratum {self.stratum}): "
                f"Synced with upstream, offset={offset:.3f}"
            )

A stratum N server needs to both sync with stratum N-1 (as a client) and serve stratum N+1 clients (as a server). We implement this with two separate processes that share clock state via a dictionary. The client process syncs with upstream and updates the shared clock offset. The server process reads from the shared clock offset when responding to downstream requests.

def main():
    """Demonstrate NTP stratum hierarchy."""
    env = Environment()

    # Stratum 1: Primary time server
    s1_queue = Queue(env)
    stratum1 = NTPServer(env, "stratum1.time.gov", stratum=1, request_queue=s1_queue)

    # Stratum 2: Secondary servers syncing with stratum 1
    # Each stratum 2 server has both client and server processes
    s2a_queue = Queue(env)
    s2a_clock = {"offset": 0.0}  # Shared clock state
    StratumClientProcess(
        env,
        "stratum2a.org",
        s1_queue,
        stratum=2,
        clock_state=s2a_clock,
        sync_interval=10.0,
    )
    StratumServerProcess(
        env, "stratum2a.org", s2a_queue, stratum=2, clock_state=s2a_clock
    )

    s2b_queue = Queue(env)
    s2b_clock = {"offset": 0.0}  # Shared clock state
    StratumClientProcess(
        env,
        "stratum2b.org",
        s1_queue,
        stratum=2,
        clock_state=s2b_clock,
        sync_interval=10.0,
    )
    StratumServerProcess(
        env, "stratum2b.org", s2b_queue, stratum=2, clock_state=s2b_clock
    )

    # Stratum 3: End clients
    client_a = NTPClient(
        env, "client_a", s2a_queue, sync_interval=5.0, initial_offset=3.0
    )

    client_b = NTPClient(
        env, "client_b", s2b_queue, sync_interval=5.0, initial_offset=-2.0
    )

    # Run simulation
    env.run(until=35)

    print("\n=== Stratum Hierarchy Results ===")
    print(f"Stratum 1 server requests: {stratum1.requests_served}")
    print(f"\nStratum 2a clock offset: {s2a_clock['offset']:.6f}s")
    print(f"Stratum 2b clock offset: {s2b_clock['offset']:.6f}s")
    print(f"\nClient A final offset: {client_a.clock_offset:.6f}s")
    print(f"Client B final offset: {client_b.clock_offset:.6f}s")

In this simulation, the stratum 1 server stands in for one synced to a reference clock (simulated as perfect time), stratum 2 servers sync with stratum 1, and end clients sync with stratum 2. Synchronization error accumulates as you go down the hierarchy, but the result is still accurate enough for most purposes. A stratum 3 client might be accurate to within a few milliseconds to a few tens of milliseconds, which is perfectly adequate for log timestamps or cache expiration.

Why does error accumulate? Each server sets its clock from an offset measured against its upstream, and that measurement already carries some error, because the upstream is not perfectly accurate and the network delay is not perfectly symmetric. When a downstream server syncs with this slightly-off clock, its own measurement adds another error on top. The errors are not simply additive, since they depend on network jitter and path asymmetry, but in practice each additional stratum adds roughly 1–5 ms of error in a well-run network. Stratum 2 servers are accurate to around 1–10 ms; stratum 3 to around 5–20 ms. NTP caps the hierarchy at stratum 15 and uses stratum 16 to mean "unsynchronized", which keeps chains of servers, and the error they accumulate, bounded.
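A rough way to see the accumulation is to model each hop's error as half the difference between two randomly jittered one-way delays and add the hops up. The sketch below does that for a chain of three hops below a perfect stratum 1 server. The 4 ms jitter scale and the independence of the hops are assumptions chosen for illustration, not measurements of any real network.

```python
import random


def hop_error(rng: random.Random, jitter: float) -> float:
    """Offset error contributed by one sync: half the one-way delay difference."""
    d_out = rng.expovariate(1 / jitter)
    d_in = rng.expovariate(1 / jitter)
    return (d_out - d_in) / 2


def errors_by_stratum(depth: int, trials: int, jitter: float, seed: int) -> list[float]:
    """Return the mean absolute clock error at strata 2 .. depth + 1.

    Stratum 1 is assumed perfect; each lower stratum inherits its
    upstream's error and adds one hop of its own.
    """
    rng = random.Random(seed)
    totals = [0.0] * depth
    for _ in range(trials):
        error = 0.0
        for level in range(depth):
            error += hop_error(rng, jitter)
            totals[level] += abs(error)
    return [total / trials for total in totals]


means = errors_by_stratum(depth=3, trials=20_000, jitter=0.004, seed=1)
for stratum, mean in enumerate(means, start=2):
    print(f"stratum {stratum}: mean |error| = {mean * 1000:.2f} ms")
```

In this model the mean error rises with each stratum. Because the hops are independent and can partly cancel, the growth is slower than a straight sum of the per-hop errors, which is one way to read the remark that the errors are not simply additive.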

Exercises

Run the basic simulation with network delay doubled. How many sync cycles does each client need to converge within 0.01 of true time? Now try halving the delay. What does this tell you about the relationship between network latency and sync accuracy?
The offset formula assumes symmetric delay. Modify the simulation to use different one-way delays (simulating asymmetric routing). Specifically, set outbound delay to 0.1 and inbound delay to 0.5. What offset does the client calculate? What is the true offset? By how much does asymmetry mislead the client? (Starter: add inbound_delay and outbound_delay parameters to NTPClient; note that in the code above the response delay is currently applied by NTPServer through its own network_delay.)
A client that syncs once per interval will drift between syncs because real hardware clocks are not perfectly accurate. Add a drift_rate parameter to the client that adds a small error per time unit (e.g., 0.001 per unit). How does drift affect the client's accuracy between syncs? What sync interval keeps the client within 0.05 of true time given drift 0.001?
In the stratum hierarchy simulation, what happens if the stratum-1 server is restarted with a large initial offset (say, +10)? Trace through how the error propagates to stratum-2 and then to the end clients. After how many sync cycles do the end clients recover?
Real NTP clients query multiple servers and run a selection algorithm to reject outliers before computing the offset. Simulate a misbehaving server whose timestamps are always 5.0 higher than true time. Add a second honest server. Implement a simple two-server client that takes the offset from whichever server has the lower delay. Does this correctly identify and ignore the misbehaving server?