Network Time Protocol (NTP)
- Derive the NTP offset and round-trip delay formulas from four timestamps and identify the symmetric-delay assumption they rely on.
- Explain why synchronization error accumulates across NTP strata and what limits how accurately a stratum-3 server can be synchronized.
- Describe what a leap second is and why it can cause problems for software that assumes time always increases monotonically.
- Explain why NTP clients query multiple servers and use statistical filtering rather than trusting a single time source.
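To coordinate their actions, computers must agree on what time it is. Network Time Protocol (NTP) makes this possible by letting computers synchronize their clocks over a network, typically to within a few tens of milliseconds over the public Internet and to within about a millisecond on a quiet local network. NTP dates from 1985, and although the protocol has been revised several times since, its core timestamp exchange has survived largely unchanged because it works so well.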
The challenge of clock synchronization is that network communication takes time. If a server sends you "the current time is 12:00:00", by the time you receive that message, it's no longer 12:00:00, and the amount of delay depends on the state of the network. NTP solves this with an algorithm that measures both the clock offset (how far off your clock is) and the network delay. It uses four timestamps:
| label | name | purpose |
|---|---|---|
| t1 | the client send time | when the client sends the request |
| t2 | the server receive time | when the server receives the request |
| t3 | the server transmit time | when the server sends the response |
| t4 | the client receive time | when the client receives the response |
From these four timestamps, NTP calculates:
offset = ((t2 - t1) + (t3 - t4)) / 2
delay = (t4 - t1) - (t3 - t2)
The offset tells you how to adjust your clock, while the delay tells you how far to trust the measurement: the error in the offset can be at most half the delay, so samples with lower delay are more reliable.
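To make the formulas concrete, here is a minimal sketch that applies them to a single exchange. The function name `offset_and_delay` and the sample timestamps are illustrative assumptions: the client's clock is taken to run 2.5 s fast, each network leg takes 0.1 s, and the server spends 0.001 s processing the request.

```python
def offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """Apply the NTP formulas to one request/response exchange."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: true time is 5.000 when the client sends.
t1 = 7.500  # client clock at send (true 5.000 + 2.5)
t2 = 5.100  # server clock at receive
t3 = 5.101  # server clock at transmit
t4 = 7.701  # client clock at receive (true 5.201 + 2.5)

offset, delay = offset_and_delay(t1, t2, t3, t4)
print(f"offset={offset:.3f} delay={delay:.3f}")
```

The negative offset says the client must move its clock back by 2.5 s, and the 0.2 s delay is the two network legs with the server's processing time removed.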
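These formulas rest on an assumption that is worth stating explicitly:
the network delay is symmetric, i.e., the time for the request to travel to the server
equals the time for the response to travel back.
To see why, let θ be the amount by which the server's clock is ahead of the client's, and let d be the one-way delay.
The request leaves at t1 on the client's clock and arrives d later on a clock that reads θ more, so t2 - t1 = θ + d.
The response makes the same trip in reverse, so t4 - t3 = d - θ, or equivalently t3 - t4 = θ - d.
Adding the two differences cancels d and leaves 2θ, which is why the offset formula divides by two.
The server's processing time (t3 - t2) does not need to be negligible: it appears in neither difference, and the delay formula subtracts it from the total elapsed time (t4 - t1) to leave just the two network legs, 2d.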
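In practice, network paths are often asymmetric: packets may take different routes in each direction, or queue for longer one way than the other. If the outbound delay exceeds the inbound delay by some amount, the computed offset is wrong by half that amount, which can never be more than half the round-trip delay. On a LAN with sub-millisecond RTT the error is tiny; on a satellite link with an asymmetric path it can be tens of milliseconds. NTP clients mitigate this by collecting multiple samples and discarding outliers, but filtering cannot remove a constant asymmetry, because the four timestamps alone cannot distinguish it from a clock offset. Hardware timestamping removes jitter inside the hosts but does not correct an asymmetric path.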
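The effect of asymmetry, and the value of preferring low-delay samples, can be seen in a small sketch. It assumes a hypothetical helper `exchange` that builds the four timestamps from a known true offset and explicit one-way delays; the delay values are made up for illustration. Picking the sample with the smallest round-trip delay is a simplified stand-in for the filtering real clients perform, not a full implementation of it.

```python
def exchange(true_offset, d_out, d_in, processing=0.001, start=100.0):
    """Return (offset, delay) measured for one exchange.

    true_offset is server clock minus client clock; start is the
    client's clock reading when it sends the request.
    """
    t1 = start
    t2 = t1 + true_offset + d_out      # server clock on arrival
    t3 = t2 + processing               # server clock on reply
    t4 = t3 - true_offset + d_in       # client clock on arrival
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


TRUE_OFFSET = 0.250  # assumed: server is 250 ms ahead of the client

# (outbound, inbound) one-way delays in seconds; extra queueing on one
# leg makes a sample both slower and less symmetric.
paths = [(0.050, 0.050), (0.050, 0.170), (0.110, 0.050), (0.050, 0.090)]
samples = [exchange(TRUE_OFFSET, out, back) for out, back in paths]

for (out, back), (offset, delay) in zip(paths, samples):
    error = offset - TRUE_OFFSET       # equals (out - back) / 2
    print(f"out={out:.3f} in={back:.3f} delay={delay:.3f} error={error:+.3f}")

# Keep the sample with the smallest round-trip delay.
best_offset, best_delay = min(samples, key=lambda s: s[1])
print(f"chosen offset={best_offset:.3f} (delay={best_delay:.3f})")
```

A low delay does not guarantee a symmetric path, but because the error cannot exceed half the delay, the lowest-delay sample has the tightest bound on how wrong it can be.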
NTP organizes time sources into levels called strata. Stratum 0 consists of reference clocks such as atomic clocks and GPS receivers. Stratum 1 servers are directly connected to a stratum 0 device and act as primary time servers. Stratum 2 servers synchronize with stratum 1 servers, and so on, so a server's stratum is its distance from a reference clock. This hierarchy prevents circular dependencies and allows the system to scale. End-user computers typically sync with stratum 2 or 3 servers.
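The numbering rule is simple enough to state in code. This sketch assumes a hypothetical helper `stratum_from_upstream`; the constants follow the convention that stratum 15 is the deepest usable level and stratum 16 marks a clock as unsynchronized.

```python
MAX_STRATUM = 15      # deepest level that still counts as synchronized
UNSYNCHRONIZED = 16   # advertised by a server with no usable time source


def stratum_from_upstream(upstream_stratum: int) -> int:
    """Return the stratum of a server synchronized to the given upstream."""
    if upstream_stratum >= MAX_STRATUM:
        # One more hop would exceed the limit, so the server is treated
        # as unsynchronized rather than as a deeper stratum.
        return UNSYNCHRONIZED
    return upstream_stratum + 1


for upstream in (0, 1, 2, 15, 16):
    print(upstream, "->", stratum_from_upstream(upstream))
```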
Implementation
Our simulation starts with the NTP message structure:
from dataclasses import dataclass


@dataclass
class NTPMessage:
    """A simplified NTP message packet."""

    # The four NTP timestamps (a value of 0.0 is treated as "not recorded yet")
    t1: float = 0.0  # Client send time
    t2: float = 0.0  # Server receive time
    t3: float = 0.0  # Server transmit time
    t4: float = 0.0  # Client receive time
    # Stratum level (distance from reference clock)
    stratum: int = 0

    def calculate_offset(self) -> float:
        """Calculate clock offset using NTP algorithm.

        offset = ((t2 - t1) + (t3 - t4)) / 2
        """
        if self.t1 and self.t2 and self.t3 and self.t4:
            return ((self.t2 - self.t1) + (self.t3 - self.t4)) / 2.0
        return 0.0

    def calculate_delay(self) -> float:
        """Calculate round-trip delay.

        delay = (t4 - t1) - (t3 - t2)
        """
        if self.t1 and self.t2 and self.t3 and self.t4:
            return (self.t4 - self.t1) - (self.t3 - self.t2)
        return 0.0
The message holds the four timestamps and includes methods to calculate offset and delay using the NTP formulas.
The NTP server receives requests, records timestamps t2 and t3, and sends responses back to clients.
In our simulation, the server's clock is accurate:
it uses self.now, which is the simulation's true time.
The stratum field indicates this server's level in the time hierarchy:
class NTPServer(Process):
    """An NTP time server that responds to client requests."""

    def init(
        self,
        name: str,
        stratum: int,
        request_queue: Queue,
        network_delay: float = 0.1,
    ):
        self.name = name
        self.stratum = stratum
        self.request_queue = request_queue
        self.network_delay = network_delay
        self.requests_served = 0

    async def run(self):
        """Process incoming NTP requests."""
        while True:
            # Wait for a request
            client_queue, message = await self.request_queue.get()
            # Record server receive time (t2)
            message.t2 = self.now
            # Simulate processing time
            await self.timeout(0.001)
            # Record server transmit time (t3)
            message.t3 = self.now
            message.stratum = self.stratum
            print(
                f"[{self.now:.3f}] {self.name} (stratum {self.stratum}): "
                f"Responding to request (t2={message.t2:.3f}, t3={message.t3:.3f})"
            )
            # Send response back to client with network delay
            await self.timeout(self.network_delay)
            await client_queue.put(message)
            self.requests_served += 1
The NTP client is more complex because it must adjust its own clock. The constructor stores the server queue, sync interval, simulated network delay, and an initial clock offset that represents how far off the client starts:
class NTPClient(Process):
    """An NTP client that synchronizes its clock with a server."""

    def init(
        self,
        name: str,
        server_queue: Queue,
        sync_interval: float,
        network_delay: float = 0.1,
        initial_offset: float = 0.0,
    ):
        self.name = name
        self.server_queue = server_queue
        self.sync_interval = sync_interval
        self.network_delay = network_delay
        # Client's local clock offset from true time
        self.clock_offset = initial_offset
        self.response_queue = Queue(self._env)
        # Statistics
        self.syncs_performed = 0
        self.offset_history = []

    def get_local_time(self) -> float:
        """Get current time according to client's local clock."""
        return self.now + self.clock_offset
The run method simply waits for each sync interval before calling _sync_with_server:
    async def run(self):
        """Periodically sync with NTP server."""
        while True:
            # Wait for sync interval
            await self.timeout(self.sync_interval)
            # Perform NTP sync
            await self._sync_with_server()
_sync_with_server executes one full NTP exchange.
It records the send time t1,
waits for the server's response containing t2 and t3,
records the receive time t4,
and then applies the calculated offset:
    async def _sync_with_server(self):
        """Execute one NTP synchronization cycle."""
        # Create request message with client send time (t1)
        message = NTPMessage(t1=self.get_local_time())
        print(
            f"[{self.now:.3f}] {self.name}: Sending sync request "
            f"(local_time={message.t1:.3f}, offset={self.clock_offset:.3f})"
        )
        # Send request with network delay
        await self.timeout(self.network_delay)
        await self.server_queue.put((self.response_queue, message))
        # Wait for response
        response = await self.response_queue.get()
        # Record client receive time (t4)
        response.t4 = self.get_local_time()
        # Calculate offset and delay
        offset = response.calculate_offset()
        delay = response.calculate_delay()
        print(
            f"[{self.now:.3f}] {self.name}: Received response "
            f"(offset={offset:.3f}, delay={delay:.3f})"
        )
        # Adjust clock: the measured offset is (server time - client time),
        # so adding it moves the local clock toward the server's clock
        self.clock_offset += offset
        self.syncs_performed += 1
        self.offset_history.append(abs(offset))
        print(
            f"[{self.now:.3f}] {self.name}: Clock adjusted, "
            f"new offset from true time: {self.clock_offset:.3f}"
        )
The client maintains a clock_offset representing how far its local clock differs from true time.
When it syncs,
it calculates the offset using the NTP algorithm and adjusts its clock accordingly.
Notice that get_local_time() returns the client's view of time,
which may differ from simulation time until synchronization occurs.
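The sign convention is easy to get wrong, so it is worth checking in isolation. In the sketch below, `clock_offset` is the client's local time minus true time, as in NTPClient, while the measured NTP offset is the server's time minus the client's. The helpers `measure` and `run_syncs`, and the assumption of a perfectly accurate server with symmetric delays, are illustrative simplifications.

```python
def measure(clock_offset, true_time=0.0, one_way=0.1, processing=0.001):
    """Return the NTP offset a client would measure from a perfect server."""
    t1 = true_time + clock_offset        # client clock at send
    t2 = true_time + one_way             # server clock at receive
    t3 = t2 + processing                 # server clock at transmit
    t4 = t3 + one_way + clock_offset     # client clock at receive
    return ((t2 - t1) + (t3 - t4)) / 2.0


def run_syncs(clock_offset, apply, rounds=3):
    """Apply `rounds` corrections and return the clock error after each."""
    history = []
    for _ in range(rounds):
        clock_offset = apply(clock_offset, measure(clock_offset))
        history.append(round(clock_offset, 6))
    return history


# Start 2.5 s fast and compare the two possible signs of the correction.
added = run_syncs(2.5, lambda clock, offset: clock + offset)
subtracted = run_syncs(2.5, lambda clock, offset: clock - offset)
print("adding the offset:     ", added)
print("subtracting the offset:", subtracted)
```

Adding the measured offset drives the error to zero on the first sync, while subtracting it doubles the error on every round.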
Running a Simulation
Let's see clock synchronization in action:
def main():
    """Simulate NTP clock synchronization."""
    env = Environment()
    # Create server queue
    server_queue = Queue(env)
    # Create NTP server (stratum 1 - connected to reference clock)
    server = NTPServer(env, "time.example.com", stratum=1, request_queue=server_queue)
    # Create clients with different initial clock offsets
    client1 = NTPClient(
        env,
        "client1.local",
        server_queue,
        sync_interval=5.0,
        initial_offset=2.5,  # 2.5 seconds fast
    )
    client2 = NTPClient(
        env,
        "client2.local",
        server_queue,
        sync_interval=5.0,
        initial_offset=-1.8,  # 1.8 seconds slow
    )
    client3 = NTPClient(
        env,
        "client3.local",
        server_queue,
        sync_interval=7.0,
        initial_offset=0.5,  # 0.5 seconds fast
    )
    # Run simulation
    env.run(until=25)
    # Print statistics
    print("\n=== NTP Synchronization Statistics ===")
    print(f"Server requests served: {server.requests_served}")
    for client in [client1, client2, client3]:
        print(f"\n{client.name}:")
        print(f" Syncs performed: {client.syncs_performed}")
        print(f" Final clock offset: {client.clock_offset:.6f}s")
        if client.offset_history:
            print(
                f" Average correction: {sum(client.offset_history) / len(client.offset_history):.6f}s"
            )
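The clients start with different clock offsets, some fast and some slow. With the measured offset added to its clock, each client should land close to true time after its first sync. The log below shows the opposite. It was produced by a run in which the measured offset was subtracted from clock_offset, so each sync doubles the error instead of removing it: client1 goes from 2.5 to 5.0, 10.0, 20.0, and finally 40.0 seconds fast. The timestamps are still worth reading closely. For example, client2's first response reports delay=0.301 rather than 0.200 because its request waited behind client1's at the server, and that one-sided wait skews its measured offset to 1.850 instead of 1.800.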
[5.000] client1.local: Sending sync request (local_time=7.500, offset=2.500)
[5.000] client2.local: Sending sync request (local_time=3.200, offset=-1.800)
[5.101] time.example.com (stratum 1): Responding to request (t2=5.100, t3=5.101)
[5.201] client1.local: Received response (offset=-2.500, delay=0.200)
[5.201] client1.local: Clock adjusted, new offset from true time: 5.000
[5.202] time.example.com (stratum 1): Responding to request (t2=5.201, t3=5.202)
[5.302] client2.local: Received response (offset=1.850, delay=0.301)
[5.302] client2.local: Clock adjusted, new offset from true time: -3.651
[7.000] client3.local: Sending sync request (local_time=7.500, offset=0.500)
[7.101] time.example.com (stratum 1): Responding to request (t2=7.100, t3=7.101)
[7.201] client3.local: Received response (offset=-0.500, delay=0.200)
[7.201] client3.local: Clock adjusted, new offset from true time: 1.000
[10.201] client1.local: Sending sync request (local_time=15.201, offset=5.000)
[10.302] client2.local: Sending sync request (local_time=6.651, offset=-3.651)
[10.302] time.example.com (stratum 1): Responding to request (t2=10.301, t3=10.302)
[10.402] client1.local: Received response (offset=-5.000, delay=0.200)
[10.402] client1.local: Clock adjusted, new offset from true time: 10.000
[10.403] time.example.com (stratum 1): Responding to request (t2=10.402, t3=10.403)
[10.503] client2.local: Received response (offset=3.651, delay=0.200)
[10.503] client2.local: Clock adjusted, new offset from true time: -7.301
[14.201] client3.local: Sending sync request (local_time=15.201, offset=1.000)
[14.302] time.example.com (stratum 1): Responding to request (t2=14.301, t3=14.302)
[14.402] client3.local: Received response (offset=-1.000, delay=0.200)
[14.402] client3.local: Clock adjusted, new offset from true time: 2.000
[15.402] client1.local: Sending sync request (local_time=25.402, offset=10.000)
[15.503] client2.local: Sending sync request (local_time=8.202, offset=-7.301)
[15.503] time.example.com (stratum 1): Responding to request (t2=15.502, t3=15.503)
[15.603] client1.local: Received response (offset=-10.000, delay=0.200)
[15.603] client1.local: Clock adjusted, new offset from true time: 20.000
[15.604] time.example.com (stratum 1): Responding to request (t2=15.603, t3=15.604)
[15.704] client2.local: Received response (offset=7.301, delay=0.200)
[15.704] client2.local: Clock adjusted, new offset from true time: -14.602
[20.603] client1.local: Sending sync request (local_time=40.603, offset=20.000)
[20.704] client2.local: Sending sync request (local_time=6.102, offset=-14.602)
[20.704] time.example.com (stratum 1): Responding to request (t2=20.703, t3=20.704)
[20.804] client1.local: Received response (offset=-20.000, delay=0.200)
[20.804] client1.local: Clock adjusted, new offset from true time: 40.000
[20.805] time.example.com (stratum 1): Responding to request (t2=20.804, t3=20.805)
[20.905] client2.local: Received response (offset=14.602, delay=0.200)
[20.905] client2.local: Clock adjusted, new offset from true time: -29.204
[21.402] client3.local: Sending sync request (local_time=23.402, offset=2.000)
[21.503] time.example.com (stratum 1): Responding to request (t2=21.502, t3=21.503)
[21.603] client3.local: Received response (offset=-2.000, delay=0.200)
[21.603] client3.local: Clock adjusted, new offset from true time: 4.000
=== NTP Synchronization Statistics ===
Server requests served: 11
client1.local:
Syncs performed: 4
Final clock offset: 40.000000s
Average correction: 9.375000s
client2.local:
Syncs performed: 4
Final clock offset: -29.204000s
Average correction: 6.851000s
client3.local:
Syncs performed: 3
Final clock offset: 4.000000s
Average correction: 1.166667s
Stratum Hierarchy
In real NTP deployments, servers form a hierarchy. Let's simulate this with a server:
class StratumServerProcess(Process):
    """Server process for a stratum N+1 NTP server."""

    def init(
        self,
        name: str,
        local_queue: Queue,
        stratum: int,
        clock_state: dict,
        network_delay: float = 0.1,
    ):
        self.name = name
        self.local_queue = local_queue
        self.stratum = stratum
        self.clock_state = clock_state  # Shared with client process
        self.network_delay = network_delay

    def get_local_time(self) -> float:
        """Get current time according to local clock."""
        return self.now + self.clock_state["offset"]

    async def run(self):
        """Serve requests from downstream clients."""
        while True:
            client_queue, message = await self.local_queue.get()
            # Record timestamps
            message.t2 = self.get_local_time()
            await self.timeout(0.001)
            message.t3 = self.get_local_time()
            message.stratum = self.stratum
            # Send response
            await self.timeout(self.network_delay)
            await client_queue.put(message)
We also need a client for the stratum simulation:
class StratumClientProcess(Process):
    """Client process for a stratum N+1 NTP server."""

    def init(
        self,
        name: str,
        upstream_queue: Queue,
        stratum: int,
        clock_state: dict,
        sync_interval: float,
        network_delay: float = 0.1,
    ):
        self.name = name
        self.upstream_queue = upstream_queue
        self.stratum = stratum
        self.clock_state = clock_state  # Shared with server process
        self.sync_interval = sync_interval
        self.network_delay = network_delay
        self.response_queue = Queue(self._env)

    def get_local_time(self) -> float:
        """Get current time according to local clock."""
        return self.now + self.clock_state["offset"]

    async def run(self):
        """Sync with upstream server."""
        while True:
            await self.timeout(self.sync_interval)
            # Send request to upstream
            message = NTPMessage(t1=self.get_local_time())
            await self.timeout(self.network_delay)
            await self.upstream_queue.put((self.response_queue, message))
            # Wait for response
            response = await self.response_queue.get()
            response.t4 = self.get_local_time()
            # Adjust clock (updates shared state); the measured offset is
            # (upstream time - local time), so it is added to the local clock
            offset = response.calculate_offset()
            self.clock_state["offset"] += offset
            print(
                f"[{self.now:.3f}] {self.name} (stratum {self.stratum}): "
                f"Synced with upstream, offset={offset:.3f}"
            )
A stratum N server needs to both sync with stratum N-1 (as a client) and serve stratum N+1 clients (as a server). We implement this with two separate processes that share clock state via a dictionary. The client process syncs with upstream and updates the shared clock offset. The server process reads from the shared clock offset when responding to downstream requests.
def main():
    """Demonstrate NTP stratum hierarchy."""
    env = Environment()
    # Stratum 1: Primary time server
    s1_queue = Queue(env)
    stratum1 = NTPServer(env, "stratum1.time.gov", stratum=1, request_queue=s1_queue)
    # Stratum 2: Secondary servers syncing with stratum 1
    # Each stratum 2 server has both client and server processes
    s2a_queue = Queue(env)
    s2a_clock = {"offset": 0.0}  # Shared clock state
    StratumClientProcess(
        env,
        "stratum2a.org",
        s1_queue,
        stratum=2,
        clock_state=s2a_clock,
        sync_interval=10.0,
    )
    StratumServerProcess(
        env, "stratum2a.org", s2a_queue, stratum=2, clock_state=s2a_clock
    )
    s2b_queue = Queue(env)
    s2b_clock = {"offset": 0.0}  # Shared clock state
    StratumClientProcess(
        env,
        "stratum2b.org",
        s1_queue,
        stratum=2,
        clock_state=s2b_clock,
        sync_interval=10.0,
    )
    StratumServerProcess(
        env, "stratum2b.org", s2b_queue, stratum=2, clock_state=s2b_clock
    )
    # Stratum 3: End clients
    client_a = NTPClient(
        env, "client_a", s2a_queue, sync_interval=5.0, initial_offset=3.0
    )
    client_b = NTPClient(
        env, "client_b", s2b_queue, sync_interval=5.0, initial_offset=-2.0
    )
    # Run simulation
    env.run(until=35)
    print("\n=== Stratum Hierarchy Results ===")
    print(f"Stratum 1 server requests: {stratum1.requests_served}")
    print(f"\nStratum 2a clock offset: {s2a_clock['offset']:.6f}s")
    print(f"Stratum 2b clock offset: {s2b_clock['offset']:.6f}s")
    print(f"\nClient A final offset: {client_a.clock_offset:.6f}s")
    print(f"Client B final offset: {client_b.clock_offset:.6f}s")
In this simulation, the stratum 1 server stands in for one synced to a reference clock (simulated as perfect time), the stratum 2 servers sync with stratum 1, and the end clients sync with stratum 2. Synchronization error accumulates as you go down the hierarchy, but the result is still accurate enough for most purposes. A stratum 3 client might be accurate to within a few tens of milliseconds, which is perfectly adequate for log timestamps or cache expiration.
Why does error accumulate? At each stratum, the server applies the offset it calculated from its upstream. That calculation already has some error, because the upstream is not perfectly accurate and the network delay is not perfectly symmetric. When a downstream server syncs with this slightly-off clock, its own offset calculation adds another error on top. The errors are not simply additive, since they depend on network jitter and path asymmetry and can partly cancel, but as a rough rule of thumb each additional stratum adds about 1–5 ms of error in a well-run network: stratum 2 servers are accurate to around 1–10 ms, and stratum 3 to around 5–20 ms. What limits a stratum 3 server is therefore the accuracy of the stratum 2 server it follows plus the asymmetry and jitter on the path between them. NTP also caps the depth of the hierarchy: stratum 15 is the deepest usable level, and stratum 16 means "unsynchronized".
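A rough way to see how error accumulates is to chain the asymmetry error from hop to hop. The sketch below assumes each server's clock ends up wrong by its upstream's error plus half the difference between its outbound and inbound delay to that upstream. The function name `clock_errors` and the per-hop delays are hypothetical, and jitter and clock drift are ignored.

```python
def clock_errors(hops):
    """Return each server's clock error given per-hop (out, back) delays.

    hops[0] is the path from the first synchronizing server to a perfect
    upstream clock; each later hop syncs to the server before it.
    """
    errors = []
    upstream_error = 0.0               # the top of the chain is assumed perfect
    for out, back in hops:
        # A server inherits its upstream's error and adds its own
        # asymmetry error of (out - back) / 2.
        upstream_error += (out - back) / 2.0
        errors.append(upstream_error)
    return errors


# Hypothetical one-way delays in seconds for stratum 2, 3, and 4 servers.
hops = [(0.012, 0.010), (0.020, 0.014), (0.015, 0.019)]
for stratum, error in enumerate(clock_errors(hops), start=2):
    print(f"stratum {stratum}: error {error * 1000:+.1f} ms")
```

In this example the third hop's asymmetry has the opposite sign from the first two, so it cancels part of the inherited error; this is why the total is not simply the sum of per-hop magnitudes.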
Exercises
- Run the basic simulation with network delay doubled. How many sync cycles does each client need to converge within 0.01 of true time? Now try halving the delay. What does this tell you about the relationship between network latency and sync accuracy?
- The offset formula assumes symmetric delay. Modify the client to simulate asymmetric routing by using different outbound and inbound delays. Specifically, set outbound delay to 0.1 and inbound delay to 0.5. What offset does the client calculate? What is the true offset? By how much does asymmetry mislead the client? (Starter: add inbound_delay and outbound_delay parameters to NTPClient.)
- A client that syncs once per interval will drift between syncs because real hardware clocks are not perfectly accurate. Add a drift_rate parameter to the client that adds a small error per time unit (e.g., 0.001 per unit). How does drift affect the client's accuracy between syncs? What sync interval keeps the client within 0.05 of true time given drift 0.001?
- In the stratum hierarchy simulation, what happens if the stratum-1 server is restarted with a large initial offset (say, +10)? Trace through how the error propagates to stratum-2 and then to the end clients. After how many sync cycles do the end clients recover?
- Real NTP clients query multiple servers and use a "best" algorithm to reject outliers before computing the offset. Simulate a misbehaving server that always reports a time 5.0 higher than true time. Add a second honest server. Implement a simple two-server client that takes the offset from whichever server has the lower delay. Does this correctly identify and ignore the misbehaving server?