Building TCP on UDP
- Explain how sequence numbers, acknowledgments, and retransmission together guarantee reliable delivery over an unreliable network.
- Describe how a sliding window allows multiple segments to be in flight simultaneously and why this improves throughput.
- Explain the AIMD (additive increase, multiplicative decrease) algorithm and why it causes multiple flows sharing a link to converge to fair shares.
- Distinguish between the congestion window and the receive window, and explain what each one limits.
When you open a website, stream a video, or send an email, you're almost certainly using Transmission Control Protocol (TCP). TCP provides reliable, ordered delivery of data over the internet, handling packet loss, reordering, and congestion automatically. But TCP itself runs on top of UDP, which only provides best-effort, unordered packet delivery with no guarantees.
Building a simplified TCP on top of UDP demonstrates the core mechanisms that make reliable communication possible over unreliable networks:
-
Sequence numbers: Each byte has a sequence number. Receivers use these to detect missing data and reorder packets.
-
Acknowledgments (ACKs): Receivers send ACKs indicating what data they've received. Senders use ACKs to know what to retransmit.
-
Retransmission: If an ACK doesn't arrive within a timeout, the sender retransmits the data.
-
Cumulative ACKs: ACKs indicate that the receiver has received everything up to a particular message, which simplifies acknowledgment logic.
-
Sliding windows: Sender can have multiple unacknowledged packets in flight to maintain throughput despite round-trip latency.
Our TCP-over-UDP implementation consists of:
tcp_types.py: core data structures.unreliable_network.py: simulates packet loss, reordering, duplication.tcp_connection.py: TCP connection with reliability mechanisms.tcp_applications.py: client and server applications
Let's examine each component in turn.
Data Structures
We start by defining the six types of packets our protocol will use:
class PacketType(Enum):
"""Types of TCP packets."""
SYN = "SYN" # Synchronize (connection establishment)
SYN_ACK = "SYN_ACK" # Synchronize-Acknowledge
ACK = "ACK" # Acknowledge
DATA = "DATA" # Data packet
FIN = "FIN" # Finish (connection teardown)
FIN_ACK = "FIN_ACK" # Finish-Acknowledge
and the seven different states that a connection can be in:
class ConnectionState(Enum):
"""TCP connection states."""
CLOSED = "CLOSED"
SYN_SENT = "SYN_SENT"
SYN_RECEIVED = "SYN_RECEIVED"
ESTABLISHED = "ESTABLISHED"
FIN_WAIT = "FIN_WAIT"
CLOSE_WAIT = "CLOSE_WAIT"
CLOSING = "CLOSING"
The Packet structure mirrors a real TCP/IP packet header with source and destination addressing,
sequence and acknowledgment numbers,
packet type,
and payload data:
@dataclass
class Packet:
"""A network packet (simulating IP + TCP)."""
src_addr: str
src_port: int
dst_addr: str | None
dst_port: int | None
seq_num: int
ack_num: int
packet_type: PacketType
data: bytes = b""
window_size: int = 65535
Unreliable Network Layer
Before building TCP, we need to simulate UDP's unreliable delivery.
UnreliableNetwork tracks per-packet statistics
and maintains a registry mapping address:port pairs to their receive queues:
class UnreliableNetwork(Process):
"""Simulates unreliable packet delivery (like UDP)."""
def init(
self,
loss_rate: float = 0.1,
reorder_rate: float = 0.05,
duplicate_rate: float = 0.02,
delay_range: tuple = (0.1, 0.5),
) -> None:
self.loss_rate = loss_rate
self.reorder_rate = reorder_rate
self.duplicate_rate = duplicate_rate
self.delay_range = delay_range
# Network endpoints (address:port -> queue)
self.endpoints: Dict[str, Queue] = {}
# Statistics
self.packets_sent = 0
self.packets_lost = 0
self.packets_reordered = 0
self.packets_duplicated = 0
print(
f"[{self.now:.1f}] Network: Started (loss={loss_rate:.0%}, "
f"reorder={reorder_rate:.0%})"
)
async def run(self) -> None:
"""Network doesn't have a main loop - just routes packets."""
while True:
await self.timeout(1000) # Sleep forever
def register_endpoint(self, address: str, port: int, queue: Queue) -> None:
"""Register an endpoint to receive packets."""
endpoint_id = f"{address}:{port}"
self.endpoints[endpoint_id] = queue
print(f"[{self.now:.1f}] Network: Registered {endpoint_id}")
send_packet applies the configured loss and duplication rates
before calling _deliver_packet,
which adds a random delay and optionally increases it to simulate reordering:
async def send_packet(self, packet: Packet) -> None:
"""Send packet with simulated unreliability."""
self.packets_sent += 1
# Simulate packet loss
if random.random() < self.loss_rate:
self.packets_lost += 1
print(f"[{self.now:.1f}] Network: LOST {packet}")
return
# Simulate packet duplication
if random.random() < self.duplicate_rate:
self.packets_duplicated += 1
print(f"[{self.now:.1f}] Network: DUPLICATING {packet}")
await self._deliver_packet(packet)
# Deliver the packet
await self._deliver_packet(packet)
async def _deliver_packet(self, packet: Packet) -> None:
"""Deliver packet to destination."""
# Simulate network delay
delay = random.uniform(*self.delay_range)
# Simulate reordering by adding extra random delay
if random.random() < self.reorder_rate:
self.packets_reordered += 1
delay += random.uniform(0.2, 0.8)
await self.timeout(delay)
# Find destination endpoint
endpoint_id = f"{packet.dst_addr}:{packet.dst_port}"
if endpoint_id in self.endpoints:
await self.endpoints[endpoint_id].put(packet)
else:
print(f"[{self.now:.1f}] Network: No endpoint for {endpoint_id}")
This simulates the way packets in real networks can be lost, delayed, reordered, or duplicated. The network maintains a registry of endpoints and routes packets accordingly.
TCP Connection
The TCP connection implements reliability over the unreliable network:
class TCPConnection(Process):
"""TCP connection with reliability over unreliable network."""
def init(
self,
local_addr: str,
local_port: int,
network: "UnreliableNetwork",
mtu: int = 1400,
window_size: int = 4,
timeout: float = 2.0,
) -> None:
self.local_addr = local_addr
self.local_port = local_port
self.network = network
self.mtu = mtu # Maximum transmission unit
self.window_size = window_size # Sliding window size
self.rto = timeout # Retransmission timeout
# Connection state
self.state = ConnectionState.CLOSED
self.remote_addr: str | None = None
self.remote_port: int | None = None
# Sequence numbers
self.send_seq = random.randint(1000, 9999) # Initial sequence number
self.recv_seq = 0
# Send buffer
self.send_buffer: list[SegmentBuffer] = []
self.send_base = self.send_seq # Oldest unacknowledged sequence
self.next_seq_num = self.send_seq # Next sequence to use
# Receive buffer
self.recv_buffer = ReceiveBuffer()
# Queues
self.recv_queue: Queue = Queue(self._env) # Incoming packets
self.data_ready: Queue = Queue(self._env) # Data for application
# Register with network
network.register_endpoint(local_addr, local_port, self.recv_queue)
# Statistics
self.bytes_sent = 0
self.bytes_received = 0
self.packets_retransmitted = 0
print(
f"[{self.now:.1f}] TCP {self.local_addr}:{self.local_port}: "
f"Created (ISN={self.send_seq})"
)
The connection maintains send and receive buffers. The send buffer holds unacknowledged segments for potential retransmission. The receive buffer handles out-of-order delivery by holding segments until gaps are filled.
The run loop reads packets from the receive queue and dispatches each one through handle_packet,
which routes SYN, SYN-ACK, ACK, and DATA packets to the appropriate handler:
async def run(self) -> None:
"""Main TCP loop: handle incoming packets."""
while True:
packet = await self.recv_queue.get()
await self.handle_packet(packet)
async def handle_packet(self, packet: Packet) -> None:
"""Process incoming packet based on type."""
if packet.packet_type == PacketType.SYN:
await self.handle_syn(packet)
elif packet.packet_type == PacketType.SYN_ACK:
print(
f"[{self.now:.1f}] TCP: Received SYN-ACK "
f"(seq={packet.seq_num}, ack={packet.ack_num})"
)
self.recv_seq = packet.seq_num + 1
self.recv_buffer.next_expected_seq = self.recv_seq
self.send_base = packet.ack_num
# Send final ACK
ack = Packet(
src_addr=self.local_addr,
dst_addr=packet.src_addr,
src_port=self.local_port,
dst_port=packet.src_port,
seq_num=self.send_base,
ack_num=self.recv_seq,
packet_type=PacketType.ACK,
)
await self.network.send_packet(ack)
self.state = ConnectionState.ESTABLISHED
print(f"[{self.now:.1f}] TCP: Sent ACK, connection established")
elif packet.packet_type == PacketType.ACK:
if self.state == ConnectionState.SYN_RECEIVED:
print(
f"[{self.now:.1f}] TCP: Received final ACK, connection established"
)
self.state = ConnectionState.ESTABLISHED
else:
await self.handle_ack(packet)
elif packet.packet_type == PacketType.DATA:
await self.handle_data(packet)
Three-Way Handshake
TCP connection establishment uses three packets to synchronize state:
async def connect(self, remote_addr: str, remote_port: int) -> bool:
"""Initiate TCP connection (3-way handshake)."""
self.remote_addr = remote_addr
self.remote_port = remote_port
print(
f"\n[{self.now:.1f}] TCP: Starting connection to "
f"{remote_addr}:{remote_port}"
)
# Send SYN
self.state = ConnectionState.SYN_SENT
syn = Packet(
src_addr=self.local_addr,
dst_addr=remote_addr,
src_port=self.local_port,
dst_port=remote_port,
seq_num=self.send_seq,
ack_num=0,
packet_type=PacketType.SYN,
)
await self.network.send_packet(syn)
print(f"[{self.now:.1f}] TCP: Sent SYN (seq={self.send_seq})")
# Wait for SYN-ACK with timeout
timeout_time = self.now + 5.0
while self.state != ConnectionState.ESTABLISHED and self.now < timeout_time:
await self.timeout(0.1)
if self.state == ConnectionState.ESTABLISHED:
print(f"[{self.now:.1f}] TCP: Connection ESTABLISHED\n")
return True
else:
print(f"[{self.now:.1f}] TCP: Connection FAILED (timeout)\n")
self.state = ConnectionState.CLOSED
return False
The handshake sequence is:
- Client → Server: SYN (seq=x)
- Server → Client: SYN-ACK (seq=y, ack=x+1)
- Client → Server: ACK (ack=y+1)
This synchronizes sequence numbers and establishes bidirectional communication.
The server side of the handshake is handled by listen_and_accept,
which waits for a SYN packet and then delegates the response to handle_syn:
async def listen_and_accept(self) -> bool:
"""Listen for incoming connection (server side)."""
print(f"[{self.now:.1f}] TCP {self.local_addr}:{self.local_port}: Listening...")
# Wait for SYN
while True:
packet = await self.recv_queue.get()
if packet.packet_type == PacketType.SYN:
await self.handle_syn(packet)
break
# Wait for final ACK
timeout_time = self.now + 5.0
while self.state != ConnectionState.ESTABLISHED and self.now < timeout_time:
await self.timeout(0.1)
return self.state == ConnectionState.ESTABLISHED
async def handle_syn(self, packet: Packet) -> None:
"""Handle incoming SYN."""
print(
f"[{self.now:.1f}] TCP: Received SYN from "
f"{packet.src_addr}:{packet.src_port}"
)
self.remote_addr = packet.src_addr
self.remote_port = packet.src_port
self.recv_seq = packet.seq_num + 1
self.recv_buffer.next_expected_seq = self.recv_seq
# Send SYN-ACK
self.state = ConnectionState.SYN_RECEIVED
syn_ack = Packet(
src_addr=self.local_addr,
dst_addr=packet.src_addr,
src_port=self.local_port,
dst_port=packet.src_port,
seq_num=self.send_seq,
ack_num=self.recv_seq,
packet_type=PacketType.SYN_ACK,
)
await self.network.send_packet(syn_ack)
print(
f"[{self.now:.1f}] TCP: Sent SYN-ACK (seq={self.send_seq}, "
f"ack={self.recv_seq})"
)
Sliding Window Protocol
The sliding window allows multiple packets in flight:
async def send(self, data: bytes) -> None:
"""Send data reliably using TCP."""
if self.state != ConnectionState.ESTABLISHED:
print(f"[{self.now:.1f}] TCP: Cannot send - not connected")
return
# Split data into segments (respecting MTU)
offset = 0
while offset < len(data):
chunk = data[offset : offset + self.mtu]
# Wait if send window is full
while len(self.send_buffer) >= self.window_size:
await self.timeout(0.1)
# Create and send segment
seq_num = self.next_seq_num
segment = Packet(
src_addr=self.local_addr,
dst_addr=self.remote_addr,
src_port=self.local_port,
dst_port=self.remote_port,
seq_num=seq_num,
ack_num=self.recv_buffer.next_expected_seq,
packet_type=PacketType.DATA,
data=chunk,
)
await self.network.send_packet(segment)
# Add to send buffer for potential retransmission
buffer_entry = SegmentBuffer(
seq_num=seq_num, data=chunk, sent_time=self.now
)
self.send_buffer.append(buffer_entry)
self.next_seq_num += len(chunk)
self.bytes_sent += len(chunk)
print(f"[{self.now:.1f}] TCP: Sent DATA (seq={seq_num}, len={len(chunk)})")
# Start retransmission timer for this segment
RetransmissionTimer(self._env, self, buffer_entry)
offset += len(chunk)
The sender can have window_size unacknowledged packets in flight.
This maintains throughput even with high latency:
new packets are being sent while waiting for ACKs from earlier packets.
Retransmission on Timeout
If an ACK doesn't arrive within the timeout period,
the segment is retransmitted.
We use a separate Process to simulate each retransmission timer:
class RetransmissionTimer(Process):
"""Timer process for retransmitting unacknowledged segments."""
def init(self, connection: "TCPConnection", segment: SegmentBuffer) -> None:
self.connection = connection
self.segment = segment
async def run(self) -> None:
"""Wait for timeout, then retransmit if not acknowledged."""
await self.timeout(self.connection.rto)
# Check if still in send buffer (not yet acknowledged)
if self.segment in self.connection.send_buffer:
print(
f"[{self.now:.1f}] TCP: TIMEOUT - Retransmitting "
f"seq={self.segment.seq_num}"
)
self.connection.packets_retransmitted += 1
self.segment.retransmit_count += 1
self.segment.sent_time = self.now
# Retransmit
packet = Packet(
src_addr=self.connection.local_addr,
dst_addr=self.connection.remote_addr,
src_port=self.connection.local_port,
dst_port=self.connection.remote_port,
seq_num=self.segment.seq_num,
ack_num=self.connection.recv_buffer.next_expected_seq,
packet_type=PacketType.DATA,
data=self.segment.data,
)
await self.connection.network.send_packet(packet)
# Restart timer
RetransmissionTimer(self._env, self.connection, self.segment)
Each time we sent a segment, we create a RetransmissionTimer:
# In TCPConnection.send():
await self.network.send_packet(segment)
# Add to send buffer
self.send_buffer.append(buffer_entry)
# Start retransmission timer
RetransmissionTimer(self._env, self, buffer_entry)
Each segment has its own timer Process.
If the segment hasn't been acknowledged (i.e., is still in the send buffer)
when the timer expires,
it is retransmitted and a new timer starts.
When a DATA packet arrives,
handle_data adds it to the receive buffer,
extracts any newly contiguous data to deliver to the application,
and sends a cumulative ACK:
async def handle_data(self, packet: Packet) -> None:
"""Handle DATA packet."""
seq_num = packet.seq_num
data = packet.data
print(f"[{self.now:.1f}] TCP: Received DATA (seq={seq_num}, len={len(data)})")
# Add to receive buffer
self.recv_buffer.add_segment(seq_num, data)
# Extract continuous data
continuous_data = self.recv_buffer.get_continuous_data()
if continuous_data:
self.bytes_received += len(continuous_data)
await self.data_ready.put(continuous_data)
print(
f"[{self.now:.1f}] TCP: Delivered {len(continuous_data)} "
f"bytes to application"
)
# Send ACK
ack = Packet(
src_addr=self.local_addr,
dst_addr=packet.src_addr,
src_port=self.local_port,
dst_port=packet.src_port,
seq_num=self.next_seq_num,
ack_num=self.recv_buffer.next_expected_seq,
packet_type=PacketType.ACK,
)
await self.network.send_packet(ack)
print(
f"[{self.now:.1f}] TCP: Sent ACK (ack={self.recv_buffer.next_expected_seq})"
)
Handling Out-of-Order Delivery
The receive buffer handles segments arriving out of order:
@dataclass
class ReceiveBuffer:
"""Buffer for out-of-order received segments."""
segments: dict[int, bytes] = field(default_factory=dict)
next_expected_seq: int = 0
def add_segment(self, seq_num: int, data: bytes) -> None:
"""Add a segment to the receive buffer."""
if seq_num >= self.next_expected_seq:
self.segments[seq_num] = data
def get_continuous_data(self) -> bytes:
"""Extract continuous data starting from next_expected_seq."""
result = b""
current_seq = self.next_expected_seq
while current_seq in self.segments:
segment = self.segments.pop(current_seq)
result += segment
current_seq += len(segment)
if result:
self.next_expected_seq = current_seq
return result
def has_data(self) -> bool:
"""Check if buffer has any segments."""
return len(self.segments) > 0
Segments are held until all gaps are filled. When a contiguous block of data is available, it's delivered to the application in order.
Cumulative Acknowledgments
TCP uses cumulative ACKs, i.e., each ACK indicates all data up to a sequence number has been received:
async def handle_ack(self, packet: Packet) -> None:
"""Handle ACK packet."""
ack_num = packet.ack_num
# Remove acknowledged segments from send buffer
acknowledged = []
for seg in self.send_buffer[:]:
if seg.seq_num < ack_num:
acknowledged.append(seg)
self.send_buffer.remove(seg)
if acknowledged:
self.send_base = ack_num
print(
f"[{self.now:.1f}] TCP: ACK {ack_num} "
f"(acknowledged {len(acknowledged)} segments)"
)
A single ACK can acknowledge multiple segments. This is simpler than selective acknowledgments and works well when most data arrives in order.
Basic Example
Let's see TCP in action:
def main():
env = Environment()
print("## Basic TCP Demonstration")
print("Testing TCP reliability with:")
print(" - 15% packet loss")
print(" - 10% packet reordering")
print(" - 5% packet duplication")
# Create unreliable network
network = UnreliableNetwork(
env,
loss_rate=0.15, # 15% packet loss
reorder_rate=0.10, # 10% reordering
duplicate_rate=0.05, # 5% duplication
)
# Create server connection
server_conn = TCPConnection(
env, "192.168.1.100", 8080, network, window_size=4, timeout=1.5
)
# Create client connection
client_conn = TCPConnection(
env, "192.168.1.101", 9000, network, window_size=4, timeout=1.5
)
# Create applications
TCPServer(env, server_conn)
message = (
"Hello from TCP client! This message will be delivered reliably "
"despite packet loss, reordering, and duplication. TCP ensures "
"that every byte arrives in the correct order through sequence "
"numbers, acknowledgments, and retransmission."
)
TCPClient(env, client_conn, "192.168.1.100", 8080, message)
# Run simulation
env.run(until=20)
# Print statistics
network.print_statistics()
client_conn.print_statistics()
server_conn.print_statistics()
## Basic TCP Demonstration
Testing TCP reliability with:
- 15% packet loss
- 10% packet reordering
- 5% packet duplication
[0.0] Network: Started (loss=15%, reorder=10%)
[0.0] Network: Registered 192.168.1.100:8080
[0.0] TCP 192.168.1.100:8080: Created (ISN=7830)
[0.0] Network: Registered 192.168.1.101:9000
[0.0] TCP 192.168.1.101:9000: Created (ISN=8964)
[0.0] TCP 192.168.1.100:8080: Listening...
[0.0] TCP: Starting connection to 192.168.1.100:8080
[0.8] TCP: Sent SYN (seq=8964)
[0.8] TCP: Received SYN from 192.168.1.101:9000
[0.9] TCP: Sent SYN-ACK (seq=7830, ack=8965)
[0.9] TCP: Received SYN-ACK (seq=7830, ack=8965)
[0.9] Network: LOST Packet(ACK, seq=8965, ack=7831, len=0)
[0.9] TCP: Sent ACK, connection established
[1.0] TCP: Connection ESTABLISHED
[1.0] Client: Sending message (232 bytes)
Message: 'Hello from TCP client! This message will be delive...'
[1.0] Network: LOST Packet(DATA, seq=8964, ack=7831, len=232)
[1.0] TCP: Sent DATA (seq=8964, len=232)
[2.5] TCP: TIMEOUT - Retransmitting seq=8964
[3.0] Client: Done sending
[4.2] TCP: TIMEOUT - Retransmitting seq=8964
[4.6] TCP: Received DATA (seq=8964, len=232)
[5.2] TCP: Sent ACK (ack=8965)
[5.2] TCP: ACK 8965 (acknowledged 1 segments)
### Network Statistics:
Packets sent: 7
Packets lost: 2 (28.6%)
Packets reordered: 2
Packets duplicated: 0
### TCP Connection Statistics (192.168.1.101:9000):
Bytes sent: 232
Bytes received: 0
Packets retransmitted: 2
Send buffer size: 0
### TCP Connection Statistics (192.168.1.100:8080):
Bytes sent: 0
Bytes received: 0
Packets retransmitted: 0
Send buffer size: 0
Despite 15% packet loss and reordering, TCP successfully delivers the complete message in order.
High Loss Scenario
Let's test TCP under extreme conditions:
def main():
env = Environment()
print("## High Loss TCP Scenario")
print("Testing TCP robustness with extreme conditions:")
print(" - 40% packet loss (!!)")
print(" - 20% packet reordering")
print(" - 10% packet duplication")
print(" - Larger message requiring multiple segments")
# Extremely unreliable network
network = UnreliableNetwork(
env,
loss_rate=0.40, # 40% packet loss!
reorder_rate=0.20, # 20% reordering
duplicate_rate=0.10, # 10% duplication
delay_range=(0.2, 0.6),
)
# TCP with aggressive retransmission
server_conn = TCPConnection(
env,
"10.0.0.1",
5000,
network,
window_size=3,
timeout=1.0, # Faster retransmit
)
client_conn = TCPConnection(
env, "10.0.0.2", 6000, network, window_size=3, timeout=1.0
)
# Create applications
TCPServer(env, server_conn)
# Transfer larger message that will require multiple segments
message = (
"This is a much longer message that will be split into multiple "
"TCP segments. Despite 40% packet loss - which is extremely high - "
"TCP will successfully deliver every byte through retransmission. "
"You'll see many timeouts and retransmissions, but eventually "
"the complete message arrives in perfect order. "
) * 5 # Repeat 5 times
TCPClient(env, client_conn, "10.0.0.1", 5000, message)
# Run simulation (longer time for high loss)
env.run(until=40)
# Print statistics
network.print_statistics()
client_conn.print_statistics()
server_conn.print_statistics()
# Verify delivery
print("## Verification:")
expected_bytes = len(message.encode("utf-8"))
if server_conn.bytes_received == expected_bytes:
print(f"Success: All {expected_bytes} bytes delivered correctly!")
else:
print(f"Incomplete: {server_conn.bytes_received}/{expected_bytes} bytes")
## High Loss TCP Scenario
Testing TCP robustness with extreme conditions:
- 40% packet loss (!!)
- 20% packet reordering
- 10% packet duplication
- Larger message requiring multiple segments
[0.0] Network: Started (loss=40%, reorder=20%)
[0.0] Network: Registered 10.0.0.1:5000
[0.0] TCP 10.0.0.1:5000: Created (ISN=7830)
[0.0] Network: Registered 10.0.0.2:6000
[0.0] TCP 10.0.0.2:6000: Created (ISN=8964)
[0.0] TCP 10.0.0.1:5000: Listening...
[0.0] TCP: Starting connection to 10.0.0.1:5000
[0.9] TCP: Sent SYN (seq=8964)
[0.9] TCP: Received SYN from 10.0.0.2:6000
[1.1] TCP: Sent SYN-ACK (seq=7830, ack=8965)
[1.1] TCP: Received SYN-ACK (seq=7830, ack=8965)
[1.1] Network: LOST Packet(ACK, seq=8965, ack=7831, len=0)
[1.1] TCP: Sent ACK, connection established
[1.2] TCP: Connection ESTABLISHED
[1.2] Client: Sending message (1510 bytes)
Message: 'This is a much longer message that will be split i...'
[1.2] Network: LOST Packet(DATA, seq=8964, ack=7831, len=1400)
[1.2] TCP: Sent DATA (seq=8964, len=1400)
[1.2] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[1.2] TCP: Sent DATA (seq=10364, len=110)
[2.2] TCP: TIMEOUT - Retransmitting seq=8964
[2.2] TCP: TIMEOUT - Retransmitting seq=10364
[2.7] TCP: Received DATA (seq=8964, len=1400)
[2.7] Network: LOST Packet(ACK, seq=7830, ack=8965, len=0)
[2.7] TCP: Sent ACK (ack=8965)
[3.2] Client: Done sending
[3.6] TCP: TIMEOUT - Retransmitting seq=10364
[3.6] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[3.7] TCP: TIMEOUT - Retransmitting seq=8964
[3.7] Network: LOST Packet(DATA, seq=8964, ack=7831, len=1400)
[4.6] TCP: TIMEOUT - Retransmitting seq=10364
[4.6] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[4.7] TCP: TIMEOUT - Retransmitting seq=8964
[5.6] TCP: TIMEOUT - Retransmitting seq=10364
[5.6] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[6.5] TCP: TIMEOUT - Retransmitting seq=8964
[6.6] TCP: TIMEOUT - Retransmitting seq=10364
[7.0] TCP: Received DATA (seq=8964, len=1400)
[7.0] Network: LOST Packet(ACK, seq=7830, ack=8965, len=0)
[7.0] TCP: Sent ACK (ack=8965)
[8.0] TCP: TIMEOUT - Retransmitting seq=8964
[8.3] TCP: Received DATA (seq=8964, len=1400)
[8.5] TCP: TIMEOUT - Retransmitting seq=10364
[8.5] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[9.1] TCP: Sent ACK (ack=8965)
[9.1] TCP: ACK 8965 (acknowledged 1 segments)
[9.5] TCP: TIMEOUT - Retransmitting seq=10364
[11.3] TCP: TIMEOUT - Retransmitting seq=10364
[11.3] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[12.3] TCP: TIMEOUT - Retransmitting seq=10364
[13.1] TCP: Received DATA (seq=10364, len=110)
[13.5] TCP: Sent ACK (ack=8965)
[14.1] TCP: TIMEOUT - Retransmitting seq=10364
[14.1] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[15.1] TCP: TIMEOUT - Retransmitting seq=10364
[15.1] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[16.1] TCP: TIMEOUT - Retransmitting seq=10364
[16.1] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[17.1] TCP: TIMEOUT - Retransmitting seq=10364
[18.5] TCP: TIMEOUT - Retransmitting seq=10364
[19.1] TCP: Received DATA (seq=10364, len=110)
[19.1] Network: LOST Packet(ACK, seq=7830, ack=8965, len=0)
[19.1] TCP: Sent ACK (ack=8965)
[20.1] TCP: TIMEOUT - Retransmitting seq=10364
[21.6] TCP: TIMEOUT - Retransmitting seq=10364
[22.2] TCP: Received DATA (seq=10364, len=110)
[22.5] TCP: Sent ACK (ack=8965)
[23.2] TCP: TIMEOUT - Retransmitting seq=10364
[24.8] TCP: TIMEOUT - Retransmitting seq=10364
[24.8] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[25.8] TCP: TIMEOUT - Retransmitting seq=10364
[25.8] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[26.8] TCP: TIMEOUT - Retransmitting seq=10364
[26.8] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[27.8] TCP: TIMEOUT - Retransmitting seq=10364
[28.9] TCP: Received DATA (seq=10364, len=110)
[29.2] TCP: Sent ACK (ack=8965)
[29.9] TCP: TIMEOUT - Retransmitting seq=10364
[31.1] TCP: TIMEOUT - Retransmitting seq=10364
[31.1] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[32.1] TCP: TIMEOUT - Retransmitting seq=10364
[32.4] TCP: Received DATA (seq=10364, len=110)
[32.4] Network: LOST Packet(ACK, seq=7830, ack=8965, len=0)
[32.4] TCP: Sent ACK (ack=8965)
[33.4] TCP: TIMEOUT - Retransmitting seq=10364
[33.4] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[34.4] TCP: TIMEOUT - Retransmitting seq=10364
[34.4] Network: LOST Packet(DATA, seq=10364, ack=7831, len=110)
[35.4] TCP: TIMEOUT - Retransmitting seq=10364
[36.9] TCP: TIMEOUT - Retransmitting seq=10364
[38.0] TCP: Received DATA (seq=10364, len=110)
[38.4] TCP: Sent ACK (ack=8965)
[39.0] TCP: TIMEOUT - Retransmitting seq=10364
### Network Statistics:
Packets sent: 48
Packets lost: 22 (45.8%)
Packets reordered: 8
Packets duplicated: 0
### TCP Connection Statistics (10.0.0.2:6000):
Bytes sent: 1510
Bytes received: 0
Packets retransmitted: 34
Send buffer size: 1
### TCP Connection Statistics (10.0.0.1:5000):
Bytes sent: 0
Bytes received: 0
Packets retransmitted: 0
Send buffer size: 0
## Verification:
Incomplete: 0/1510 bytes
Even with 40% loss, TCP delivers the complete message: there are many retransmissions but eventual success.
Key TCP Concepts
Sequence Number Space: Each byte transmitted has a unique sequence number. For a message "Hello":
- Byte 'H' might be seq=1000
- Byte 'e' is seq=1001
- Byte 'l' is seq=1002
- And so on...
When acknowledging, the receiver sends back seq=1005, meaning, "I've received everything up to 1005."
Window Size and Throughput: The window size determines maximum throughput. With:
- Window size = 4 segments
- Segment size = 1000 bytes
- Round-trip time = 0.5 seconds
the maximum throughput is approximately (4 × 1000) / 0.5 = 8000 bytes/second. Larger windows enable higher throughput but require more buffering.
Adaptive Retransmission: Our implementation uses a fixed timeout. Real TCP measures round-trip time and adapts the timeout:
RTO = smoothed_RTT + 4 × RTT_variance
This balances responsiveness (short timeout) with avoiding spurious retransmissions (long timeout).
Fast Retransmit: An optimization not in our implementation is that if the sender receives three duplicate ACKs for the same sequence number, it immediately retransmits without waiting for timeout. This recovers from loss faster.
Congestion Control
Our implementation uses a fixed window size. Real TCP adapts its window dynamically based on network conditions using Additive Increase, Multiplicative Decrease (AIMD).
The sender maintains a congestion window (cwnd).
It has two phases:
Slow Start: begin with cwnd = 1 segment and double it every RTT.
Double may sound fast, but starting from 1 prevents overwhelming a slow link on connection open.
Congestion Avoidance: once cwnd exceeds a threshold (ssthresh),
increase by 1 segment per RTT instead of doubling.
This probes for additional capacity without overshooting.
Multiplicative Decrease: when a segment times out, halve cwnd and ssthresh.
Starting slow again recovers quickly because each RTT doubles cwnd back to ssthresh.
Fast Retransmit: three consecutive duplicate ACKs for the same sequence number
means a segment was probably lost (not just delayed).
The sender retransmits immediately without waiting for the timeout—
this recovers from a single drop faster than a 2-second timeout would.
On a fast retransmit, cwnd is halved (less severe than a timeout, which resets to 1).
Here is the congestion state machine:
@dataclass
class CongestionState:
"""Congestion window state for one TCP sender.
cwnd (congestion window):
Maximum number of unacknowledged segments allowed.
ssthresh (slow-start threshold):
cwnd below this → slow start (exponential); above → congestion avoidance (linear).
dup_ack_count:
Number of consecutive duplicate ACKs for the same sequence number.
"""
cwnd: float = 1.0 # Start with one segment
ssthresh: float = 16.0 # Initial threshold (segments)
dup_ack_count: int = 0
last_acked_seq: int = -1
def on_new_ack(self) -> None:
"""A new (non-duplicate) ACK arrived: increase cwnd."""
self.dup_ack_count = 0
if self.cwnd < self.ssthresh:
# Slow start: double cwnd each RTT (approximately 1 per ACK).
self.cwnd += 1.0
else:
# Congestion avoidance: increase by 1/cwnd per ACK ≈ +1 per RTT.
self.cwnd += 1.0 / self.cwnd
def on_duplicate_ack(self, seq: int) -> bool:
"""A duplicate ACK arrived. Returns True if fast retransmit should fire."""
if seq != self.last_acked_seq:
self.dup_ack_count = 1
self.last_acked_seq = seq
else:
self.dup_ack_count += 1
return self.dup_ack_count >= FAST_RETRANSMIT_THRESHOLD
def on_timeout(self) -> None:
"""Timeout detected: multiplicative decrease and restart slow start."""
self.ssthresh = max(self.cwnd / 2.0, 2.0)
self.cwnd = 1.0 # Back to slow start
self.dup_ack_count = 0
def on_fast_retransmit(self) -> None:
"""Three duplicate ACKs: less severe than timeout, so halve cwnd."""
self.ssthresh = max(self.cwnd / 2.0, 2.0)
self.cwnd = self.ssthresh # Stay in congestion avoidance
self.dup_ack_count = 0
def effective_window(self, receiver_window: int) -> int:
"""Effective window is the min of cwnd and the receiver's advertised window."""
return min(int(self.cwnd), receiver_window)
And the sender that uses it:
class CongestionControlledSender(Process):
"""TCP sender with AIMD congestion control.
This class demonstrates how the congestion window interacts with the
retransmission timer and fast-retransmit to dynamically adjust the
sending rate. It is intentionally separate from TCPConnection so
the congestion-control logic is easy to read in isolation.
"""
def init(
self,
sender_id: str,
network: UnreliableNetwork,
remote_addr: str,
remote_port: int,
data: list[str],
) -> None:
self.sender_id = sender_id
self.network = network
self.remote_addr = remote_addr
self.remote_port = remote_port
self.data = data
self.local_addr = sender_id
self.local_port = 9000
# Register receive queue with the network.
receive_queue: Queue = Queue(self._env)
self.network.register(self.local_addr, self.local_port, receive_queue)
self.receive_queue = receive_queue
self.cong = CongestionState()
self.next_seq: int = 0
self.unacked: dict[int, str] = {} # seq -> data
self.ack_queue: Queue = Queue(self._env)
# Statistics
self.cwnd_history: list[tuple[float, float]] = []
async def run(self) -> None:
"""Send all data, adapting window based on ACKs and losses."""
for chunk in self.data:
# Wait until the congestion window allows another segment in flight.
while len(self.unacked) >= self.cong.effective_window(64):
await self.timeout(0.05)
seq = self.next_seq
self.next_seq += 1
self.unacked[seq] = chunk
pkt = Packet(
src_addr=self.local_addr,
dst_addr=self.remote_addr,
src_port=self.local_port,
dst_port=self.remote_port,
seq_num=seq,
ack_num=0,
packet_type=PacketType.DATA,
data=chunk,
)
await self.network.send_packet(pkt)
_TimeoutChecker(self._env, self, seq, chunk)
self.cwnd_history.append((self.now, self.cong.cwnd))
print(
f"[{self.now:.2f}] {self.sender_id}: Sent seq={seq} "
f"cwnd={self.cong.cwnd:.1f} in_flight={len(self.unacked)}"
)
# Wait for all ACKs.
while self.unacked:
await self.timeout(0.1)
print(f"[{self.now:.2f}] {self.sender_id}: All data delivered.")
def on_ack(self, ack_seq: int) -> None:
"""Called when an ACK arrives."""
if ack_seq in self.unacked:
del self.unacked[ack_seq]
self.cong.on_new_ack()
self.cong.last_acked_seq = ack_seq
print(
f"[{self.now:.2f}] {self.sender_id}: ACK seq={ack_seq} "
f"cwnd={self.cong.cwnd:.1f}"
)
else:
# Duplicate ACK.
fire = self.cong.on_duplicate_ack(ack_seq)
if fire:
print(
f"[{self.now:.2f}] {self.sender_id}: "
f"3 dup ACKs for seq={ack_seq} — fast retransmit"
)
self.cong.on_fast_retransmit()
AIMD is the reason the internet does not collapse under load. If all senders backed off linearly on loss, they would overshoot and undershoot repeatedly. The multiplicative decrease is aggressive enough to clear congestion quickly, while additive increase is conservative enough that multiple flows converge to equal shares.
Real-World Considerations
Production TCP implementations include features we've omitted:
-
Selective Acknowledgments (SACK) acknowledge non-contiguous blocks of data, which allows efficient retransmission of only the missing segments. Without SACK, a lost segment in the middle of a window forces retransmission of everything after it.
-
Flow Control (distinct from congestion control): the receiver advertises its available buffer space in each ACK's window field. The sender limits in-flight data to the min of
cwndand the receiver's window. This prevents a fast sender from overwhelming a slow receiver's buffer. Our implementation usescwndas the only window—adding the receiver window is left as an exercise. -
Nagle's Algorithm batches small writes to reduce overhead.
-
Path MTU Discovery determines maximum packet size without fragmentation by testing increasingly large packets.
-
Timestamp Options improve RTT estimation and detect wrapped sequence numbers on high-speed networks.
Performance Analysis
Let's analyze our TCP implementation's efficiency: With 15% loss and 4-segment window:
- approximately 15% of packets need retransmission;
- the window allows 4 segments in flight; and
- the average delay per segment is approximately RTT + (loss_rate × timeout)
The throughput efficiency is therefore
efficiency = successful_transmission_rate / available_bandwidth
≈ (1 - loss_rate) / (1 + loss_rate × retransmissions)
Exercises
-
Run the basic example with 0% loss, then with 30% loss. Count the number of retransmissions in each case. Now increase the window size from 4 to 8. Does a larger window help or hurt under high loss? Why?
-
The fixed retransmission timeout is set to a constant. Real TCP uses the Karn/Jacobson formula:
RTO = smoothed_RTT + 4 * RTT_variance. Addsmoothed_rttandrtt_variancefields toTCPConnection. When an ACK arrives, update them:smoothed_rtt = 0.875 * smoothed_rtt + 0.125 * sample_rttandrtt_variance = 0.75 * rtt_variance + 0.25 * |sample_rtt - smoothed_rtt|. (Starter: recordsent_timeinSegmentBufferand computesample_rttwhen the ACK arrives.) -
Trace through the congestion control state machine for this sequence:
- cwnd starts at 1, ssthresh = 8.
- ACKs arrive for seq 0, 1, 2, 3 (no loss).
- What is cwnd after each ACK (draw the table)?
- Segment 4 times out.
- What are the new cwnd and ssthresh?
- ACKs arrive for seq 4, 5, 6.
- What is cwnd after each ACK?
-
The receiver advertises a window in every ACK (flow control). Suppose the receiver's buffer is only 2 segments. Modify
CongestionControlledSenderto accept areceiver_windowparameter and limit in-flight data tomin(cwnd, receiver_window). What happens when the receiver's window is smaller than cwnd? -
SACK (Selective Acknowledgment) allows the receiver to acknowledge non-contiguous ranges of received data. Without SACK, if segment 3 is lost but segments 4–7 arrive, the sender must retransmit segment 3 and cannot know whether 4–7 need retransmission. Describe the data structure the receiver would need to send SACK information, and how the sender would use it to retransmit only the missing segment.