TCP/IP: The Protocol That Connected Everything

#TL;DR

By 1973, ARPANET worked — but it was a walled garden. Its protocol, NCP, assumed a single, trusted network that never dropped packets. The real world had dozens of incompatible networks, and none of them could talk to each other. Vint Cerf and Bob Kahn solved this in a hotel lobby: design a protocol that treats every network as unreliable and does all the hard work at the endpoints, not in the middle. TCP/IP became the universal language of the internet. On January 1, 1983, ARPANET flipped a switch and the modern internet was born.

#The Problem Kahn Brought to Stanford

In the spring of 1973, Bob Kahn walked into Vint Cerf’s office at Stanford with what he called “a catastrophic problem.”

ARPANET had succeeded. But it was only one network. The U.S. military now had satellite networks, mobile packet radio networks, and ARPANET — and none of them could exchange data. Each used different packet sizes, different addressing schemes, different error-handling rules. Connecting them through gateways just made the problem worse: every gateway had to understand every network’s dialect.

Kahn had spent months thinking about this and had four requirements:

Each network should work internally without changes
If a packet doesn’t arrive, the source should retransmit it
Gateways shouldn’t hold state — if one fails, packets reroute
There should be no global control

He had the constraints. He didn’t have the solution. He needed a theorist.

Cerf was that theorist.

#A Hotel Lobby in Palo Alto

Over the next few weeks, the two met wherever they could — offices, coffee shops, and most productively, the lobby of Rickey’s Hyatt House in Palo Alto. Cerf sketched diagrams on napkins. Kahn argued about failure modes. The core insight that emerged was deceptively simple:

The network is dumb. The endpoints are smart.

Instead of building a reliable network — which required every device in the middle to cooperate — they’d build an unreliable network and make the endpoints responsible for recovering from that unreliability. Gateways (later called routers) would do one thing: forward packets. They wouldn’t store state, wouldn’t verify delivery, wouldn’t do anything except look at the destination address and send the packet on.

This became known as the end-to-end principle: put intelligence at the edges, not the core. It’s one of the most consequential design decisions in computing history.
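To make the idea concrete, here is a minimal simulation, assuming a made-up 30% loss rate per hop and three hops. The names `lossy_forward` and `send_reliably` are invented for this sketch and do not come from any real stack. The hops keep no state and never retry; only the sender does.

```python
import random

def lossy_forward(packet, rng, loss_rate=0.3):
    """A 'dumb' hop: forward the packet or silently drop it. No state, no retries."""
    return None if rng.random() < loss_rate else packet

def send_reliably(payload, rng, hops=3, max_tries=50):
    """The 'smart' endpoint: resend until the packet survives every hop."""
    for attempt in range(1, max_tries + 1):
        packet = payload
        for _ in range(hops):
            packet = lossy_forward(packet, rng)
            if packet is None:
                break  # lost somewhere in the middle; no hop notices or cares
        if packet is not None:
            return packet, attempt  # arrival stands in for a returned ACK
    raise TimeoutError("gave up after max_tries attempts")

rng = random.Random(1973)  # fixed seed so the run is repeatable
delivered, tries = send_reliably(b"hello", rng)
print(delivered, "arrived after", tries, "attempt(s)")
```

All of the recovery logic lives in `send_reliably`. You could swap the hops for something faster, slower, or flakier without touching the endpoint code, which is the whole point.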

In May 1974, Cerf and Kahn published “A Protocol for Packet Network Intercommunication” in IEEE Transactions on Communications. It described a single protocol — TCP — that would handle everything.

#The Split: TCP Becomes TCP/IP

The original TCP tried to do too much. It handled addressing (where does this packet go?), routing (how does it get there?), and reliability (what happens if it doesn’t?). This was fine for a research paper. It was a nightmare for implementation.

Jon Postel — the same RFC editor who would later run the internet’s naming and numbering systems — pushed back. These were two fundamentally different problems:

Routing is a network-level concern. Every packet needs an address and a path.
Reliability is a transport-level concern. Only some applications need guaranteed delivery.

A live video stream can tolerate dropped packets. An email transfer cannot. Baking reliability into the network layer meant video would carry the overhead of guaranteed delivery whether it wanted it or not.

Postel’s argument won. In 1978, TCP was split into two:

IP (Internet Protocol) — handles addressing and routing. Fire-and-forget.
TCP (Transmission Control Protocol) — sits on top of IP, adds reliability.

Applications that needed speed over reliability could use UDP (User Datagram Protocol) directly over IP. The result was an hourglass:

  HTTP  FTP  SMTP  DNS  ...    ← many application protocols
       ╲  │  │  ╱
        ╲ │  │ ╱
    ┌────────────────┐
    │   TCP  │  UDP  │         ← transport layer
    ├────────────────┤
    │       IP       │         ← the narrow waist
    └────────────────┘
        ╱ │  │ ╲
       ╱  │  │  ╲
  Wi-Fi  Ethernet  4G  ...    ← many link layer technologies

IP is the narrow waist. Every link technology below it and every application above it speaks IP. This is why you can check email over Wi-Fi, Ethernet, or cellular — the network doesn’t care. The application doesn’t care. They both speak IP.
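One way to see the layering is to build a packet by hand. The sketch below wraps an application message in a UDP header and then in an IPv4 header, using the standard field layouts. The ports (40000, 9999) and the addresses (from the documentation-only ranges) are placeholders chosen for illustration, and nothing is actually sent on the wire.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, as used by the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

def udp_segment(src_port: int, dst_port: int, payload: bytes) -> bytes:
    # UDP header: source port, destination port, length, checksum.
    # A checksum of 0 means "not computed", which IPv4 permits.
    return struct.pack("!HHHH", src_port, dst_port, 8 + len(payload), 0) + payload

def ipv4_packet(src: str, dst: str, protocol: int, body: bytes) -> bytes:
    def header(checksum: int) -> bytes:
        return struct.pack(
            "!BBHHHBBH4s4s",
            (4 << 4) | 5,      # version 4, header length of five 32-bit words
            0,                 # type of service
            20 + len(body),    # total length in bytes
            0, 0,              # identification, flags and fragment offset
            64,                # time to live
            protocol,          # 17 = UDP, 6 = TCP
            checksum,
            socket.inet_aton(src),
            socket.inet_aton(dst),
        )
    return header(internet_checksum(header(0))) + body

message = b"hello"                                              # application layer
segment = udp_segment(40000, 9999, message)                     # transport layer
packet = ipv4_packet("192.0.2.1", "198.51.100.7", 17, segment)  # internet layer
print(len(message), len(segment), len(packet))  # 5 13 33
```

Each layer only adds its own header and treats everything inside as opaque bytes. A link layer such as Ethernet or Wi-Fi would wrap `packet` one more time in exactly the same way.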

#IP: Every Machine Gets an Address

IP version 4 gave every device on the internet a 32-bit address, written as four numbers from 0 to 255 separated by dots: 192.168.1.1, for example. Your router has one. The server hosting this page has one. When IPv4 was specified in 1981 (RFC 791), 32 bits seemed like plenty: about 4.3 billion addresses for a planet of roughly 4.5 billion people.

(It wasn’t. But that’s a 1990s problem.)
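The dotted notation is only a human-friendly spelling of one 32-bit integer. A quick check with Python's standard `ipaddress` module, using the example address from above:

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.1.1")

# The dotted form is just four bytes of a single 32-bit integer.
as_int = int(addr)
octets = list(addr.packed)
rebuilt = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]

print(as_int)             # 3232235777
print(octets)             # [192, 168, 1, 1]
print(rebuilt == as_int)  # True
print(2 ** 32)            # 4294967296 possible addresses
```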

IP routing works like postal mail. Each packet carries a destination address. Routers don’t know the full path — they only know the next hop. Each router passes the packet toward its destination based on a routing table, and the packet makes its way across the internet one hop at a time.

You → Home router → ISP router → ... → Destination
         "I'll send it       "I'll send it
          to my upstream"     to the next AS"

Routers cooperate via protocols like BGP (Border Gateway Protocol) to build and share these routing tables. They don’t need a map of the whole internet — they just need to know the next step. This is decentralization in practice: no central router knows everything, but packets find their way anyway.
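A routing table lookup can be sketched in a few lines. The table below is entirely hypothetical (the prefixes and next-hop names are made up), and real routers use specialized data structures rather than a linear scan, but the rule is the same: pick the most specific prefix that contains the destination.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "isp-upstream"),      # default route
    (ipaddress.ip_network("192.168.1.0/24"), "local-lan"),
    (ipaddress.ip_network("203.0.113.0/24"), "peer-a"),
    (ipaddress.ip_network("203.0.113.128/25"), "peer-b"),
]

def next_hop(destination: str) -> str:
    """Pick the most specific matching prefix (longest-prefix match)."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if dest in net]
    _, best_hop = max(matches, key=lambda m: m[0].prefixlen)
    return best_hop

print(next_hop("192.168.1.20"))   # local-lan
print(next_hop("203.0.113.200"))  # peer-b
print(next_hop("203.0.113.5"))    # peer-a
print(next_hop("8.8.8.8"))        # isp-upstream
```

Notice that the function never computes a full path. It answers one question, "where does this go next?", and leaves the rest to the router at the other end of that hop.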

#TCP: Making an Unreliable Network Reliable

IP delivers packets best-effort. They can arrive out of order, get duplicated, or vanish entirely. TCP’s job is to hide all of that from the application.

Before two machines exchange data, TCP establishes a connection with a three-way handshake:

Client                    Server
  │                          │
  │──── SYN ────────────────>│   "I want to connect (seq=x)"
  │                          │
  │<─── SYN-ACK ─────────────│   "OK, I'm ready (seq=y, ack=x+1)"
  │                          │
  │──── ACK ────────────────>│   "Great, let's go (ack=y+1)"
  │                          │
  │═══════ data flows ═══════│
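The sequence-number arithmetic in that exchange is small enough to model directly. This is a toy sketch, not a real TCP stack: segments are plain dicts, the function name is invented, and the initial sequence numbers come from a seeded random generator purely so the run is repeatable.

```python
import random

def three_way_handshake(rng):
    """Return the three handshake segments as plain dicts (a toy model)."""
    x = rng.randrange(2**32)  # client's initial sequence number
    y = rng.randrange(2**32)  # server's initial sequence number
    # Sequence numbers are 32 bits wide, so the +1 wraps around at 2**32.
    syn = {"flags": "SYN", "seq": x}
    syn_ack = {"flags": "SYN-ACK", "seq": y, "ack": (x + 1) % 2**32}
    ack = {"flags": "ACK", "seq": (x + 1) % 2**32, "ack": (y + 1) % 2**32}
    return syn, syn_ack, ack

syn, syn_ack, ack = three_way_handshake(random.Random(0))
for segment in (syn, syn_ack, ack):
    print(segment)
```

Each side acknowledges the other's starting number plus one, so both ends leave the handshake agreeing on where each byte stream begins.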

Once connected, TCP numbers every byte it sends. The receiver sends back acknowledgements (ACKs) that name the next byte it expects, which confirms everything before that point. If the sender doesn’t receive an ACK in time, it retransmits. The receiver reorders out-of-sequence segments and discards duplicates.

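import socket

# TCP connection: the OS handles SYN/SYN-ACK/ACK, retransmits, and ordering
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.settimeout(5)  # don't hang forever if the host is unreachable
    s.connect(("example.com", 80))
    s.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
    response = s.recv(4096)  # first chunk only; loop until b"" for the full reply

# UDP: a thin layer over IP with no handshake and no delivery guarantees
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
    s.settimeout(2)  # nothing promises a reply, so never block indefinitely
    s.sendto(b"ping", ("8.8.8.8", 53))  # not a valid DNS query; a reply is unlikely
    try:
        data, addr = s.recvfrom(512)
    except socket.timeout:
        data, addr = None, None  # lost or ignored; UDP will not say which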
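What the OS does behind that TCP socket can be approximated with a toy receiver. The sketch below assumes segments never partially overlap and ignores windows, timers, and sequence-number wraparound. The function name `reassemble` is invented for illustration.

```python
def reassemble(segments):
    """Toy TCP-style receiver: reorder by byte offset, drop duplicates.

    `segments` is an iterable of (seq, data) pairs, where seq is the byte
    offset of data[0] in the stream. Returns (delivered bytes, ACKs sent).
    """
    expected = 0           # next byte offset the application is waiting for
    buffered = {}          # out-of-order segments, keyed by offset
    delivered = bytearray()
    acks = []
    for seq, data in segments:
        if seq >= expected and seq not in buffered:
            buffered[seq] = data  # new data; anything else is a duplicate
        # Hand over everything that is now contiguous.
        while expected in buffered:
            chunk = buffered.pop(expected)
            delivered += chunk
            expected += len(chunk)
        acks.append(expected)  # cumulative ACK: "send me this byte next"
    return bytes(delivered), acks

arrivals = [(0, b"HEL"), (6, b"WOR"), (3, b"LO "), (3, b"LO "), (9, b"LD")]
data, acks = reassemble(arrivals)
print(data)  # b'HELLO WORLD'
print(acks)  # [3, 3, 9, 9, 11]
```

The repeated ACK of 3 is the receiver's way of saying "I am still missing byte 3", even though later data has already arrived. Real TCP senders use exactly that kind of repetition as an early hint that a segment was lost.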

TCP also does flow control (don’t send faster than the receiver can process) and congestion control (don’t send faster than the network can handle). Congestion-control algorithms such as Slow Start and CUBIC, along with sender-side tweaks like Nagle’s algorithm for coalescing small writes, are still being tuned today. HTTP/3 replaces TCP with QUIC, a transport built on UDP, partly to escape TCP’s head-of-line blocking. But that’s Era 5 territory.
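For a feel of how congestion control behaves, here is a deliberately simplified, Reno-style model of the congestion window. It is not CUBIC and not what any particular kernel does. The starting threshold of 16 segments and the loss at round 7 are arbitrary choices for the example.

```python
def congestion_window(rounds, ssthresh=16, loss_rounds=frozenset()):
    """Simplified Reno-style window, in segments, one value per round trip."""
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            ssthresh = max(cwnd // 2, 2)    # multiplicative decrease on loss
            cwnd = ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double every round trip
        else:
            cwnd += 1                       # congestion avoidance: add one segment
    return history

print(congestion_window(12, ssthresh=16, loss_rounds={7}))
# [1, 2, 4, 8, 16, 17, 18, 19, 9, 10, 11, 12]
```

The shape is the interesting part: a fast exponential ramp, a cautious linear climb, then a sharp cut when the network signals trouble. Every sender backing off this way is what keeps a shared link from collapsing.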

#Flag Day: January 1, 1983

For more than eight years after the 1974 paper, NCP kept running on ARPANET while TCP/IP matured alongside it. Implementing TCP/IP was optional. Then the U.S. Department of Defense made it mandatory and set a deadline.

January 1, 1983. Every machine on ARPANET would switch from NCP to TCP/IP simultaneously. There was no gradual rollout, no backward compatibility period. If your machine didn’t speak TCP/IP by midnight, it was off the network.

Administrators called it “flag day.” Some thought it was reckless. But it worked. ARPANET woke up on January 1 speaking a new language, the same language the internet speaks today.

4.2BSD Unix, released in 1983, shipped TCP/IP as part of the operating system. Suddenly any Unix machine could connect. The internet stopped being a government research project and started being infrastructure.

#What TCP/IP Got Right

The design decisions made in a hotel lobby in 1973 have held up for fifty years:

Dumb network, smart endpoints — the internet has scaled to billions of devices because routers don’t need to understand the traffic they forward
Layering — Wi-Fi joined Ethernet, IPv6 is slowly replacing IPv4, and QUIC is challenging TCP, all without breaking applications
Best-effort delivery — embracing unreliability at the IP level, rather than pretending the network is reliable, made the whole system more honest and more robust
Open standards — anyone could implement TCP/IP. IBM didn’t own it. AT&T didn’t own it. Anyone with the RFC could build a compliant stack.

The one thing it didn’t solve: names. An IP address like 192.0.2.1 is not how humans navigate. There was no way to type google.com and have it mean anything. Instead, every machine kept a local copy of a HOSTS.TXT file that mapped names to addresses, maintained centrally by the Network Information Center at SRI in Menlo Park (the Stanford Research Institute, not Stanford University). By 1983, that file had thousands of entries and was updated twice a week. It couldn’t scale.

The solution was DNS — the Domain Name System — a distributed, hierarchical database that translates human-readable names into IP addresses. That’s the next stop.