This article is part of my series on Digital Infrastructure.
There are billions of machines connected to the internet. The number was 22 billion in 2019[1], and it is projected to keep growing significantly over the next few years. And it is all our fault: the average person owns around 4 connected devices, a number projected to reach ~10 by 2025[2].
All these machines need to communicate with each other to fulfil their purpose. We've seen an example of this before: what happens when you open TikTok on your phone. Another common example is when you text someone. What you see as a simple conversation on WhatsApp, such as:
Is in fact much more involved. For each message that you send:
This is still a simplified view. The number of exchanges actually happening between your phone and WhatsApp's servers can easily grow to hundreds once we take into account everything that is happening under the hood.
What all these exchanges have in common, however, is that they're a form of communication between two devices connected to the internet, also called an Internet request. This is what we will be discussing in this article, hopefully giving you a broad understanding of how information is transported over the internet and the various kinds of requests there are.
A fair number of technical words will be introduced along the way. You can use them as a starting point to deepen your understanding, or completely forget about them after reading the article. The concepts we'll cover are more important than the technical jargon, and I hope you'll carry them with you. I am also playing it relatively loose with the chronology of events. My goal is to introduce you to the concepts at hand, rather than make you a computer historian.
The Internet is a giant computer network spanning the globe. The technology underlying it was envisioned in the 60s as a way to link together the few networks that existed at the time, which belonged to large companies and top universities in the US. The project was named ARPANET and was mostly funded by ARPA, the American agency known today as DARPA.
While the concept of a "Network of Networks" is intriguing, what is a computer network in the first place?
Historical anecdote time: early computers were big, costly, and shared by multiple people in the same organization, allowing them to "have time with the computer". These early systems allowed their users to communicate with each other using a concept similar to email. Assuming you knew the username of the person you wanted to communicate with, you could leave them a message and they would be able to see it the next time they connected to this shared machine. The "To" field contained something as simple as "feynman".
This was very convenient for business and research. A researcher could finish their seismic modeling calculations and mail the result to a collaborator. Similarly, an analyst could crunch the sales numbers from last quarter and send them to sales to produce a report for the VPs.
But all of this happened in isolation. In order to communicate with someone, they had to have an account on the same machine you sent the message from. Want to collaborate on a project with a professor from another university? Tough luck.
As computers became smaller, this limitation only became worse: fewer people shared the same machine, and now you could not even communicate with the biology department to coordinate the spring party.
Computer networks appeared as a way to solve this emailing problem.
A network is a way to connect computers together, most often through a networking device such as a router, with each computer linked to it either by cable or wirelessly.
It was finally possible to mail someone on another computer. But the username was not enough anymore: your computer now needed to know which computer was supposed to receive the email. So you would send it to "feynman@physics", meaning that your computer tried to locate a computer named "physics" and sent your mail to a user there named "feynman". This already starts to look like a modern email address.
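As a rough illustration of the two address styles, here is a minimal Python sketch. The function name `split_address` is made up for this example, and real mail software of the time did not necessarily work this way.

```python
from typing import Optional, Tuple

def split_address(address: str) -> Tuple[str, Optional[str]]:
    """Split 'user@host' into (user, host).

    A bare username such as 'feynman' has no host part, which mirrors
    the older single-machine mail where everyone shared one computer.
    """
    user, separator, host = address.partition("@")
    return user, (host if separator else None)

print(split_address("feynman"))          # ('feynman', None)
print(split_address("feynman@physics"))  # ('feynman', 'physics')
```

The part before the "@" says who the message is for, and the part after says which machine to look for.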
In practice:
Privacy and cybersecurity were definitely not as big a concern back then.
You might be wondering: how is email related to networks and the internet? You'll see in a moment that their growth and development are deeply intertwined, along with the need to organize bigger and bigger groups of computers that might belong to people with opposing goals and ideals.
So far, the method of networking we have discussed only works inside a single organization, and only for computers directly connected to the same network. A better way of organizing computers was needed. This is partly how the concept of "IP" was born.
IP (Internet Protocol) is a method to communicate with other machines over a network. One of the biggest improvements it brought along is the concept of an IP address, which is a sort of numbering system for machines. An IP address contains not only an identifier for the machine but also a description of how to reach it, similar to how a phone number might contain an area code or a prefix specific to a service provider. Let's take a closer look.
An IP address (often just called an "IP") contains 4 segments, each a number from 0 to 255. Part of the IP address identifies the network, and the rest identifies the computer within it. In the original scheme, the boundary between the two parts could fall after the first, second, or third segment, so there were only 3 kinds of networks. They are conveniently named Class A, Class B and Class C networks, and the main difference is how many computers and devices can be connected within the same network: the fewer segments used to identify the network, the more are left to number its devices.
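To make the network/computer split concrete, here is a small Python sketch of the historical class rules, where the first segment decides the class (below 128 is Class A, below 192 is Class B, below 224 is Class C). The helper names `classful_split` and `max_devices` are invented for this example, and the sample addresses are purely illustrative.

```python
import ipaddress

def classful_split(address: str):
    """Split an IPv4 address into (class, network segments, host segments)
    using the historical Class A/B/C rules."""
    ipaddress.IPv4Address(address)  # raises ValueError if the address is malformed
    segments = [int(part) for part in address.split(".")]
    first = segments[0]
    if first < 128:
        network_class, network_len = "A", 1
    elif first < 192:
        network_class, network_len = "B", 2
    elif first < 224:
        network_class, network_len = "C", 3
    else:
        raise ValueError("addresses starting at 224 and above are reserved for other uses")
    return network_class, segments[:network_len], segments[network_len:]

def max_devices(host_segments) -> int:
    """Addresses left for devices: every host combination minus the two
    reserved ones (the network itself and the broadcast address)."""
    return 2 ** (8 * len(host_segments)) - 2

for ip in ("45.99.186.20", "172.16.5.4", "192.168.1.10"):
    network_class, network, host = classful_split(ip)
    print(ip, "-> class", network_class, "| network", network,
          "| host", host, "| max devices", max_devices(host))
```

A Class A network leaves three segments for devices (over 16 million of them), while a Class C network leaves only one (254 devices).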
In each IP network, there is a special device named the router, which is in charge of allowing devices within the same network to communicate, just like the older networks, and it also acts as a bridge with other networks as illustrated below. When a request is sent to an IP address that is outside the current network, then the router of your network seeks the router in charge of that other network to hand over the information.
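The decision a device makes before sending can be sketched in a few lines of Python using the standard `ipaddress` module. The function `next_hop` and the example network `192.168.1.0/24` are assumptions made for illustration (the `/24` means the first three segments identify the network). Real routers consult routing tables with many entries, so treat this as a simplification.

```python
import ipaddress

def next_hop(local_network: str, destination: str) -> str:
    """Decide whether a destination can be reached directly or must go
    through the router. A deliberately simplified model."""
    network = ipaddress.ip_network(local_network)
    if ipaddress.ip_address(destination) in network:
        return "same network: deliver directly"
    return "other network: hand over to the router"

print(next_hop("192.168.1.0/24", "192.168.1.42"))  # same network: deliver directly
print(next_hop("192.168.1.0/24", "45.99.186.20"))  # other network: hand over to the router
```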
IPs are also a form of digital real estate, with an extremely limited supply and high demand. With 4 segments of 256 values each, there are only about 4.3 billion possible IP addresses. A later version of IP called IPv6 tries to resolve this by offering a virtually unlimited amount, but while the technology is there, adoption is another story and is still lagging severely behind, cementing the lack of available IPs as a form of digital scarcity.
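The arithmetic behind this scarcity is easy to check. The sketch below computes the size of both address spaces and compares the original one with the 22 billion connected devices mentioned at the start of the article. The variable names are made up for the example.

```python
import ipaddress

# 4 segments of 256 values each, which is the same as 2 to the power 32
ipv4_total = 256 ** 4
# IPv6 addresses are 128 bits long
ipv6_total = 2 ** 128

# The standard library agrees with the manual calculation
assert ipaddress.ip_network("0.0.0.0/0").num_addresses == ipv4_total
assert ipaddress.ip_network("::/0").num_addresses == ipv6_total

connected_devices_2019 = 22_000_000_000  # figure quoted earlier in the article

print(f"IPv4 addresses: {ipv4_total:,}")  # IPv4 addresses: 4,294,967,296
print(f"Devices per IPv4 address: {connected_devices_2019 / ipv4_total:.1f}")  # 5.1
print(f"IPv6 addresses: {ipv6_total:.2e}")
```

In other words, there were already roughly five connected devices for every possible address in the original scheme.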
Using the IP networking technology, we have a neat way to communicate with another computer as long as there's a cable running between them (or wifi, anachronistically). But for email to work well, there are still a few shortcomings we need to address:
Remember how we were sending emails earlier using feynman@physics? What if we used "physics" as an easy alias to remember a complicated number such as 45.99.186.20? Then if the IP address changes, we could just say that "physics" now refers to the new address. This naming system could work, but it comes with its own shortcomings:
And thus DNS (Domain Name System) was born! You are familiar with it through internet domains such as google.com, omarkama.li, or when you try to make your own website and find out you need to "buy a domain".
DNS solves the problems we mentioned earlier by providing:
Now you can contact someone at an address ending in "@physics.princeton.edu" and, behind the scenes, your computer will make a request to a DNS server asking it for the IP of the computer behind the alias "physics.princeton.edu". Once your computer gets the IP, it sends the mail, which is relayed to its destination by the various routers sitting between you and the computer you're communicating with.
In a future article, we'll answer questions such as who owns DNS, how DNS affects your gaming and Netflix experience, what you need to know to get your own domain, and many more.
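The lookup step can be imitated with a toy name server in Python. Everything here is hypothetical: the class `ToyNameServer`, the function `deliver`, the address `feynman@physics.princeton.edu`, and the IP numbers it maps to are invented for illustration, and real DNS is a distributed system rather than a single dictionary.

```python
class ToyNameServer:
    """A stand-in for a DNS server: it remembers which IP each name points to."""

    def __init__(self):
        self._records = {}

    def register(self, name: str, ip: str) -> None:
        # Names are case-insensitive, so store them in lowercase
        self._records[name.lower()] = ip

    def resolve(self, name: str) -> str:
        # Raises KeyError if nobody registered this name
        return self._records[name.lower()]

def deliver(address: str, name_server: ToyNameServer) -> str:
    """Look up the machine behind an email address and describe the delivery."""
    user, _, host = address.partition("@")
    ip = name_server.resolve(host)
    return f"send mail for {user} to {ip}"

dns = ToyNameServer()
dns.register("physics.princeton.edu", "45.99.186.20")  # made-up mapping
print(deliver("feynman@physics.princeton.edu", dns))   # send mail for feynman to 45.99.186.20

# If the machine moves, only the record changes; the email address stays the same.
dns.register("physics.princeton.edu", "45.99.186.21")  # made-up new address
print(deliver("feynman@physics.princeton.edu", dns))   # send mail for feynman to 45.99.186.21
```

The point of the design is the indirection: people remember the name, and only the name server needs to know the current number.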
Make sure to subscribe to my monthly newsletter to be informed when new articles are released, and feel free to contact me with suggestions, feedback, or topics you'd like me to talk about.