It starts with a message: Internet, DNS & Networks for non-Techies

This article is part of my series on Digital Infrastructure.

There are billions of machines connected to the internet. The number was 22 billion in 2019[1], and it is projected to keep growing significantly over the next few years. And it is all our fault: the average person owns around 4 connected devices, a figure projected to grow to ~10 by 2025[2].

All these machines need to communicate with each other to fulfil their purpose. We've seen an example of this before: what happens when you open TikTok on your phone. Another common example is when you text someone. What you see as a simple conversation on WhatsApp, such as:

How hard could it be to send 4 texts?

Is in fact much more involved. For each message that you send:

A request message is sent from your phone to a server operated by WhatsApp.
The server receives the request message, sends a notification to your friend's phone, and waits for it to come online.
The server sends a response message back to your phone confirming that your message was "sent", shown as a single checkmark.
When your friend's phone comes online, it connects to a server operated by WhatsApp and retrieves the latest messages.
Your device receives an update marking the message in WhatsApp as "received" or "read", shown as a second checkmark.

This is still a simplified view. The number of exchanges actually happening between your phone and WhatsApp's server can easily grow to hundreds if we take into account everything that is happening under the hood.
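For the curious, the five steps above can be sketched in a few lines of Python. This is only a toy model to show the order of events: the names `Message`, `ChatServer`, `accept` and `deliver_pending` are made up for illustration and have nothing to do with how any real messaging service is built.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    text: str
    status: str = "pending"  # pending -> sent -> received


@dataclass
class ChatServer:
    """Hypothetical stand-in for the messaging service's server."""
    mailbox: list = field(default_factory=list)

    def accept(self, message: Message) -> None:
        # Steps 1 to 3: the server stores the message and confirms it was sent.
        self.mailbox.append(message)
        message.status = "sent"          # one checkmark

    def deliver_pending(self) -> list:
        # Steps 4 and 5: the friend's phone comes online and fetches its messages.
        delivered = list(self.mailbox)
        self.mailbox.clear()
        for message in delivered:
            message.status = "received"  # two checkmarks
        return delivered


server = ChatServer()
hello = Message("Hi there!")
server.accept(hello)
print(hello.status)   # sent
server.deliver_pending()
print(hello.status)   # received
```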

What all these exchanges have in common, however, is that they're a form of communication between two devices connected to the internet, also called an Internet request. This is what we will be discussing in this article, hopefully giving you a broad understanding of how information is transported over the internet and the various kinds of requests there are.

There will be a fair number of technical words introduced, which you can use as a starting point to deepen your understanding, or completely forget about after reading the article. The concepts we'll cover are more important than technical jargon, and I hope you'll carry them with you. I am also playing it relatively loose with the chronology of events. My goal is to introduce you to the concepts at hand, rather than make you a computer historian.

Networks and the Genesis of the Internet

The Internet is a giant computer network spanning the globe. The technology underlying it was envisioned in the 60s as a way to link together the few networks that existed at the time, which belonged to large companies and top universities in the US. The project was named ARPANET and was mostly funded by ARPA, the American agency known today as DARPA.

While the concept of a "Network of Networks" is intriguing, what is a computer network in the first place?

An old PDP-7 - a powerhouse in its heyday - being restored in Oslo, Norway - en:User:Toresbe, CC SA 1.0 via Wikimedia Commons

Historical anecdote time: early computers were big, costly, and shared by multiple people in the same organization, allowing them to "have time with the computer". These early systems allowed their users to communicate with each other using a concept similar to email. Assuming you knew the username of the person you wanted to communicate with, you could leave them a message and they would see it the next time they connected to this shared machine. The "To" field contained something as simple as "feynman".

Email is the first killer app for the internet

This was very convenient for business and research. A researcher could finish their seismic modeling calculations and mail the result to a collaborator. Similarly, an analyst could crunch the sales numbers from last quarter and send them to sales to produce a report for the VPs.

But all of this happened in isolation. In order to communicate with someone, they had to have an account on the same machine you sent the message from. Want to collaborate on a project with a professor from another university? Tough luck.

As computers became smaller, this limitation only became worse: fewer and fewer people shared the same machine, and now you could not even message the biology department to coordinate the spring party.

Computer networks appeared as a way to solve this emailing problem.

A network is a way to connect computers together, most often via a networking device such as a router and with a physical connection either via cable or wirelessly.

It was finally possible to mail someone on another computer. But the username was not enough anymore: now your computer needed to know which computer was supposed to receive the email. So you would send it to "feynman@physics", meaning that your computer tries to locate a computer named "physics" and sends your mail to a user there named "feynman". This already starts to look like a modern email address.

In practice:

Your computer will spam all other computers in the network it belongs to, asking them "Are you named `physics`? I have this mail for a user named `feynman`."
Other computers will ignore the request.
The "physics" computer replies "it's me" (with no proof whatsoever).
Your computer sends the mail to the "physics" computer.

Privacy and cybersecurity were definitely not as big a concern back then.
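Here is a small Python sketch of that "ask everyone" approach. It is a simplification under my own assumptions: the `Computer` class and the `send_mail` function are invented for this example, and a real early network did this with electrical signals on a shared cable rather than a loop.

```python
class Computer:
    def __init__(self, name):
        self.name = name
        self.inbox = []

    def is_named(self, wanted):
        # No proof is required: whoever says "yes" gets the mail.
        return self.name == wanted


def send_mail(network, address, body):
    """Deliver `body` to `user@machine` by asking every computer on the network."""
    user, machine = address.split("@")
    for computer in network:
        if computer.is_named(machine):
            computer.inbox.append((user, body))
            return computer.name
    return None  # nobody answered, the mail goes nowhere


network = [Computer("biology"), Computer("physics"), Computer("maths")]
print(send_mail(network, "feynman@physics", "Lunch at noon?"))  # physics
```

Notice that nothing stops a second machine from also answering to the name "physics", which is exactly the lack of proof mentioned above.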

IP, a hierarchy of machines

You might be wondering, how is email related to networks and the internet? You'll see in a moment that their growth and development are deeply intertwined, along with the need for organizing bigger and bigger groups of computers that might belong to people with opposing goals and ideals.

So far the method of networking we have discussed only works inside a single organization, and only for computers directly connected to the same network. A better way for organizing computers was needed. This is partly how the concept of "IP" was born.

IP (Internet Protocol) is a method to communicate with other machines over a network. One of the biggest improvements it brought along is the concept of an IP address, which is a sort of numbering system for machines. An IP address contains not only an identifier for the machine, but also a description of how to reach it, similar to how a phone number might contain an area code or a prefix specific to a service provider. Let's take a closer look.

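An IP address (often just called an "IP") contains 4 segments, each a number from 0 to 255, as in 45.99.186.20. Part of the IP address is used to identify the network, and the other part is used to identify the computer within it. In the original design, the network part could take up the first 1, 2 or 3 segments, so there were only 3 kinds of networks. They are conveniently named Class A, Class B and Class C networks, and the main difference is how many computers and devices can be connected within the same network: the fewer segments spent on naming the network, the more are left for numbering devices. (Today the split can fall in other places too, but the idea is the same.)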

Your home internet is probably a Class C network, which means it has 256 addresses, of which at most 254 can be given to devices (two are reserved for the network itself).
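If you want to see the "network part" and "computer part" for yourself, Python ships with an `ipaddress` module that does the splitting. The address below is just an example of the kind commonly used in home networks; the "/24" is the modern way of saying that the first 3 segments name the network, as in a Class C network.

```python
import ipaddress

# Example home-style address; "/24" means the first 3 segments name the network.
interface = ipaddress.ip_interface("192.168.1.42/24")

network = interface.network
print(network)                     # 192.168.1.0/24
print(network.num_addresses)       # 256

# The last segment identifies the device inside that network.
host_part = int(interface.ip) & int(network.hostmask)
print(host_part)                   # 42

# Two addresses are reserved, so 254 are left for actual devices.
usable = list(network.hosts())
print(len(usable))                 # 254
print(usable[0], usable[-1])       # 192.168.1.1 192.168.1.254
```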

In each IP network, there is a special device named the router. It is in charge of allowing devices within the same network to communicate, just like in the older networks, and it also acts as a bridge to other networks, as illustrated below. When a request is sent to an IP address that is outside the current network, the router of your network seeks out the router in charge of that other network and hands over the information.
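The router's decision boils down to one question: "is this destination inside my own network?" A minimal Python sketch of that question follows. The function name `handle`, the example network and the returned phrases are all my own inventions for illustration; real routers consult much larger tables.

```python
import ipaddress


def handle(destination, own_network="192.168.1.0/24"):
    """Decide what a (hypothetical) router does with a request."""
    network = ipaddress.ip_network(own_network)
    if ipaddress.ip_address(destination) in network:
        return "deliver inside this network"
    return "hand over to the next router"


print(handle("192.168.1.77"))  # deliver inside this network
print(handle("8.8.8.8"))       # hand over to the next router
```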

IPs are also a form of digital real estate, with the supply extremely limited and under high demand. There are only about 4.3 billion possible IP addresses (256 × 256 × 256 × 256). A later version of IP called IPv6 tries to resolve this by offering a virtually unlimited number, but while the technology is there, adoption is another story and is still lagging severely behind, cementing the lack of available IPs as a form of digital scarcity.
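The arithmetic behind that scarcity is short enough to check yourself. Each of the 4 segments takes 8 bits, so an address is 32 bits long, while an IPv6 address is 128 bits long. The variable names below are mine.

```python
ipv4_addresses = 2 ** 32    # 4 segments of 8 bits each
ipv6_addresses = 2 ** 128   # IPv6 addresses are 128 bits long

print(f"{ipv4_addresses:,}")      # 4,294,967,296
print(f"{ipv6_addresses:.2e}")    # 3.40e+38
```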

DNS, a game of names

Using the IP networking technology, we have a neat way to communicate with another computer as long as there's a cable running between them (or wifi, anachronistically). But for email to work well, there are still a few shortcomings we need to address:

IP addresses could change: As computers and routers are added and removed, new IP addresses are assigned and taken out of circulation as well. Also some routers use dynamic IPs, so that the next time your computer connects to the network it might have a different IP.
We might not know the IP address of the machine we're trying to reach, especially if it belongs to another organization.
IP addresses are hard to remember and to deal with for us humans. Who even remembers phone numbers anymore?

Remember how we were sending emails earlier using feynman@physics? What if we used "physics" as an easy alias to remember a complicated number such as 45.99.186.20? Then if the IP address changes, we could just say that "physics" now refers to the new address. This naming system could work, but it comes with its own shortcomings:

Two or more machines could choose the same name, which makes it unclear which of these machines is supposed to receive the message. How about two "physics" computers from two different universities?
Machines need to let each other know what their names and addresses are, and to update their peers periodically if the IP address changes.

And thus DNS (Domain Name System) was born! You are familiar with it through internet domains such as google.com, omarkama.li, or when you try to make your own website and find out you need to "buy a domain".

alice.com is much easier to remember than a bunch of numbers that could change at any moment

DNS solves the problems we mentioned earlier by providing:

A common registry of aliases such that you can refer to an IP like 150.24.88.40 by just using "google.com" instead.
A way to automatically and economically share updates to this registry with all users of the network, one that can operate at the size of the Internet.
A way to distinguish between two names, by chaining domain names together. Instead of using the ambiguous "physics" as a domain, now you can specify "physics.princeton.edu" or "physics.stanford.edu". You can even go further and keep chaining domains for more specificity, for example "hep.labs.cern.edu". These chained names are referred to as subdomains.

Easy peasy, how did we even live before this?
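To make the idea of a registry of chained names concrete, here is a toy version in Python. Everything in it is illustrative: the `registry` dictionary and the `labels` helper are my own, and the IP addresses come from a range set aside for documentation rather than belonging to any real university.

```python
# Hypothetical registry; the IP addresses are made up for illustration.
registry = {
    "physics.princeton.edu": "192.0.2.10",
    "physics.stanford.edu": "192.0.2.20",
}


def labels(domain):
    """Split a domain into its parts, most general first."""
    return list(reversed(domain.split(".")))


print(labels("physics.princeton.edu"))      # ['edu', 'princeton', 'physics']
print(registry["physics.princeton.edu"])    # 192.0.2.10
print(registry["physics.stanford.edu"])     # 192.0.2.20
```

Reading the parts from most general to most specific is what lets two different "physics" machines coexist without any confusion.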

Now you can contact feynman@physics.princeton.edu and behind the scenes, your computer will make a request to a DNS server asking it for the IP of the computer behind the alias "physics.princeton.edu". Once your computer gets the IP, it sends the mail, which will be relayed to its destination by the various routers sitting between you and the computer you're communicating with.
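You can ask for this translation yourself with Python's built-in `socket` module. The sketch below is simplified on purpose: the helper names `split_address` and `lookup_ip` are mine, and real mail delivery looks up a dedicated mail record for the domain rather than the plain address shown here.

```python
import socket


def split_address(email):
    """Split 'user@domain' into its two halves."""
    user, _, domain = email.partition("@")
    return user, domain


def lookup_ip(domain):
    """Ask the operating system's DNS resolver for an IPv4 address."""
    return socket.gethostbyname(domain)


user, domain = split_address("feynman@physics.princeton.edu")
print(user, domain)   # feynman physics.princeton.edu

# Needs an internet connection, and the answer can change over time:
# print(lookup_ip("example.com"))
```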

What's next?

In a future article, we'll answer questions such as who owns DNS, how DNS affects your gaming and Netflix experience, what you need to know to get your own domain, and many more questions.

Make sure to subscribe to my monthly newsletter to be informed when new articles are released, and feel free to contact me with suggestions, feedback, or topics you'd like me to talk about.

[1] https://www.strategyanalytics.com/access-services/devices/connected-home/consumer-electronics/reports/report-detail/global-connected-and-iot-device-forecast-update

[2] https://www.researchgate.net/figure/Estimated-Number-of-Connected-Devices-Per-Person-By-2025_fig4_322050538