We all know the Internet. It’s the network that enables data transfer on a global scale. Its magnitude is staggering: there are about 5 billion users, 200 million active websites, 300 billion emails sent daily and 40 thousand Google searches every second. We access it by different means (WiFi, optical fiber, coaxial cable…), and with various devices (computers, smartphones, smartwatches…).
A network that can transfer such a large volume of data so quickly and so reliably, over such large distances and over different physical media, is an extremely complex system. How can we ever hope to examine and describe this sprawling network? We first have to figure out the basic physical components of the Internet. Then, we can examine the architecture and the protocols that enable sending, routing and receiving data in the form of packets.
Internet Infrastructure: a Network of Networks
In reality, the Internet is a network of networks. It’s composed of millions of public and private — commercial, academic, governmental — networks.
In order to access the Internet as a private individual, we must go through an Internet Service Provider. This ISP is itself a network: it connects to many homes and companies in a particular geographic area, at a local, regional or sometimes national level. Companies often have their own private network that they hook up to the Internet via their ISP.
To connect their networks to the global Internet, these local ISPs must also pay to access the services of higher level ISPs. The higher level ISPs are the bridge connecting large numbers of lower level ISPs together on a global scale. There are only a dozen or so level 1 ISPs, but hundreds of thousands of lower level ISPs. So ISPs vary a lot in their network coverage: some span entire continents and oceans, while others only cover narrow geographical areas. Often, lower level ISPs also connect their networks together to share their connection to a global level 1 ISP.
Large content providers like Google often have their own networks. They connect to lower level ISPs as much as possible. This avoids the higher costs of high level ISPs and increases connection speeds for their local users.
The Physical Components of the Internet
Any system connected to the Internet or to any network is a host. We can divide hosts into two categories: clients and servers. A client is often a desktop or laptop computer, a smartphone, a video game console, or, with the emergence of the Internet of Things, a smart baby monitor, a seismic or animal habitat surveillance system, a smartwatch, etc. A server is typically a much more powerful machine that stores and distributes web pages, streams videos, transfers email, etc.
When a host wants to send data to another, it chops the data up and adds a few header bytes. The result is a group of packets that is sent through the network to the destination host. Upon reception, the receiver must then reassemble the packets and reconstruct the data.
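To make this concrete, here is a minimal sketch in Python of the chop-and-reassemble idea. The names `packetize` and `reassemble` are hypothetical, and the "header" is reduced to a single sequence number. Real protocols carry far more information in their headers.

```python
import random


def packetize(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Chop data into chunks of at most `size` bytes.

    Each chunk is paired with a sequence number, which plays the
    role of a (very simplified) header.
    """
    return [
        (seq, data[offset:offset + size])
        for seq, offset in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put the chunks back in order and rebuild the original data."""
    return b"".join(chunk for _seq, chunk in sorted(packets))


data = b"The Internet is a network of networks."
packets = packetize(data, size=8)
random.shuffle(packets)           # packets may arrive out of order
restored = reassemble(packets)    # the receiver rebuilds the data
```

The sequence number is what lets the receiver rebuild the data even if the packets arrive in a different order than they were sent.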
Hosts are linked with different physical media along which the bits of a packet can travel. The medium could be coaxial cables, copper cables, optical fibers, or even radio waves. But packets rarely go directly from one host to another. Most of the time, they need to go through switches and routers. The role of these two devices is to receive incoming packets and transfer them to the appropriate outgoing line. Typically, a switch operates inside a network whereas a router works as an interface between two different networks or sub-networks.
Every host, router and switch is a network node. The sequence of relays that a packet goes through to arrive at its destination is its route.
The Internet Protocol Suite
Data transfer on the Internet is governed by a standardized suite of protocols. They allow varied applications and services like email, the World Wide Web, instant messaging, peer-to-peer file sharing, streaming, videoconferencing, etc.
The Internet protocol suite is often called “TCP/IP” after its two fundamental protocols: TCP (Transmission Control Protocol) and IP (Internet Protocol). But those two are far from the only protocols involved in packet routing! We’ll take a closer look at these protocols later.
In order to accommodate so many different protocols and route packets over so many different physical media towards so many different networks and hosts, the Internet must have a modular architecture. That is, the protocols in its suite must be more or less interchangeable. This is the main reason why the Internet’s protocols are organized in distinct layers.
Understanding Layered Architecture
Let’s explore the idea of layered architecture with a popular analogy: the airline transportation system. It’s a very complex system involving ticketing, baggage checks, airline companies, airplanes, air traffic control towers, and a worldwide airplane routing system…
However, we can easily explain it in terms of the actions we take when we fly. We buy tickets, check our bags, go to the gate for boarding. Then the plane finally takes off and is routed to its destination. Then, when the airplane lands, we disembark, claim our bags and make a futile attempt to demand a refund if the flight was awful. We could even sum up the whole process this way:
In this example, each layer implements a service, a function, at the departure airport as well as the arrival airport. Each layer offers its service and relies on the services of the previous and following layer. They are all essential. However, it is possible to change the protocols of each layer independently. For example, an airport could very well choose to change its gate protocol to board passengers by age instead of by seat. The airport would not have to change anything about the protocols of the baggage or runway layers. All of the airplanes could even be replaced by hot-air balloons without affecting any of the first four airline protocol layers!
The Layers of the Internet Model
Of course, in the airline transport analogy, the traveler is the data. In order to be sent and received correctly, data must go through several protocol layers to be turned into packets and extracted upon arrival:
The Internet’s architectural model is organized as a stack of protocols composed of 5 distinct layers: the application layer, the transport layer, the network layer, the link layer, and finally the physical layer. Each host, switch, router and other network component implements at least part of these protocol layers, just as each airport and air-traffic control tower implements the airline system protocol layers in our previous analogy.
Let’s briefly examine each layer in the Internet architectural model to understand their roles.
The application layer is made up of the various protocols of network applications. These applications are the reason computer networks exist at all. Since there is a wide range of Internet applications, the application layer encompasses a great diversity of protocols. And new protocols are constantly emerging! The following table lists only a fraction of the most common applications and protocols:
|Request and transfer Web pages||HTTP, HTTPS|
|Email transfer and access||SMTP, IMAP|
|Shell remote access with secure connection||SSH|
|Instant message transfer||IRC, XMPP|
|Clock synchronization between computers||NTP|
|Internet domain name translation into IP address||DNS|
|Streaming audio/video||RTMP, HLS, SRT|
|Peer-to-peer file sharing||BitTorrent, eDonkey|
An application-layer protocol is distributed between hosts: the sending host’s application uses the protocol to exchange packets of information with the receiving host’s equivalent application. The application layer’s data packet is called a “message”.
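As an illustration of what such a message can look like, here is a small Python sketch that builds a plain-text HTTP/1.1 request by hand. The function name `build_http_get` and the host `example.com` are placeholders chosen for this example. In practice, an HTTP library would build this message for us.

```python
def build_http_get(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes.

    An HTTP/1.1 request is plain text: a request line, some header
    lines, then an empty line that marks the end of the headers.
    """
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, version
        f"Host: {host}",          # header required by HTTP/1.1
        "Connection: close",      # ask the server to close when done
        "",                       # empty line: end of headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


message = build_http_get("example.com", "/index.html")
print(message.decode("ascii"))
```

This message is what the application layer hands down to the transport layer. The web server’s application on the other side reads the same text and answers with a response message of its own.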
The Internet’s transport layer has the essential role of offering communication services to the applications in different hosts. It transports application-layer messages between application endpoints.
In the Internet protocol suite, two protocols dominate this transport layer: TCP and UDP. They both work very differently:
- TCP is a connection-oriented, reliable transport protocol. This means that it ensures that a connection is established between the sender and the receiver before any data is sent. It also makes sure that the receiver got the data; otherwise, it sends it again. This way, TCP can guarantee the delivery of messages to the destination’s application layer. TCP also controls the flow of data and can throttle its transmission speed when the network is congested, and it divides long flows of data into smaller segments. Applications that generally use TCP are file, message and Web page transfer services as well as streaming services like YouTube and Netflix.
- UDP offers a much simpler service with a connection-less transmission. It does not regulate traffic or its transmission rate, and does not guarantee message delivery. However, this protocol’s simplicity is useful for rapidly transmitting small quantities of data, for example when a server sends data to several clients, or when data loss is preferable to waiting for it to be transmitted again. DNS, voice over IP and online gaming are typical uses of UDP.
Packets of the transport layer are known as “segments”.
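The difference between the two protocols shows up directly in how a program uses them. The sketch below uses Python’s standard `socket` module on the local loopback address (127.0.0.1), so no real network is involved. The function names `udp_roundtrip` and `tcp_roundtrip` are made up for this example, and each function plays both the sender and the receiver in a single process, which a real application would not do.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over UDP: no connection, no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))       # port 0: let the OS pick one
        receiver.settimeout(2.0)              # give up if nothing arrives
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data


def tcp_roundtrip(payload: bytes) -> bytes:
    """Send a byte stream over TCP: a connection is established first."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        address = listener.getsockname()
        with socket.create_connection(address, timeout=2.0) as client:
            conn, _addr = listener.accept()   # connection is now set up
            with conn:
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)   # signal "no more data"
                chunks = []
                while chunk := conn.recv(4096):   # read until end of stream
                    chunks.append(chunk)
                return b"".join(chunks)
```

With UDP, the sender simply fires a datagram at an address. With TCP, a connection must exist before the first byte is sent, and the receiver reads a continuous stream that TCP has split into segments and put back in order behind the scenes.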
The Internet’s network layer is responsible for routing packets, which are called “datagrams” in this layer, from one host to another. Just as we drop a letter with an address into a post office box, the sender’s transport-layer protocol gives a segment as well as a destination address to the network layer. The network layer must then deliver this segment to the receiving host’s transport layer.
IP, the “Internet Protocol” by definition, naturally dominates the protocols in the Internet’s network layer. This protocol is unrivaled and all of the hosts that have a network layer must run it if they wish to be connected to the Internet. IP defines the network-layer datagram and the IP addresses that uniquely identify every connected host. There are two major versions of IP: IPv4 and IPv6. Both perform the same function in slightly differing ways.
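Python’s standard `ipaddress` module gives a quick way to compare the two address formats. The addresses below come from ranges reserved for documentation, and the helper `describe` is a hypothetical name used only for this sketch.

```python
import ipaddress


def describe(text: str) -> tuple[int, int]:
    """Return the IP version and the address size in bits."""
    address = ipaddress.ip_address(text)
    return address.version, address.max_prefixlen


v4 = describe("192.0.2.1")      # IPv4: 32-bit addresses
v6 = describe("2001:db8::1")    # IPv6: 128-bit addresses

# An address belongs to a network identified by a prefix.
network = ipaddress.ip_network("192.0.2.0/24")
is_member = ipaddress.ip_address("192.0.2.1") in network
```

The most visible difference between the two versions is the size of the address: 32 bits for IPv4 against 128 bits for IPv6.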
However, the network layer also includes routing protocols that work alongside IP. Most often, their function is to determine the path a datagram should take to its destination, as RIP, OSPF, IS-IS and BGP do. Another protocol, ICMP, carries information or error messages, for example when a service or a host is unreachable.
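To get a feel for what “determining the path” means, here is a sketch of Dijkstra’s shortest-path algorithm, the computation that link-state protocols such as OSPF rely on. The topology, the link costs and the function name `shortest_path` are invented for this example, and a real routing protocol does much more, such as discovering neighbors and exchanging link information.

```python
import heapq

# Hypothetical topology: each router maps its neighbors to a link cost.
TOPOLOGY = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}


def shortest_path(graph, source, target):
    """Return (total cost, list of nodes), or None if unreachable."""
    best = {source: 0}        # lowest known cost to reach each node
    previous = {}             # node -> node we came from
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best[node]:
            continue          # stale entry: a cheaper path was found
        for neighbor, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


result = shortest_path(TOPOLOGY, "A", "D")
```

In this made-up topology, the cheapest route from A to D goes through B and C for a total cost of 4, even though it crosses more links than the alternatives.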
Despite the fact that the Internet’s network layer includes other protocols in addition to IP, the whole layer is often simply called IP. This underlines the fact that IP truly is the glue that holds the Internet together.
The link layer’s goal is to move a datagram from one network node (host, router, switch) to the next. At each node, the network layer passes its datagram down to the link layer, which then moves it to the next node along the route. Once arrived, the link layer passes the datagram back up to the network layer.
The link-layer protocols are, for the most part, related to the physical transport medium between two nodes in the network. Some of the link-layer protocols are Ethernet, WiFi, DOCSIS and PPP. Some protocols ensure reliable delivery of the packet’s bits from one node to the next, while others do not.
Since datagrams must often pass through several different physical media on their way between a sender and a receiver, it isn’t uncommon for a datagram to be transported by several different link-layer protocols. For example, a datagram can be handled by Ethernet on the first link (between the first and second node), then PPP for the next three and finally by WiFi for the last link to get to the destination node.
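The following Python sketch mimics that journey. It is only a toy model: the function name `send_across_links` is hypothetical, and a bracketed protocol name stands in for a real link-layer header.

```python
def send_across_links(datagram: bytes, links: list[str]):
    """Carry a datagram over a series of links.

    On every link, the link layer wraps the datagram in a new frame.
    The node at the other end strips the frame header and recovers
    the datagram, unchanged, before sending it over the next link.
    """
    frames = []
    for protocol in links:
        header = f"[{protocol}]".encode("ascii")   # toy link-layer header
        frame = header + datagram                  # sending node: build frame
        frames.append(frame)
        datagram = frame[len(header):]             # next node: strip header
    return frames, datagram


links = ["Ethernet", "PPP", "PPP", "PPP", "WiFi"]
frames, delivered = send_across_links(b"DATAGRAM", links)
```

The frame changes on every link, but the datagram inside it is the same at the destination as it was at the start.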
The link-layer packets are called “frames”.
The physical layer is the lowest layer of the network. Its goal is to move each individual bit of the link-layer frames from one node to the next. Naturally, the protocols in this layer are closely linked to the physical medium over which they must move the bits, but also to the link-layer protocols above. For example, Ethernet has several different physical-layer protocols: one for twisted-pair copper wire, another for coaxial cable, yet another for optical fiber, etc. Each of these protocols moves a bit across the link in a different way.
Layer Implementation in Different Internet Devices
Let’s study the diagram below and follow the data path between two clients through the protocol layers in the sending and receiving hosts as well as in the intermediary switch and router.
The first thing we might notice is that only the hosts implement all of the Internet protocol layers. The switch only implements two and the router, three. Indeed, the switch doesn’t care about anything other than the two bottom layers. Its goal is simply to transfer packet bits to the next network node. It must implement the link and physical layers because it may receive a packet over one physical medium (for example, coaxial cable) and have to transfer it via a different physical medium (for example, fiber). The router, however, cares about the routing of packets to the right address. Therefore, it must implement the network layer as well.
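We can summarize which layers each kind of device implements with a small lookup in Python. The names `LAYERS`, `TOP_LAYER` and `layers_implemented` are made up for this sketch, and the model is deliberately simplified to the three device types discussed here.

```python
# The five layers of the Internet model, from top to bottom.
LAYERS = ["application", "transport", "network", "link", "physical"]

# Highest layer that each kind of device implements.
TOP_LAYER = {
    "host": "application",
    "router": "network",
    "switch": "link",
}


def layers_implemented(device: str) -> list[str]:
    """List the layers a device implements, from its top layer down."""
    top = LAYERS.index(TOP_LAYER[device])
    return LAYERS[top:]


for device in TOP_LAYER:
    print(device, "->", ", ".join(layers_implemented(device)))
```

Every device implements a contiguous slice of the stack that always includes the bottom layers, since every device has to move bits over at least one physical medium.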
As we’ve seen, packets have different names depending on the layer they come from: message, segment, datagram, or frame. This is because of the data encapsulation concept.
In the diagram above, we can see that in the sending host, the application-layer message (M) is passed down to the transport layer. The transport layer protocol adds a header (Ht) to it. The header contains information that allows the receiving host’s transport layer to know which application it should deliver the packet to and if the packet bits were corrupted along the way. Together, the message and the header are a segment. So the transport layer encapsulates the application-layer message.
Now, it’s the network layer’s turn to encapsulate the transport-layer segment by adding its own header (Hn) to create a datagram. The datagram header contains among other things the addresses of the sender and the receiver. Finally, the datagram is encapsulated with the link-layer header (Hl) to create a frame.
In this way, in each layer, a packet contains two fields: the header field and the payload field. The payload is typically the packet from the layer directly above. Upon arrival at the destination, each layer strips its own header and passes the payload up to the layer above.
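The whole process can be sketched in a few lines of Python. The headers below are toy headers invented for this example and are much smaller than real TCP, IP or Ethernet headers: Ht holds a destination port and a checksum, Hn holds the two IP addresses, and Hl holds two hardware addresses. The function names are hypothetical as well, and the checksum is a 16-bit ones’ complement sum in the style of the Internet checksum.

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """16-bit ones' complement sum, used here to detect corruption."""
    if len(data) % 2:
        data += b"\x00"                    # pad to whole 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF


def encapsulate(message, port, src_ip, dst_ip, src_mac, dst_mac):
    """Wrap a message in transport, network and link headers."""
    ht = struct.pack("!HH", port, checksum16(message))        # toy Ht
    segment = ht + message
    hn = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)  # toy Hn
    datagram = hn + segment
    hl = dst_mac + src_mac                                     # toy Hl
    return hl + datagram                                       # the frame


def decapsulate(frame):
    """Strip the headers layer by layer and return the contents."""
    datagram = frame[12:]                      # link layer removes Hl
    src_ip = socket.inet_ntoa(datagram[0:4])
    dst_ip = socket.inet_ntoa(datagram[4:8])
    segment = datagram[8:]                     # network layer removes Hn
    port, check = struct.unpack("!HH", segment[:4])
    message = segment[4:]                      # transport layer removes Ht
    if checksum16(message) != check:
        raise ValueError("message was corrupted along the way")
    return port, src_ip, dst_ip, message


frame = encapsulate(b"hello", 80, "192.0.2.1", "198.51.100.7",
                    src_mac=bytes(6), dst_mac=b"\xff" * 6)
unpacked = decapsulate(frame)
```

Each layer only reads and removes its own header. The transport layer, for instance, never looks at the addresses in Hn, and the network layer never looks at the port in Ht.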
Of course, the encapsulation process can be much more complicated than that. For example, when the amount of data that must be sent is very large, the transport layer must split the message into several segments. And those could also be further split into several datagrams in the network layer! In cases like those, segments will have to be pieced together from their datagrams and the message will have to be reconstructed from its segments upon arrival.
The OSI Model
The Internet’s layered architecture model that we’ve examined above is not the only network model in existence. In the 1970s, when Internet protocols were in their early stages, several models for protocol suites were being developed. Among them was a model with seven layers: the OSI (Open Systems Interconnection) model. Despite the fact that it was never completely adopted for the Internet, this model is still used in textbooks and training courses because of its early impact on network education.
The only real difference between the Internet model and the OSI model is the separation of the application layer:
The two models share 5 layers with very similar functionalities. Let’s take a look at the two extra layers the OSI model proposes: presentation and session.
The presentation layer of the OSI model is meant to provide interpretation services for the exchanged data. For example, it might provide compression and encryption services, or even data description so that the application would not have to worry about the incoming data’s format.
The session layer could provide for data exchange delimiting and synchronization. For example, by offering a way to construct a checkpoint and recovery system.
So are these two layers not important? What if an application needs these services? In the Internet model, that’s the application’s responsibility. If the service is important, the application developer should integrate it into the application itself.
Sources and Further Reading
- Internet World Stats, World Internet Users and 2022 Population Stats [InternetWorldStats]
- Internet Live Stats, Google Search Statistics [InternetLiveStats]
- Kurose, J. F., Ross, K. W., 2013, Computer Networking: A Top Down Approach, Sixth Edition, Chapter 1: Computer Networks and the Internet, pp. 1-82.
- Site du Zéro, Tutoriel : Apprenez le fonctionnement des réseaux TCP/IP [sdz.com]
- Zuckerman E., McLaughlin A., Introduction to Internet Architecture and Institutions [harvard.edu]
- Wikipedia, Internet [Wikipedia]
- Wikipedia, Internet protocol suite [Wikipedia]