Networking
This is one of the most important topics you'll ever learn. Once you understand the hardware in your system and the operating system on top of it, next comes the networking, because without networking that very hardware and OS wouldn't be able to communicate with other networks, whether they're remote or local.
Because networking is such a massive topic, you're going to be learning the core fundamentals and then moving on. As time passes, the projects and challenges you'll work on throughout your journey will bring you a deeper understanding and a lot of hands-on experience.
Do not skip this topic. In fact, not only should you take this topic slowly, but you might also want to repeat it once or twice to make sure you really get it.
Subjects
This is what we're going to cover:
- Global Organisations
- Switching and Routing
- Network Addresses
- Protocols
Curated Materials
These materials lean heavily on Cloudflare's explanations from their Learning Center, which is absolutely fantastic. We use their resources to constantly evolve and check our own understanding of existing technologies and concepts, as well as to learn about new ones.
Let's get cracking with our first subject.
Um, that's a lot of Cloudflare content, ya know?
Like we said, Cloudflare's learning center offers a lot of excellent content that simply cannot be ignored. They're also an industry giant and know networking inside out, which makes sense when you consider how much of the Internet's traffic goes through Cloudflare's network(s):
Cloudflare is used by 75.6% of all the websites whose reverse proxy service we know. This is 17.7% of all websites.
Global Organisations
You could go the next 20-30 years without ever knowing about these organisations, but you'll be better off for having learned about their existence and what they do. You'll be ahead of your peers knowing who's really keeping the Internet working on a global scale. When researching these organisations, we'll happily admit we learned about new ones too.
Run over the following pages and get a good understanding of the organisations and their focus within the industry.
- International Telecommunication Union (ITU)
- Internet Architecture Board (IAB)
- Internet Society
- The Internet Engineering Task Force (IETF)
- Internet Corporation for Assigned Names and Numbers (ICANN)
These are important organisations that do very important, critical work to make sure the Internet operates the way that it does today.
Switching and Routing
If you want to know what the Internet is, you need to understand what a network is first. Think of a network as a group of connected computers that can communicate with each other, moving data between each "node" on the network as users make demands of those nodes (via web browsers, mobile apps, etc.).
Simply put, the Internet is a massive network of networks that are all connected to each other. That's why they call it the "Internet", because it's made of "interconnected networks."
Since computers in different networks can talk to each other thanks to the Internet, people from all over the world can exchange information really fast. Connecting to the Internet involves wires, cables, radio waves, or other types of network infrastructure. Basically, all data sent over the Internet is broken down into bits, which are then sent through these wires, cables, and radio waves at close to the speed of light. The more bits that can travel through a connection at once, the faster that Internet connection is, and the more data it can transmit or receive.
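To put rough numbers on the "more bits at once" idea, here's a minimal sketch that works out how long a file takes to move over links of different speeds. The function name and the example figures are ours, purely for illustration, and the sum ignores latency and protocol overhead, so treat the results as a best case rather than what you'd measure in practice.

```python
def transfer_seconds(size_bytes: int, bandwidth_bits_per_second: float) -> float:
    """Idealised time to move size_bytes over a link of the given bandwidth."""
    bits = size_bytes * 8  # link speeds are quoted in bits, file sizes in bytes
    return bits / bandwidth_bits_per_second


if __name__ == "__main__":
    one_gigabyte = 1_000_000_000
    links = [("10 Mbit/s", 10e6), ("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)]
    for label, bits_per_second in links:
        seconds = transfer_seconds(one_gigabyte, bits_per_second)
        print(f"{label}: {seconds:.0f} seconds")
```

Note the factor of eight: a "100 megabit" connection moves at most 12.5 megabytes per second.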
Switching operates at the Local Area Network (LAN) level. Check out this resource from Cloudflare on the subject:
Routing, on the other hand, operates by enabling traffic to move between LANs. Again, Cloudflare has us covered here with their amazing content:
Network Addresses
Network addresses come in two parts (that we care about): IP addresses and DNS (the Domain Name System). We cover DNS below, in Protocols, so let's focus on IP addresses and IP routing.
The term "IP" stands for Internet Protocol, which is essentially a set of rules that allows devices to communicate over the Internet. With billions of people accessing the Internet daily, unique identifiers are needed to keep track of who is doing what. This is where the Internet Protocol comes in by assigning IP numbers to each device accessing the Internet.
Think of a computer's IP address as the equivalent of a postal address for a house. If someone wants to send a letter to a friend, they need to write the recipient's postal address on the envelope. Without that address, the postman will have no idea where to deliver the letter.
For instance, when a user enters a domain name like google.com into their web browser, this initiates a request to Google's web server for content (like the Google homepage). Upon receiving the request, Google needs to know where to send the website content. The request will contain the user's IP address so that Google can send a response back to the user's device, which will then display the content in the user's web browser.
Turning the name google.com into the IP address of Google's server in the first place is the job of the Domain Name System, which we cover later on.
Check out this Cloudflare article:
The article goes into a lot more detail, but this is an important topic. You're going to be working with IPs a lot.
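If you'd like to poke at IP addresses yourself, Python's standard ipaddress module is a handy way to do it. The sketch below is only an illustration: the describe helper is a name we made up, 192.168.1.10 is a typical home-network address we picked as an example, and 142.251.12.138 is the Google address used elsewhere on this page.

```python
import ipaddress


def describe(address: str) -> dict:
    """Summarise a few properties of an IPv4 or IPv6 address."""
    ip = ipaddress.ip_address(address)  # raises ValueError if it isn't valid
    return {
        "version": ip.version,        # 4 or 6
        "is_private": ip.is_private,  # True for ranges reserved for local use
        "is_loopback": ip.is_loopback,
    }


if __name__ == "__main__":
    for example in ["192.168.1.10", "142.251.12.138", "::1"]:
        print(example, describe(example))

    # An address either belongs to a given network or it doesn't
    home_network = ipaddress.ip_network("192.168.1.0/24")
    print(ipaddress.ip_address("192.168.1.10") in home_network)
```

Going back to the postal analogy, the network is the street and the address is the house number on it.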
Optional read
To finalise the topics of switching, routing, and network addresses, consider checking out this incredible resource:
Networking Models
Let's wrap all of that up with the following resource that explains how various networking "layers" work, as well as covering two critical networking models: the OSI networking model and the TCP/IP model.
These models help us diagnose and break down a connection on the networking stack so we can discover potential problems. Most of the time, you'll be dealing with layers 2 and 4 in the TCP/IP model, and layers 3, 4, and 7 in the OSI model.
Protocols
Despite what we've previously covered about switching, routing, IP addresses, and so on, it's protocols that sit on top of all of this and make things work. Put another way: getting the traffic from A to B is all fine, but you have to describe your data in a way that the receiving party at end-point B understands, otherwise what's the point?
Protocols are how we agree upon the exchange of data across computer networks. There are a lot of protocols, but we only care about the important ones:
- TCP
- IP
- TCP/IP (they're used in conjunction with each other, but are two separate protocols)
- UDP
- DNS
- HTTP(S)
- ICMP
- DHCP
- NAT
Let's look at some resources for each of these protocols.
Transmission Control Protocol (TCP)
The Transmission Control Protocol is the one you're going to see and hear the most about. It's everywhere. It's extremely popular with most applications because it comes with a lot of guarantees around the data you're sending, such as ensuring it reached the remote location (and having that remote location confirm it got the data.)
Make sure you get a good grasp of this topic.
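To see a TCP connection in action without leaving your own machine, here's a minimal sketch using Python's standard socket module. It starts a tiny echo server on the loopback address, connects to it, sends a message, and reads the same bytes back. The function names are ours, and a real server would loop over many connections and handle errors properly.

```python
import socket
import threading


def echo_once(server_sock: socket.socket) -> None:
    """Accept a single connection and send back whatever arrives."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def tcp_round_trip(message: bytes) -> bytes:
    """Send message to a local echo server over TCP and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
        server.listen(1)
        port = server.getsockname()[1]

        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()

        # create_connection performs the TCP handshake with the server
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            # TCP is a byte stream; one recv is enough for a tiny message
            reply = client.recv(1024)

        worker.join()
    return reply


if __name__ == "__main__":
    print(tcp_round_trip(b"hello over TCP"))
```

Nothing in this code deals with acknowledgements or retransmission. TCP takes care of that inside the operating system, which is exactly why applications like it so much.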
Internet Protocol (IP)
TCP on its own can't quite get the data to the remote location. For that, we need something that can route the traffic and knows where the servers are on the network. Without IP, you won't be able to tell the networking stack in the operating system where your target host is.
TCP/IP
So because TCP on its own cannot route the traffic, and IP on its own cannot encapsulate and control how traffic is moved across a network, we combine the two to get TCP/IP. That's the term you're going to see a lot in documentation and elsewhere: TCP/IP.
Together, this combination of protocols allows us to encapsulate data into a stream of packets (TCP), and then send it to a remote host using an efficient path across a network or even many (many) networks (IP).
User Datagram Protocol (UDP)
TCP has a cousin of sorts: UDP. UDP is almost the complete opposite of TCP in that it doesn't provide any guarantees at all except for one: it'll encapsulate your packet and send it off. Whether it gets to its destination or not is none of its concern. All it cares about is encapsulating and sending as quickly as possible - and that's its strong point: speed.
UDP is used heavily in games, especially MMOs, because it can be used to send a lot of traffic very quickly.
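For comparison with TCP, here's a minimal UDP sketch, again using Python's standard socket module on the loopback address. The function name is ours. There's no connection, no handshake, and no confirmation: the sender just fires a datagram at an address and moves on.

```python
import socket


def udp_send_receive(message: bytes) -> bytes:
    """Fire one datagram at a local UDP socket and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
        receiver.settimeout(5)           # don't wait forever if nothing arrives

        # No connect, no handshake: just address the datagram and send it
        sender.sendto(message, receiver.getsockname())

        data, _addr = receiver.recvfrom(1024)
    return data


if __name__ == "__main__":
    print(udp_send_receive(b"hello over UDP"))
```

On loopback the datagram will practically always arrive. Across a real network it might not, and the sender would never be told, which is why the receiver has a timeout.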
Domain Name System (DNS)
The Internet Protocol gives us IP addresses, in both IPv4 and IPv6 form, but how can a user be expected to remember those values? Imagine having a conversation with someone and recommending a website to them:
"It's so good! Check it out! Just head over to https://142.251.12.138"
That's a Google IP
You'll likely get an HTTPS certificate error. Don't worry about it, it's just an example.
That's obviously not going to work, so we have DNS. Now we can use human-readable, and memorable, names like https://upload.academy instead of an IPv4 address (it would be even worse if it was an IPv6 address!)
In AWS, we use Route 53 to manage our DNS zones and records. It's a critical service, and AWS extends it with features that aren't in the DNS spec.
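You can ask your operating system to do a lookup for you from Python. The sketch below uses the standard socket.getaddrinfo call, which goes through the system's resolver (your hosts file and your configured DNS servers). The resolve helper is a name we made up for illustration.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the unique IP addresses the system resolver finds for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the first item of sockaddr is the IP address as a string
    return sorted({info[4][0] for info in infos})
```

Calling resolve("upload.academy") on a machine with Internet access should return one or more addresses, and those can change over time, which is exactly why we hand out names instead of numbers.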
Hypertext Transfer Protocol/Secure (HTTPS)
Once you've got TCP, UDP, IP, and DNS in place, you can start using a combination of those protocols to deliver other protocols that operate at the Application Layer in the networking stack. Remember that protocols build on each other, so an HTTPS connection looks like this (starting from the bottom):
- IP (Internet Layer)
- TCP (Transport Layer)
- TLS (Application Layer)
- HTTP (Application Layer)
These are "unwrapped" in order to make an HTTPS connection possible, and with that, the delivery of websites, resources from the Internet, etc. This very content you're reading was brought to you by HTTPS (and the other layers below it.)
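Here's a toy sketch of that wrapping and unwrapping, with TLS sitting between TCP and HTTP. It's only an illustration: real protocols add binary headers (and TLS encrypts what it carries) rather than wrapping text in labelled brackets, and the function names are ours.

```python
LAYERS = ["HTTP", "TLS", "TCP", "IP"]  # top of the stack first


def encapsulate(payload: str) -> str:
    """Wrap the payload in each layer on the way down the stack (sender)."""
    wrapped = payload
    for layer in LAYERS:
        wrapped = f"{layer}[{wrapped}]"
    return wrapped


def unwrap(wrapped: str) -> str:
    """Peel the layers off again on the way up the stack (receiver)."""
    for layer in reversed(LAYERS):
        prefix = f"{layer}["
        if not (wrapped.startswith(prefix) and wrapped.endswith("]")):
            raise ValueError(f"expected a {layer} wrapper")
        wrapped = wrapped[len(prefix):-1]
    return wrapped


if __name__ == "__main__":
    on_the_wire = encapsulate("GET /")
    print(on_the_wire)          # IP[TCP[TLS[HTTP[GET /]]]]
    print(unwrap(on_the_wire))  # GET /
```

The outermost wrapper is the one the network deals with first, which is why a router only needs to understand IP to move the packet along.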
Internet Control Message Protocol (ICMP)
This protocol works alongside IP at the Internet Layer, rather than at the Transport Layer where TCP and UDP live. Its messages are carried directly inside IP packets, and it's used for some pretty interesting things, like reporting errors. What you'll mostly use it for is ping... and that's about it. It's unlikely you'll ever use it for anything else. It's pretty easy to understand.
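When you run ping, it sends an ICMP "echo request" message (type 8) and waits for an "echo reply" (type 0). Sending one yourself needs a raw socket, which usually requires administrator privileges, so this sketch only builds the bytes of an echo request and checks its checksum. The function names are ours; the header layout and checksum follow the ICMP and Internet checksum specifications (RFC 792 and RFC 1071).

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold any carry bits back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: type 8, code 0, checksum, id, sequence."""
    # The checksum is calculated with the checksum field itself set to zero
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


if __name__ == "__main__":
    packet = build_echo_request(identifier=1, sequence=1, payload=b"hello")
    print(packet.hex())
    # A receiver recomputes the checksum over the whole message; 0 means intact
    print(internet_checksum(packet))
```

If you capture a real ping in Wireshark, you should be able to spot the same fields: type, code, checksum, identifier, and sequence number.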
Dynamic Host Configuration Protocol (DHCP)
Note
The video embedded in the page is good, but goes into a bit too much detail. Definitely more detail than you need to know. If you have the time, however, you can't go (totally) wrong watching it.
Your local system, the one you're using right now, very (very) likely got its IP address from a DHCP server on your local network. When you signed up with your ISP, they most likely sent you what we call a "residential gateway", and these include a DHCP server. Without a DHCP server, you'd have to manually configure every device on your network with a static IP address.
DHCP is even used in AWS inside of VPCs (we'll cover these later.) When you create an EC2 Instance, it gets a private address inside the VPC from a DHCP server. They're sort of a big deal.
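To get a feel for what a DHCP server keeps track of, here's a toy sketch of an address pool. It is not the DHCP protocol itself: real DHCP involves a discover, offer, request, and acknowledge exchange over UDP, plus leases that expire. The LeasePool class, the network range, and the MAC addresses are all made up for illustration.

```python
import ipaddress


class LeasePool:
    """Hands out addresses from a network, one per device (MAC address)."""

    def __init__(self, network: str):
        # hosts() skips the network and broadcast addresses
        self._free = list(ipaddress.ip_network(network).hosts())
        self._leases = {}  # MAC address -> leased IP address

    def request(self, mac: str) -> str:
        """Return the device's existing lease, or give it the next free address."""
        if mac not in self._leases:
            if not self._free:
                raise RuntimeError("address pool exhausted")
            self._leases[mac] = self._free.pop(0)
        return str(self._leases[mac])


if __name__ == "__main__":
    pool = LeasePool("192.168.1.0/29")
    print(pool.request("aa:bb:cc:00:00:01"))  # first free address
    print(pool.request("aa:bb:cc:00:00:02"))  # next free address
    print(pool.request("aa:bb:cc:00:00:01"))  # same device, same address
```

The important idea is the bookkeeping: one place on the network decides who gets which address, so you don't have to.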
Network Address Translation (NAT)
Believe it or not, we've run out of public IPv4 addresses. We started out with roughly 4.3 billion (2^32), but with close to seven billion people on the planet, about three billion of them on the Internet, and many of those using several devices each, 4.3 billion turned out to not quite cut it. How do we solve this?
Instead of every device in our home networks, for example, having its own public IP address, we give them private, "non-routable" addresses (see IP, above) and then we use Network Address Translation (NAT) to convert the private IP into a public one, and back again, whenever a device with an internal address needs to talk to another device over the public Internet.
NAT is also heavily used in AWS and other cloud providers. We'll be using it a lot.
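Here's a toy sketch of the translation table a home router might keep. It models the most common flavour of NAT, where many private addresses share one public address and are told apart by port number (often called port address translation). The Nat class, the starting port, and the addresses are all made up for illustration; 203.0.113.5 comes from a range reserved for documentation.

```python
class Nat:
    """Maps (private IP, private port) pairs onto ports of one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._out = {}   # (private_ip, private_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int) -> tuple:
        """Translate an outgoing connection to the public address and a port."""
        key = (private_ip, private_port)
        if key not in self._out:
            self._out[key] = self._next_port
            self._back[self._next_port] = key
            self._next_port += 1
        return (self.public_ip, self._out[key])

    def inbound(self, public_port: int) -> tuple:
        """Translate a reply arriving on a public port back to the device."""
        return self._back[public_port]


if __name__ == "__main__":
    nat = Nat("203.0.113.5")
    print(nat.outbound("192.168.1.10", 51000))
    print(nat.outbound("192.168.1.11", 51000))
    print(nat.inbound(40001))
```

Two devices can use the same private port at the same time because the router gives each one a different public port on the way out.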
Project
Because this topic is so big, we'll explore a few key projects to help you understand what's going on and hopefully visually see networking in action.
To complete these projects, you'll need to download (if you don't already have them) and familiarise yourself with the following tools:
- Wireshark
- Python 3
- curl
Once you have these in place and can use them (an exercise left to you, dear reader), then we can get on with some cool projects.
TCP Sockets
Using Wireshark, tell us what TCP sockets you have open on your local system.
- How many of them are listening for inbound connections?
- How many are outbound and are connected to a remote server?
- Of the sockets you have listening locally, what are the Application Level protocols they're listening for?
Write a short report on how you found the sockets, what tool(s) you used and options, and what they're open for.
Some things can be ignored
Some sockets might have a special meaning, so if you cannot find information about the protocol, don't worry about it. Just move on.
Remote HTTP Traffic
Now we're going to use a special Python 3 module that will allow us to create a local web server that we can then use to analyse the traffic.
Place the following content in a file called index.html somewhere in an empty folder of your choice:
<html><head><title>Upload Academy Learning</title></head><body><h1>The Answer</h1><p>It's 42.</p></body></html>
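Now open a terminal in the location of the file and run this: python3 -m http.server 9000 (if you leave the port number off, the server listens on port 8000 by default, and the rest of this project assumes port 9000)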
Here's how I did this on Linux:
/tmp/server
$ cat > index.html
<html><head><title>Upload Academy Learning</title></head><body><h1>The Answer</h1><p>It's 42.</p></body></html>
/tmp/server
$ python3 -m http.server 9000
Serving HTTP on 0.0.0.0 port 9000 (http://0.0.0.0:9000/) ...
Now I have an HTTP server running locally, but what now?
We want you to analyse the traffic to that server:
- Using Wireshark, filter for HTTP traffic to localhost:9000 (you should know what protocol is being used: TCP or UDP?)
- Now use curl to send simple requests to the server, like curl http://localhost:9000/index.html - what do you get back?
- What happens if you request a missing or nonexistent resource?
- What happens if you add an image to the same directory as index.html and then request it?
- Use the -I flag on curl and explain what you're seeing - break down each header in the response.
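If you'd like to see the same request and response from inside Python, here's a minimal sketch that does roughly what curl -I does: it sends a HEAD request and returns the status code and headers. To keep it self-contained, it starts its own throwaway copy of Python's built-in web server on a free port in a temporary folder, rather than relying on the one you started above. The head_request name is ours, and this is a convenience for experimenting, not a replacement for doing the exercise with curl and Wireshark.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading

PAGE = ("<html><head><title>Upload Academy Learning</title></head>"
        "<body><h1>The Answer</h1><p>It's 42.</p></body></html>")


def head_request(path: str = "/index.html") -> tuple:
    """Serve PAGE from a temporary folder and send one HEAD request to it."""
    with tempfile.TemporaryDirectory() as folder:
        pathlib.Path(folder, "index.html").write_text(PAGE)
        handler = functools.partial(
            http.server.SimpleHTTPRequestHandler, directory=folder)
        # Port 0 asks the OS for any free port
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            port = server.server_address[1]
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("HEAD", path)  # HEAD asks for headers only, no body
            response = conn.getresponse()
            status, headers = response.status, dict(response.getheaders())
            conn.close()
        finally:
            server.shutdown()
            server.server_close()
    return status, headers


if __name__ == "__main__":
    status, headers = head_request()
    print(status)
    for name, value in headers.items():
        print(f"{name}: {value}")
```

Try head_request("/missing.html") as well and compare the status code with what curl showed you for a nonexistent resource.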
Whilst doing all of the above, you should be using Wireshark to analyse and break down the traffic you're seeing. Write a small report on the protocols being used for each request you're making:
- How many packets are being sent?
- What protocol is being used?
- Show us a capture of a single packet and explain some of the details you're seeing
Next
Without the ability to secure networks using CyberSecurity concepts, you'll produce insecure and vulnerable solutions. Let's learn a few key CyberSec concepts so we can prevent that from happening.