Get started in web development with me — Part 1: Internet, Web, Hypertext, HTML and URLs

Gabriel Cruz
5 min read · Jan 9, 2019

I bet you thought I was a mastermind of web development who was going to walk you through the incredible world of the internet, teach you about the days when we only had a couple dozen computers on the “Internet” and how I witnessed all that explode into billions of devices worldwide. Well, you guessed about 30% of that right.

Lost in the series? Here are Part 1, Part 2, Part 3, Part 4 and Part 5.

“What am I reading then?”

Well, no. I’m no expert on web development, I don’t even have the slightest experience with that crap. That’s right, I know absolutely nothing about web development, and that’s the thing.

My plan here is to document my learning process (and mainly my mistakes) so you can have a different perspective on it. Sometimes when I’m learning with someone who knows more about the subject than I do, I tend to think things like “why am I so bad at this?”, “should this be that hard?” or, and this is the worst of them, “this looks so much easier for other people, why is it so hard for me?”.

What we usually don’t realize is that different people have different problems, meaning that you’ll have trouble with things I won’t, and vice-versa. My role here is to be a complete piece of sh*t so you can be sure that if I did it you can do it too.

Who am I?

Okaaaay, I wasn’t entirely honest with you. I’m not a complete useless piece of human garbage. Here’s some of my background (I won’t assume you know any specific stuff, but familiarity with some programming language is advised):

I’m a fourth-year Computer Science student, I have fairly good knowledge of programming languages such as Python and C, I’ve taken courses such as Computer Networks (but didn’t perform well, you’ll see), Object Orientation (that thing with classes and objects), Computer Architecture and Organization, Operating Systems and Parallel Programming.

If you have never heard about any of these things, don’t worry. I’ve never been that great of a student, so my knowledge on all of these topics is VERY shaky, trust me on that.

The boring but important part

Ok great! Shall we…?

I actually know a little bit of this part as it is highly conceptual, but very simple, and, to be honest, it makes a lot of sense to me.

So what is web development? What is the web? What is the Internet?

The Internet is the infrastructure: the thing that connects computers and other devices together. The Web is one of the things that runs on top of the Internet: a worldwide collection of pages that anyone with a connection can reach (World Wide Web, get it?). On the web there are things like Wikipedia, Reddit, Facebook, Google, Amazon. All these websites are linked together.

By the way, what do I mean by linked? Well, it means that a website has a reference, or a hyperlink, to another one. Hyperlinks are those blue underlined words or sentences that get you from one place to another on the web, like this.

Hypertexts and HTML

Documents that have hyperlinks in them are called hypertexts.

Hypertexts are written by us, humans. They’re plain text you can read, not unreadable zeroes and ones like binary files. Anyone can write a hypertext using HTML (HyperText Markup Language). Here’s an example of an HTML file:

<html>
  Here's an example HTML page whoaaaaaaaaaah!
  Here's a hyperlink to <a href="https://google.com"> Google </a>
</html>

There’s some other stuff you can do with HTML, but I won’t go into a lot of detail. I suggest you skim through some tutorial, learn about tag syntax, anchors and body/head parts of an HTML document. I don’t think we need to be experts on HTML just now.
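If you know a bit of Python, here's a small sketch of how a program could dig the hyperlinks out of a page like the one above. It uses the standard library's html.parser module. The LinkCollector class and the page variable are names I made up for this example, and I wrote the link with its full https:// address so it points at the actual website.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [("href", "https://google.com")]
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# A copy of the example page, with the link address written out in full.
page = """<html>
  Here's an example HTML page whoaaaaaaaaaah!
  Here's a hyperlink to <a href="https://google.com"> Google </a>
</html>"""

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

This is more or less the first thing a browser has to do too: read the text, spot the tags, and decide what each one means.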

Okay, so we wrote our first HTML file. It doesn’t look like a Wikipedia page at all! Well, that’s because we’re reading it with the wrong glasses. In other words, we need a program that opens and renders an HTML file correctly: a browser!

Let’s name our file index.html (the .html is the file extension; it tells us that this is an HTML document) and open it with our browser. Any browser should do; after all, this is a very simple HTML file. I’ll tell the browser we’re looking for a file and then type the path to the file on my computer (which is /home/gabriel/html/index.html):

file:///home/gabriel/html/index.html
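Going from a path on disk to that address is mostly a matter of sticking file:// in front of it. Here's a minimal sketch in Python, assuming an absolute Unix-style path like mine. The function name path_to_file_url is made up for this example, and quote is only there to escape characters such as spaces.

```python
from urllib.parse import quote


def path_to_file_url(path: str) -> str:
    """Hypothetical helper: turn an absolute Unix-style path into a file URL."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path starting with '/'")
    # quote() leaves '/' alone by default and escapes things like spaces.
    return "file://" + quote(path)


print(path_to_file_url("/home/gabriel/html/index.html"))
```

Notice the three slashes: two belong to file:// and the third is the start of the path itself.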

HTTP and the Web

But wait! That’s very close to what we type when we’re looking for a website. Well, not today, but a couple of years ago people would actually type something like http://google.com if they wanted to access Google. Today our browsers fill in that first part for us. This is not very different from what we typed there, right? They both seem to be something like:

stuff_1://stuff_2

But what is stuff_1 and what is stuff_2? Your guess is as good as mine, let’s do some research… Looks like a URL has three main parts:

protocol://hostname/fileinfo

Okay, so it looks like stuff_1 is actually a protocol name (for example, ‘file’ or ‘http’) and stuff_2 is actually two things! But wait a second, when we look up a local file (a file on our computer, or localhost) we don’t need to specify a hostname, and when we look up a website we don’t need to specify a fileinfo. The sons of b***es lied to us! Or did they…?
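One way to poke at this is to let Python split the two URLs we've seen so far. This is just a sketch using urlsplit from the standard library. Keep in mind that Python calls the three parts scheme, netloc and path instead of protocol, hostname and fileinfo, and that split_url is a name I made up.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple[str, str, str]:
    """Hypothetical helper: return (protocol, hostname, fileinfo) for a URL."""
    parts = urlsplit(url)
    # urllib's names for the three parts: scheme, netloc, path
    return parts.scheme, parts.netloc, parts.path


for url in ["file:///home/gabriel/html/index.html", "http://google.com"]:
    print(url, "->", split_url(url))
```

In the first URL the hostname comes back as an empty string, and in the second one the fileinfo does. So the parts are all still there, some of them are just allowed to be empty.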

We know the hostname for our machine is localhost (trust me on this one, I’ll prove it to you in a sec), so let’s try looking up our file using that as the hostname.

file://localhost/home/gabriel/html/index.html

FUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU —

It works! So they didn’t lie to us. Or did they? When I go to google.com my browser tells me it’s accessing the URL https://google.com . A-HA! They lied to us! …or did they? Let’s try to access https://google.com/index.html .

No luck. Dammit.

Is this some crap only Google does? Let’s try https://wikipedia.org/index.html .

Nothing. F*ck. What about https://amazon.com/index.html ?

It works!!! I get right to the landing page for amazon, yay!!

So what’s going on? I have no idea, lol. Maybe this is some dark magic involving JavaScript that we’ll figure out when we get to it.

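How did I know that looking for index.html would (probably) work? Traditional websites have a landing (or index) page, and by convention that file is usually called index.html. As far as I can tell, it’s the web server (not the browser) that falls back to this file when the URL doesn’t name one, and that behavior is configurable, which might be why it didn’t work for Google and Wikipedia.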
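Here's a rough sketch of that fallback idea in Python. This is not how any particular website actually does it. It's a toy function (file_to_serve and default_page are names I made up) that assumes a simple setup where URLs map straight to files and index.html is the default page.

```python
from urllib.parse import urlsplit


def file_to_serve(url: str, default_page: str = "index.html") -> str:
    """Hypothetical helper: pick the file for a URL, assuming URLs map to files."""
    path = urlsplit(url).path
    # No path at all, or a path ending in '/': fall back to the default page.
    if path == "" or path.endswith("/"):
        return (path or "/") + default_page
    # Otherwise the URL already names a file, so use it as-is.
    return path


print(file_to_serve("https://amazon.com"))
print(file_to_serve("https://amazon.com/index.html"))
```

Under these assumptions both calls end up at the same file, which matches what I saw with Amazon. Plenty of sites don't map URLs to files like this at all, so don't expect the trick to work everywhere.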

This post is getting too long, I hope the next ones aren’t as long as this since they don’t need all that introductory bull crap. Thanks for reading!

PLEAAAAASE CORRECT ME IF I’M WRONG

This is my first time writing an article so I apologize for any grammar/spelling errors, non-functional pieces of code (perhaps I didn’t specify something because I thought it was unimportant, but it actually was, and everything blew up on your machine). Anyways, feel free to comment if you followed along and something didn’t work so I can improve the post and we can learn together.


Gabriel Cruz

Computer Science student at University of São Paulo. OSS/Linux enthusiast, trailing spaces serial killer, casual pentester