What is HTTP?
HTTP stands for Hypertext Transfer Protocol. What is a protocol? A protocol is a set of rules that tells two machines how to talk to each other (similar to the rules of a spoken language).
HTTP is a stateless protocol, which means the server doesn’t keep any state from one request to the next. HTTP is also described as connectionless: the client (your browser/machine) and the server are only aware of each other during the current request.
Underneath, HTTP operates over TCP (Transmission Control Protocol), which is what actually opens the connection and keeps it alive.
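To make "a set of rules" a bit more concrete, here is a minimal sketch of what an HTTP/1.1 request looks like when written out as bytes, and how the first line of a response can be read back. The helper names `build_request` and `parse_status_line` are made up for this post, and the sketch ignores request bodies and most headers. Notice that the request carries everything the server needs (method, path, host), which is what being stateless looks like in practice.

```python
from typing import Tuple


def build_request(method: str, path: str, host: str) -> bytes:
    """Serialize a minimal HTTP/1.1 request with no body (illustrative helper)."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",              # Host is the one header HTTP/1.1 requires
        "",                           # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(raw_response: bytes) -> Tuple[str, int, str]:
    """Split the first line of a response into (version, status code, reason)."""
    status_line = raw_response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_request("GET", "/index.html", "example.com"))
    print(parse_status_line(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"))
```

These bytes are what would be written to the TCP connection; the sketch stops short of opening one.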
HTTP/1.0 → A new TCP connection was required for each request/response pair. This leads to poor performance because creating a TCP connection for every request is expensive in both time and resources.
HTTP/1.1 (persistent connections are the default) → We still have to wait for a response before sending the next request, but we can have multiple request/response pairs on a single connection.
Also, we can open multiple connections, but most browsers support only up to 6 parallel connections per domain.
To work around this limit, a technique called domain sharding is used, where resources are delivered from multiple subdomains. For example, you may have seen images being served from i1.wp.com, i2.wp.com on WordPress.com.
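As a rough illustration of domain sharding, the sketch below maps a resource path to one of several subdomains. Everything here is an assumption made for the example: the `shard_host` name, the `i{n}.example.com` pattern and the use of CRC32 are not taken from WordPress.com or any real site. The one property worth noting is that the same path always lands on the same subdomain, so the browser can keep caching it under a single URL.

```python
import zlib


def shard_host(path: str, shards: int = 2, pattern: str = "i{n}.example.com") -> str:
    """Pick a shard hostname for a resource path (hypothetical scheme)."""
    # crc32 gives the same value on every run, unlike Python's built-in
    # hash() for strings, which is randomized per process.
    index = zlib.crc32(path.encode("utf-8")) % shards
    return pattern.format(n=index + 1)


if __name__ == "__main__":
    for resource in ["/img/logo.png", "/img/banner.jpg", "/img/avatar.gif"]:
        print(resource, "->", shard_host(resource))
```

With 2 shards and 6 connections per domain, the browser could open up to 12 connections in parallel for these images.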
HTTP/1.1 (with pipelining) → Multiple requests can be sent without waiting for a response, but the responses have to come back in the order they were requested. This is poorly supported by browsers and servers and is almost never used.
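The in-order rule of pipelining can be sketched with a few lines of arithmetic. Assume we know the moment each response becomes ready on the server, listed in request order; the function below (a hypothetical helper, not part of any library) works out the earliest moment each response can actually be delivered when no response is allowed to overtake an earlier one.

```python
from typing import List


def pipelined_delivery_times(ready_times: List[float]) -> List[float]:
    """Earliest delivery time of each response when responses must stay in request order."""
    delivered = []
    latest = 0.0
    for ready in ready_times:
        # A response cannot go out before every earlier response has gone out.
        latest = max(latest, ready)
        delivered.append(latest)
    return delivered


if __name__ == "__main__":
    # The first response is slow (ready at t=3), so the two fast ones wait for it.
    print(pipelined_delivery_times([3.0, 1.0, 2.0]))
```

In the example, the second and third responses are ready at t=1 and t=2 but are held until t=3, because the first response is still pending.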
HTTP/2 (multiplexing) → Multiple requests can be sent without waiting for a response, and the responses can come back in any order. This vastly improves performance. It can also break the responses into smaller pieces and send each piece as soon as it is ready.
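Here is a toy model of multiplexing, assuming each response is cut into small pieces tagged with a stream id, mixed together on one connection, and put back together on the other side. The `Frame` tuple, the function names and the round-robin ordering are all simplifications chosen for this sketch; real HTTP/2 frames carry more fields, and a real server decides for itself which stream to send from next.

```python
from itertools import zip_longest
from typing import Dict, List, Tuple

Frame = Tuple[int, bytes]  # (stream id, one piece of that stream's payload)


def split_into_frames(stream_id: int, payload: bytes, frame_size: int) -> List[Frame]:
    """Cut one response into pieces of at most frame_size bytes."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]


def interleave(streams: Dict[int, bytes], frame_size: int) -> List[Frame]:
    """Put frames from all streams onto one 'connection', round-robin (an assumption)."""
    per_stream = [split_into_frames(sid, data, frame_size) for sid, data in streams.items()]
    wire: List[Frame] = []
    for one_round in zip_longest(*per_stream):
        # Shorter streams run out first; zip_longest pads them with None.
        wire.extend(frame for frame in one_round if frame is not None)
    return wire


def reassemble(wire: List[Frame]) -> Dict[int, bytes]:
    """Rebuild each response by appending frames that share a stream id."""
    responses: Dict[int, bytes] = {}
    for stream_id, piece in wire:
        responses[stream_id] = responses.get(stream_id, b"") + piece
    return responses


if __name__ == "__main__":
    streams = {1: b"<html>page</html>", 3: b"body{}", 5: b"console.log(1)"}
    wire = interleave(streams, frame_size=4)
    print(wire)
    print(reassemble(wire) == streams)
```

Because every frame says which stream it belongs to, the receiver doesn't care in what order the frames of different streams arrive.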
HTTP/2 (with push) → Let’s say we request index.html. The server can see that this file needs a few more files, like style.css and scripts.js, and automatically “pushes” these to us. Thus the browser doesn’t need to make separate requests for them.
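How a server decides what to push is up to the server. One possible way, sketched below purely as an assumption, is to scan the HTML for stylesheets and scripts. The `resources_to_push` helper is made up for this post; it only finds the candidate files and does not perform any actual push.

```python
from html.parser import HTMLParser
from typing import List


class _AssetCollector(HTMLParser):
    """Collects stylesheet and script URLs while the HTML is parsed."""

    def __init__(self) -> None:
        super().__init__()
        self.assets: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "stylesheet" and attributes.get("href"):
            self.assets.append(attributes["href"])
        elif tag == "script" and attributes.get("src"):
            self.assets.append(attributes["src"])


def resources_to_push(html: str) -> List[str]:
    """Return the files a page refers to, in document order (hypothetical helper)."""
    collector = _AssetCollector()
    collector.feed(html)
    return collector.assets


if __name__ == "__main__":
    page = (
        '<html><head><link rel="stylesheet" href="style.css">'
        '<script src="scripts.js"></script></head><body>Hi</body></html>'
    )
    print(resources_to_push(page))
```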
HTTP Connections Analogy
Let’s say you want to order a laptop, a mouse and a keyboard from our imaginary store Anazom. You can only make phone calls to order something, and note that making a call is expensive (the fewer calls you make, the better).
HTTP/1.0 → You make a call. Order a laptop. You get the laptop. The call ends. You make a new call to order a mouse. You get the mouse. The call ends. Same thing for the keyboard. You make 3 calls here.
HTTP/1.1 (with persistent connection) → You make a call. Order a laptop. Once you get the laptop, you order a mouse. Once you get the mouse, you order a keyboard. The call ends.
You make 1 call here but wait for each order to complete before making another order.
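The difference between the HTTP/1.0 and HTTP/1.1 analogies above comes down to simple arithmetic. In the sketch below, `setup_cost` stands for the cost of making a call (opening a connection) and `request_cost` for the time one order takes. Both are made-up numbers in arbitrary units, and the function names are just for illustration.

```python
def total_time_http10(n_requests: int, setup_cost: float, request_cost: float) -> float:
    """HTTP/1.0: a new call (connection) for every single order."""
    return n_requests * (setup_cost + request_cost)


def total_time_http11_persistent(n_requests: int, setup_cost: float, request_cost: float) -> float:
    """HTTP/1.1 persistent: one call, orders placed one after another."""
    if n_requests == 0:
        return 0  # nothing to order, so no call is made at all
    return setup_cost + n_requests * request_cost


if __name__ == "__main__":
    # 3 orders (laptop, mouse, keyboard), a call costs 2 units, an order takes 1 unit.
    print(total_time_http10(3, 2, 1))
    print(total_time_http11_persistent(3, 2, 1))
```

With these numbers, HTTP/1.0 pays for the call three times (9 units in total), while the persistent connection pays for it once (5 units). The time spent waiting for each order is the same in both cases.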
HTTP/1.1 (with pipelining) → You make a call. Order a laptop, mouse and keyboard. But they have to arrive in the same order. If the 3rd order is ready and the 2nd order isn’t complete, the 3rd one waits. The entire connection remains FIFO (first in, first out), which can lead to HOL (head-of-line) blocking.
You make 1 call here and send all orders at the same time but the delivery of each order is blocked till all the previous orders are delivered.
HTTP/2 (with multiplexing) → You make a call. Order a laptop, mouse, keyboard. They don’t have to arrive in the same order. Whichever order is ready will be sent to you asap.
In fact, if the store determines that the keys of the keyboard are ready but the rest of the keyboard isn’t, it’ll send the keys separately first and then the rest of the keyboard. This isn’t practical in the real world but works very well in the computing world. It means your second order was broken into smaller orders and sent back piece by piece (reducing wait time).
HTTP/2 (with push) → You make a call. Order a laptop. With push, the store can read your first order and understand that you are about to ask for a mouse and a keyboard. So it sends a laptop, a mouse and a keyboard without you making two separate requests (for mouse and keyboard).
HTTP/3 → I’m writing this post in Nov 2019. At this point, HTTP/3 is pretty new, so I admit I am not completely up to date with this protocol. Based on my limited understanding, HTTP/3 works on top of something called QUIC (Quick UDP Internet Connections), a transport protocol that runs over UDP (User Datagram Protocol) and adds things like congestion control on top of it. Again, I don’t know much about this, so please Google it if you’re interested in knowing more.
Reference: https://stackoverflow.com/questions/36517829/what-does-multiplexing-mean-in-http-2
Refer to this sweet animation, which explains all this beautifully: https://freecontent.manning.com/animation-http-1-1-vs-http-2-vs-http-2-with-push/