Monday, July 30, 2012

Quick Overview Of TCP


Introduction


As previously mentioned on a number of occasions, TCP is one of the two protocols that live at the Transport layer and is used to carry data from one host to another. What makes TCP so popular is the way it works when sending and receiving data. Unlike UDP, TCP checks every segment it receives for errors and arranges retransmission of anything that is lost or corrupted, all in an effort to avoid data corruption.
Some common protocols that use TCP are: FTP, Telnet, HTTP, HTTPS, SMTP and POP3, while DNS uses it for zone transfers and for responses too large to fit in a UDP datagram. Let's have a closer look at the main characteristics of this wonderful protocol.
When people refer to "TCP/IP", remember that they are talking about a whole suite of protocols and not just one protocol, as most people think. Please see the Protocols section for more information.


Main Features

Here are the main features of TCP that we are going to analyse:
  • Reliable Transport
  • Connection-Oriented
  • Flow Control
  • Windowing
  • Acknowledgements
  • More overhead


Reliable Transport

It's a reliable transport because of the different techniques it uses to ensure that the data received is error free. TCP is a robust protocol used for file transfers where data errors are not an option. When you download a 50MB file from a website, you wouldn't want to find out after the download is complete that the file is corrupt! In reality this does still happen occasionally, which just goes to show that nothing is ever completely foolproof.
The picture below shows the TCP header within an Ethernet II frame. Right below it you will find our second diagram, which zooms in on the TCP header and displays the fields the protocol contains:
tcp-quick-overview-1

tcp-quick-overview-2
The diagram on the left shows the individual breakdown of each field within the TCP header along with its length in bits. Remember that 8 bits equal 1 byte.
The most popular fields within the TCP header are the Source Port, Destination Port and Code bits. These Code bits are also known as 'flags'.
The rest of the fields help make sure all TCP segments make it to their destination and are reassembled in the correct order, while at the same time providing a recovery mechanism should a few segments go missing and never reach their destination.
Keep in mind that in the pages to follow we will have a detailed look into each available field, for now we are just providing an overview of them.
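To make these fields a little more concrete, here is a minimal Python sketch (not part of the original article) that unpacks the fixed 20-byte portion of a TCP header, including the Code bits (flags) mentioned above. The raw bytes would typically come from a packet capture or a raw socket, and the layout follows the standard TCP header format:

    import struct

    def parse_tcp_header(raw):
        # !HHIIBBHHH = source port, destination port, sequence number,
        #              acknowledgment number, data offset/reserved,
        #              code bits (flags), window, checksum, urgent pointer
        (src, dst, seq, ack, offset_reserved,
         flags, window, checksum, urg) = struct.unpack('!HHIIBBHHH', raw[:20])

        return {
            'source_port': src,
            'destination_port': dst,
            'sequence_number': seq,
            'acknowledgement_number': ack,
            'header_length': (offset_reserved >> 4) * 4,   # data offset, in bytes
            'flags': {                                      # the "code bits"
                'URG': bool(flags & 0x20), 'ACK': bool(flags & 0x10),
                'PSH': bool(flags & 0x08), 'RST': bool(flags & 0x04),
                'SYN': bool(flags & 0x02), 'FIN': bool(flags & 0x01),
            },
            'window': window,
            'checksum': checksum,
            'urgent_pointer': urg,
        }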


Connection Oriented

What this basically means is that a connection is established between the two hosts, or rather the two computers, before any data is transferred. When the term "connection is established" is used, it means that both computers know about each other and have agreed on the exchange of data. This is where the famous 3-way handshake happens: the SYN and ACK bits found in the Code bits (flags) field are used to perform it, and it is this handshake that makes TCP connection oriented.
The following diagram explains the procedure of the 3-way handshake:
tcp-quick-overview-3
STEP 1: Host A sends the initial packet to Host B. This packet has the "SYN" bit enabled. Host B receives the packet and sees the "SYN" bit, which has a value of "1" (in binary, this means ON), so it knows that Host A is trying to establish a connection with it.
STEP 2: Assuming Host B has enough resources, it sends a packet back to Host A with both the "SYN" and "ACK" bits enabled (1). The SYN that Host B sends at this step means 'I want to synchronise with you' and the ACK means 'I acknowledge your previous SYN request'.
STEP 3: So... after all that, Host A sends another packet to Host B with the "ACK" bit set (1), effectively telling Host B 'Yes, I acknowledge your previous request'.
Once the 3-way handshake is complete, the connection is established (virtual circuit) and the data transfer begins.
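If you want to see the handshake from a programmer's point of view, here is a small, self-contained Python sketch (the address and port number are just examples) showing that the operating system performs the SYN, SYN-ACK and ACK exchange for us: it happens inside connect() on Host A's side and has completed by the time accept() returns on Host B's side.

    import socket, threading, time

    def host_b():
        # Host B listens; accept() returns only once the 3-way handshake is done.
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        server.bind(('127.0.0.1', 5000))        # port 5000 is an arbitrary example
        server.listen(1)
        conn, addr = server.accept()            # SYN received, SYN+ACK sent, ACK received
        print('Host B: connection established with', addr)
        conn.close()
        server.close()

    threading.Thread(target=host_b, daemon=True).start()
    time.sleep(0.2)                             # give the listener a moment to start

    # Host A initiates the handshake by calling connect().
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(('127.0.0.1', 5000))         # SYN -> SYN+ACK -> ACK happens here
    print('Host A: connected, data transfer can now begin')
    client.close()

Only after connect() returns does any application data flow, which matches the "virtual circuit first, data second" behaviour described above.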


Flow Control

Flow control is used to control the flow of data over the connection. If for any reason one of the two hosts is unable to keep up with the data transfer, it can send special signals to the other end, asking it to either stop or slow down so that it can keep up.
For example, if Host B was a webserver from which people could download games, then obviously Host A is not going to be the only computer downloading from this webserver, so Host B must regulate the data flow to every computer downloading from it. This means it might turn to Host A and tell it to wait for a while until more resources are available because it has another 20 users trying to download at the same time.
Below is a diagram that illustrates a simple flow control session between two hosts. At this point, we only need to understand the concept of flow control:
tcp-quick-overview-4
Generally speaking, when a machine receives a flood of data too quickly for it to process, it stores it in a memory section called a buffer. This buffering action solves the problem only if the data bursts are small and don't last long.
However, if the data burst continues it will eventually exhaust the memory of the receiving end, and the arriving data will be discarded. In this situation the receiving end simply issues a "Not ready" or "Stop" indicator to the sender, the source of the flood. After the receiver processes the data it has in its memory, it sends out a "Ready" or "Go" transport indicator; the sending machine receives the "Go" indicator and resumes its transmission.
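The following toy simulation in Python (not real TCP signalling, which advertises a window size rather than literal "Stop"/"Go" messages) illustrates the buffering idea: the receiver accepts segments until its buffer fills, tells the sender to stop, drains the buffer, and then lets the transfer continue.

    BUFFER_LIMIT = 5    # how many segments the receiver can hold at once

    class Receiver:
        def __init__(self):
            self.buffer = []

        def receive(self, segment):
            if len(self.buffer) >= BUFFER_LIMIT:
                return 'STOP'               # not ready: buffer is full
            self.buffer.append(segment)
            return 'GO'                     # ready for more

        def process(self):
            self.buffer.clear()             # pretend the application read the data

    receiver = Receiver()
    segment = 1
    while segment <= 12:
        if receiver.receive(segment) == 'STOP':
            print('Receiver not ready - sender waits')
            receiver.process()              # receiver catches up
            continue                        # sender retries the same segment
        print('Segment', segment, 'delivered')
        segment += 1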


Windowing

Data throughput, or transfer efficiency, would be low if the transmitting machine had to wait for an acknowledgment after sending each packet of data (the correct term is segment, as we will see on the next page). Because there is time available after the sender transmits a data segment and before it receives the acknowledgment from the receiving machine, the sender uses this gap to transmit more data. If we wanted to briefly define windowing, we could say that it is the number of data segments the transmitting machine is allowed to send without receiving an acknowledgment for them.
Windowing controls how much information is transferred from one end to the other. While some protocols quantify information by observing the number of packets, TCP/IP measures it by counting the number of bytes.
tcp-quick-overview-5
Let's explain what is happening in the above diagram.
Host B is sending data to Host A, using a window size equal to one. This means that Host B is expecting an "ACK" for each data segment it sends to Host A. Once the first data segment is sent, Host A receives it and sends an "ACK 2" to Host B. You might be wondering why "ACK 2" and not just "ACK"?
The "ACK 2" is translated by Host B to say: 'I acknowledge (ACK) the packet you just sent me and I am ready to receive the second (2) segment'. So Host B gets the second data segment ready and sends it off to Host A, expecting an "ACK 3" response from Host A so it can send the third data segment for which, as the picture shows, it receives the "ACK 3".
However, if it received an "ACK 2" again, this would mean something went wrong with the previous transmission and Host B will retransmit the lost segment. We will see how this works in the Acknowledgments section later on. Let's now try a different Window size to get a better understanding.. let's say 3!
Keep in mind the way the "ACK's" work, otherwise you might find the following example a bit confusing. If you can't understand it, read the previous example again where the Window size was equal to one.
tcp-quick-overview-6
In the above example, we have a window size equal to 3, which means that Host B can send 3 data segments to Host A before expecting an "ACK" back. Host B sends the first 3 segments (Send 1, Send 2 and Send 3), Host A receives them all in good condition and then sends the "ACK 4" to Host B. This means that Host A acknowledged the 3 data segments Host B sent and awaits the next data segments which, in this case, would be 4, 5 and 6.
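Here is a small Python sketch of the two scenarios above. It keeps the article's simplification of counting whole segments (remember, real TCP actually counts bytes) and shows how a single cumulative ACK covers the whole window:

    def transfer(total_segments, window_size):
        next_segment = 1
        while next_segment <= total_segments:
            window = list(range(next_segment,
                                min(next_segment + window_size, total_segments + 1)))
            print('Host B sends:', window)
            ack = window[-1] + 1        # receiver asks for the next segment it expects
            print('Host A replies: ACK', ack)
            next_segment = ack

    transfer(total_segments=6, window_size=1)   # an ACK after every segment
    transfer(total_segments=6, window_size=3)   # one ACK for every three segments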


Acknowledgments

Reliable data delivery ensures the integrity of a stream of data sent from one machine to the other through a fully functional data link. This guarantees the data won't be duplicated or lost. The method that achieves this is known as positive acknowledgment with retransmission. This technique requires a receiving machine to communicate with the transmitting source by sending an acknowledgment message back to the sender when it receives data. The sender documents each segment it sends and waits for this acknowledgment before sending the next segment. When it sends a segment, the transmitting machine starts a timer and retransmits if it expires before an acknowledgment is returned from the receiving end.
tcp-quick-overview-7
This figure shows how acknowledgments work. If you examine the diagram closely you will see that the window size of this transfer is equal to 3. At first, Host B sends 3 data segments to Host A and they are received in perfect condition so, based on what we learned, Host A sends an "ACK 4", acknowledging the 3 data segments and requesting the next 3, which will be 4, 5 and 6. Host B then sends data segments 4, 5 and 6, but segment 5 gets lost somewhere along the way and never reaches Host A. After a bit of waiting, Host A realises that 5 got lost and sends an "ACK 5" to Host B, indicating that it would like data segment 5 retransmitted. Now you see why this method is called "positive acknowledgment with retransmission".
At this point Host B sends data segment 5 and waits for Host A to send an "ACK" so it can continue sending the rest of the data. Host A receives the 5th data segment and sends "ACK 7" which means 'I received the previous data segment, now please send me the next 3'. The next step is not shown on the diagram but it would be Host B sending data segments 7, 8 and 9.
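The exchange in the diagram can be walked through with a short Python sketch (a toy model, not real TCP code): segment 5 is lost on the first attempt, the receiver keeps answering with ACK 5 until it arrives, and only then does the acknowledgment number move forward.

    def send_window(window, received, lose=None):
        for seg in window:
            if seg == lose:
                print('Segment', seg, 'lost in transit')
            else:
                received.add(seg)
                print('Segment', seg, 'delivered')
        # Cumulative ACK: the next segment the receiver is still waiting for.
        expected = 1
        while expected in received:
            expected += 1
        print('Host A replies: ACK', expected)

    received = set()
    send_window([1, 2, 3], received)            # all arrive        -> ACK 4
    send_window([4, 5, 6], received, lose=5)    # segment 5 is lost -> ACK 5
    send_window([5], received)                  # retransmission    -> ACK 7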


More Overhead

As you can see, there is quite a neat mechanism under the TCP hood that enables data to be transferred error free. All the features the protocol supports come at a price, and this is the overhead associated with TCP.
When we talk about overhead, we are referring to all the different fields contained within the TCP header and the error checking that takes place to ensure no portion of the data is corrupt. While for most this is a fair trade-off, some people simply can't spare the extra processing power, bandwidth and increased time that TCP transactions require. For this reason we have the alternative UDP protocol, which you can read about in the UDP protocol section.
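As a rough back-of-the-envelope illustration of that overhead (ignoring the IP and Ethernet headers as well as the acknowledgment traffic, which add even more), compare the 20-byte minimum TCP header with the 8-byte UDP header for a few payload sizes:

    TCP_HEADER = 20   # bytes, minimum header without options
    UDP_HEADER = 8    # bytes

    for payload in (10, 100, 1000):
        tcp = TCP_HEADER / (TCP_HEADER + payload) * 100
        udp = UDP_HEADER / (UDP_HEADER + payload) * 100
        print(f'{payload:>5}-byte payload: TCP header overhead {tcp:4.1f}%, UDP {udp:4.1f}%')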
At this point our quick overview of TCP has reached its conclusion. From the next page onwards we start to dive in deeper, so take a deep breath and jump right in!
 
