Project 4 - TCP part 1

Introduction

The aim of this project is to implement three fundamental parts of the transmission control protocol, TCP:

connection establishment,
interactive data flow, and
connection termination.

Most of the functionality in the previous projects is portable into this project but there is a difference in answer generation. The direct decode-answer chain in project 3 will be replaced by the more abstract mechanism of an answer-chain in this project. The answer-chain implies changes in the IP, LLC, and ethernet layers in order to correspond to the implementation in the TCP layer.

This project is complete when

it is possible to connect to the echo port, number 7, on the server with telnet,
characters typed in the telnet application are echoed back by the server,
it is possible to terminate the connection in an orderly manner,
no memory leaks have been introduced, and
several telnet sessions connected to the echo port are handled by the server.

There is some administration in order to obtain the source code. Refer to the suggested design of the class hierarchy as well as advice on how to approach the completion of the skeleton code. An executable which may be loaded into the ETRAX unit is compiled, linked, and loaded in the same manner as in the previous overview of the system. Finally, you will have to test your solution.

Sequence numbers and flags in TCP

The stream concept is the key in order to understand sequence and acknowledgement numbers in TCP. All data are treated as a stream of bytes. Thus, each byte is tagged by a sequence number and the concept of a packet does not apply to TCP. Instead, the concept of a segment is used which is a subset of the total stream of bytes. Every segment contains the sequence number of the first byte in the segment. The acknowledgement number is transmitted from the receiver of a segment and corresponds to the next sequence number the receiver expects to receive.

In addition to the stream, there are a number of flags in the TCP header. Each segment transmitted may imply an action or a state transition in the TCP state machine by setting certain flags. A few flags are also intended to be received by an application which uses the TCP connection. One or several flags may be turned on in each TCP header. The flags correspond to the following bits in the flag field of the header:

URG	ACK	PSH	RST	SYN	FIN
5	4	3	2	1	0

FIN, the sender has sent all data and is finished.
SYN, synchronise sequence numbers in order to initiate a connection.
RST, reset the connection.
PSH, the receiver should push the data to the application as soon as possible.
ACK, the acknowledgement number sent is the next sequence number the receiver is expecting.
URG, the segment contains urgent data but this flag will not be used in the project.
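
The bit positions in the table translate directly into bit masks. The following is a minimal sketch in standard C++11, intended for a workstation rather than the ETRAX toolchain; the namespace tcpflag and the helper hasFlags are illustrative names and are not part of the skeleton code.

```cpp
#include <cstdint>

// Bit masks for the six TCP flags, derived from the bit positions in the
// table above (FIN is bit 0, URG is bit 5). The names are illustrative.
namespace tcpflag {
constexpr std::uint8_t FIN = 1u << 0; // 0x01
constexpr std::uint8_t SYN = 1u << 1; // 0x02
constexpr std::uint8_t RST = 1u << 2; // 0x04
constexpr std::uint8_t PSH = 1u << 3; // 0x08
constexpr std::uint8_t ACK = 1u << 4; // 0x10
constexpr std::uint8_t URG = 1u << 5; // 0x20
}

// Returns true if every bit in mask is also set in flags.
constexpr bool hasFlags(std::uint8_t flags, std::uint8_t mask)
{
  return (flags & mask) == mask;
}
```

Combining SYN and ACK with a bitwise OR gives 0x12, and RST alone is 0x04, which are the literal values passed to sendFlags later in this text.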

The example below of connection establishment, data transfer, and connection termination shows that the SYN and FIN flags each consume a sequence number although the segments do not contain data. Since setting the ACK flag adds no overhead, it is sent together with the SYN and FIN flags in every segment except the very first SYN segment.
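
The relation between sequence and acknowledgement numbers can be sketched as a small function. This is an illustration under the assumptions stated in the comments, not code from the skeleton; the name nextExpected is hypothetical.

```cpp
#include <cstdint>

// Returns the acknowledgement number a receiver sends in reply to a
// segment, i.e. the sequence number of the next byte it expects.
// SYN (0x02) and FIN (0x01) each consume one sequence number even though
// they carry no data. Unsigned 32 bit arithmetic wraps around modulo 2^32
// in the same way TCP sequence numbers do.
std::uint32_t nextExpected(std::uint32_t sequenceNumber,
                           std::uint32_t dataLength,
                           std::uint8_t flags)
{
  std::uint32_t next = sequenceNumber + dataLength;
  if (flags & 0x02) next += 1; // SYN
  if (flags & 0x01) next += 1; // FIN
  return next;
}
```

A SYN segment with sequence number 1000 is thus acknowledged with 1001, and a following segment carrying five bytes of data from sequence number 1001 is acknowledged with 1006.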

Port numbers

Port numbers, [Stevens96] p.12, are an important concept in TCP. A connection is defined by port numbers together with IP addresses. In the figure, two clients have three different active HTTP connections. All connections are with respect to the server port number 80, the HTTP port. In addition, two of the connections are from the same IP address. However, since a running port number is allocated on the client side, each new session is uniquely defined by four parameters, the client IP address and the client port number together with the server IP address and the server port number.

Connection	Client IP address	Client port number	Server IP address	Server port number
1	194.47.61.9	2614	194.47.61.91	80
2	194.47.61.9	2615	194.47.61.91	80
3	194.47.61.14	3011	194.47.61.91	80
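
The four parameters can be collected in a key type so that each connection is stored exactly once. The sketch below uses standard C++11 containers on a workstation; the type ConnectionKey and the function tableConnections are hypothetical and do not appear in the skeleton code, and the addresses are kept as strings only for readability.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// The four parameters that together identify one TCP connection.
struct ConnectionKey {
  std::string clientAddress;
  std::uint16_t clientPort;
  std::string serverAddress;
  std::uint16_t serverPort;

  // Lexicographic ordering over all four fields, required by std::set.
  bool operator<(const ConnectionKey& other) const
  {
    return std::tie(clientAddress, clientPort, serverAddress, serverPort)
         < std::tie(other.clientAddress, other.clientPort,
                    other.serverAddress, other.serverPort);
  }
};

// Builds the set of connections listed in the table above.
std::set<ConnectionKey> tableConnections()
{
  std::set<ConnectionKey> connections;
  connections.insert(ConnectionKey{"194.47.61.9", 2614, "194.47.61.91", 80});
  connections.insert(ConnectionKey{"194.47.61.9", 2615, "194.47.61.91", 80});
  connections.insert(ConnectionKey{"194.47.61.14", 3011, "194.47.61.91", 80});
  return connections;
}
```

Connections 1 and 2 share three of the four parameters and are still distinct entries, since the client port numbers differ.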

Telnet

The telnet protocol forms the basis of almost all applications utilising the TCP stack. FTP, file transfer protocol, SMTP, simple mail transfer protocol, and HTTP, hypertext transfer protocol, all have the telnet protocol as a base. In principle, telnet provides a raw data channel between two computers where characters are transmitted without interpretation. The applications use the raw data channel in order to send commands formatted as text strings terminated with the ASCII sequence CR, carriage return, and LF, line feed. It is the responsibility of the applications to interpret the commands. For example, a log in sequence in FTP might look like

Client: create a connection to port number 21, the FTP port.
Server: replies 220 godzilla ftp server\r\n, where \r is the CR and \n the LF.
Client: 220 implies a connection is established, send USER richard\r\n.
Server: replies 331 Password required\r\n.
Client: 331 implies a password is necessary, send PASS myPassword\r\n. Note that the password the user entered in the ftp application is sent unencrypted by the application.

The commands USER and PASS are exchanged between the ftp applications in this example.
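
The text format used in the exchange above can be sketched with two small helpers. Both function names, formatCommand and replyCode, are hypothetical and serve only to illustrate the CR LF termination and the three digit reply code.

```cpp
#include <string>

// Formats a text command as it is sent over the raw data channel:
// the verb, a space, the argument, and the terminating CR LF pair.
std::string formatCommand(const std::string& verb,
                          const std::string& argument)
{
  return verb + " " + argument + "\r\n";
}

// Extracts the three digit reply code from a server reply line, or
// returns -1 if the line does not start with three digits.
int replyCode(const std::string& line)
{
  if (line.size() < 3) return -1;
  int code = 0;
  for (int i = 0; i < 3; ++i) {
    if (line[i] < '0' || line[i] > '9') return -1;
    code = code * 10 + (line[i] - '0');
  }
  return code;
}
```

With these helpers, the client in the log in sequence would read 220 from the first reply and answer with the string produced by formatCommand("USER", "richard").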

The telnet application may connect to other ports than the default port 23, which is the telnet port. The port number entered at the end of a telnet command becomes the port number the application tries to connect to. For example, the command telnet www.eit.lth.se 80 followed by the command GET /index.html provides an exercise in reading HTML code.

E-mail is sent by the SMTP protocol in the same manner. Thus, it is possible to be one's own mail server and send fake mail with the telnet application.

Details in the implementation

Interaction description

A code example of how a received synchronise segment with the SYN flag set is handled will show how the different instances interact. The code may be used as it is presented in the solution. The files tcp.[hh,cc] contain the design framework.

The decode method called from the underlying IP layer starts execution with an assumption of an established connection. A new connection is created if the assumption was wrong. When a new connection is created the present state in the state machine is LISTEN and the received segment must have the SYN flag set in order to initiate the synchronisation in the connection establishment. Thus, if the SYN flag is set the method Synchronize is executed. Make sure you understand which Synchronize method it is with respect to the present state. A segment without the SYN flag set in the LISTEN state implies an error and the response to an error in the present state is the Kill method.
void
TCPInPacket::decode()
{
// Extract the parameters from the TCP header which define the
// connection.
TCPConnection* aConnection =
         TCP::instance().getConnection(mySourceAddress,
                                       mySourcePort,
                                       myDestinationPort);
if (!aConnection)
{
    // Establish a new connection.
    aConnection =
         TCP::instance().createConnection(mySourceAddress,
                                          mySourcePort,
                                          myDestinationPort,
                                          this);
    if ((aTCPHeader->flags & 0x02) != 0)
    {
      // State LISTEN. Received a SYN flag.
      aConnection->Synchronize(mySequenceNumber);
    }
    else
    {
      // State LISTEN. No SYN flag. Impossible to continue.
      aConnection->Kill();
    }
}
else
{
    // Connection was established. Handle all states.
}
}

The Synchronize method in the TCPConnection is responsible for invoking the corresponding method in the present state.

void
TCPConnection::Synchronize(udword theSynchronizationNumber)
{
myState->Synchronize(this, theSynchronizationNumber);
}

When a new connection is created the present state in the state machine is LISTEN. Since only echoing of incoming data is to be implemented, a check is made of the port number. The connection variables are initialised if the required port corresponds to the ECHO port number, and a segment is sent in reply according to the state diagram. The connection then enters the new state SYN_RCVD in accordance with the state diagram.

The state variables of importance are receiveNext, sendNext and sentUnAcked. The next expected sequence number from the other host is denoted as receiveNext whereas the next sequence number to send is sendNext. The variable sentUnAcked holds the latest sequence number for which an acknowledgement has been received. Everything from sentUnAcked up to sendNext has been sent to, but not yet acknowledged by, the other host.

void
ListenState::Synchronize(TCPConnection* theConnection,
                         udword theSynchronizationNumber)
{
switch (theConnection->myPort)
{
   case 7:
     trace << "got SYN on ECHO port" << endl;
     theConnection->receiveNext = theSynchronizationNumber + 1;
     theConnection->receiveWindow = 8*1024;
     theConnection->sendNext = get_time();
     // Next reply to be sent.
     theConnection->sentUnAcked = theConnection->sendNext;
     // Send a segment with the SYN and ACK flags set.
     theConnection->myTCPSender->sendFlags(0x12);
     // Prepare for the next send operation.
     theConnection->sendNext += 1;
     // Change state
     theConnection->myState = SynRecvdState::instance();
     break;
   default:
     trace << "send RST..." << endl;
     theConnection->sendNext = 0;
     // Send a segment with the RST flag set.
     theConnection->myTCPSender->sendFlags(0x04);
     TCP::instance().deleteConnection(theConnection);
     break;
}
}
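
The relation between sendNext and sentUnAcked described above can be sketched in isolation. The struct SendState and the functions inFlight and acknowledge are hypothetical stand-ins for the corresponding members of TCPConnection, written in standard C++11 rather than with the project's own integer types.

```cpp
#include <cstdint>

// Simplified stand-in for the send-side state variables of a connection.
struct SendState {
  std::uint32_t sendNext;    // next sequence number to send
  std::uint32_t sentUnAcked; // latest sequence number acknowledged
};

// Number of sequence numbers sent but not yet acknowledged. The unsigned
// subtraction stays correct when the sequence numbers wrap around.
std::uint32_t inFlight(const SendState& s)
{
  return s.sendNext - s.sentUnAcked;
}

// Applies a received acknowledgement number. Returns false and leaves the
// state untouched if the number does not acknowledge anything new or lies
// beyond what has been sent.
bool acknowledge(SendState& s, std::uint32_t ackNumber)
{
  std::uint32_t advanced = ackNumber - s.sentUnAcked;
  if (advanced == 0 || advanced > inFlight(s)) return false;
  s.sentUnAcked = ackNumber;
  return true;
}
```

After the SYN and ACK reply in the LISTEN state, sendNext is one ahead of sentUnAcked, so exactly one sequence number is in flight until the other host acknowledges it.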
In the method sendFlags, a decision must be taken with respect to the dynamic memory allocation strategy in the solution. The details on how to create a TCP segment are left as an exercise. At the end of the method, the TCPSender invokes myAnswerChain->answer(...) in order to pass the TCP segment to the underlying IP layer.
void
TCPSender::sendFlags(byte theFlags)
{
// Decide on the value of the length totalSegmentLength.
// Allocate a TCP segment.
byte* anAnswer = new byte[totalSegmentLength];
// Calculate the pseudo header checksum
TCPPseudoHeader* aPseudoHeader =
    new TCPPseudoHeader(myConnection->hisAddress,
                        totalSegmentLength);
uword pseudosum = aPseudoHeader->checksum();
delete aPseudoHeader;
// Create the TCP segment.
// Calculate the final checksum.
aTCPHeader->checksum = calculateChecksum(anAnswer,
                                           totalSegmentLength,
                                           pseudosum);
// Send the TCP segment.
myAnswerChain->answer(anAnswer,
                        totalSegmentLength);
// Deallocate the dynamic memory. An array allocated with new[]
// must be released with delete[].
delete[] anAnswer;
}

The TCPSender in detail

The class TCPSender implements two methods

sendFlags, which sends a segment that contains a TCP header only with flags set, and
sendData, which sends a segment which may contain flag information in addition to the data in the segment.

The previous design, where each packet received was answered with a single reply, does not apply to a TCP connection since a connection lives longer than one exchange of packets. Instead, a design is required where several segments may be sent which are unrelated to any received segment. Thus, all information about the established connection must be remembered by the TCPSender instance.

The proposed solution in this project is to create a chain of all layers in the initial ethernet frame received when the connection is established. The variable myAnswerChain in the TCPSender will be a reference to the IP layer which is responsible for the creation of the TCPInPacket instance. In practice, the class TCPInPacket contains the method

InPacket*
TCPInPacket::copyAnswerChain()
{
return myFrame->copyAnswerChain();
}

which returns the result of the method copyAnswerChain() in the IPInPacket class, and this is what myAnswerChain in the TCPSender refers to. Thus, the new design requires a method called copyAnswerChain() in each layer, which must be declared and implemented. Start by adding the declaration

InPacket* copyAnswerChain();

in the public section of the classes IPInPacket (ip.hh), LLCInPacket (llc.hh), and EthernetInPacket (ethernet.hh). Continue with the implementation in the IP layer (ip.cc)

InPacket*
IPInPacket::copyAnswerChain()
{
IPInPacket* anAnswerPacket = new IPInPacket(*this);
anAnswerPacket->setNewFrame(myFrame->copyAnswerChain());
return anAnswerPacket;
}

and the link layer (llc.cc)

InPacket*
LLCInPacket::copyAnswerChain()
{
LLCInPacket* anAnswerPacket = new LLCInPacket(*this);
anAnswerPacket->setNewFrame(myFrame->copyAnswerChain());
return anAnswerPacket;
}

and finally in the ethernet layer (ethernet.cc)

InPacket*
EthernetInPacket::copyAnswerChain()
{
EthernetInPacket* anAnswerPacket = new EthernetInPacket(*this);
anAnswerPacket->setNewFrame(NULL);
return anAnswerPacket;
}
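
The effect of the three methods above is a deep copy of the whole chain of layers, which outlives the frame it was copied from. The sketch below shows the idea with a single simplified class; Layer and chainDepth are hypothetical, and the assumption that each layer owns and deletes the layer below it belongs to this sketch, not necessarily to the InPacket classes.

```cpp
#include <cstddef>

// Simplified stand-in for the InPacket hierarchy. Each layer holds a
// pointer to the layer below it and can clone itself together with
// everything underneath.
class Layer {
public:
  Layer(const char* name, Layer* below) : myName(name), myFrame(below) {}
  Layer(const Layer&) = delete;            // copies go through
  Layer& operator=(const Layer&) = delete; // copyAnswerChain only
  ~Layer() { delete myFrame; }             // assumption: owns lower layer

  // Returns a newly allocated copy of this layer and all layers below.
  Layer* copyAnswerChain() const
  {
    Layer* lowerCopy = myFrame ? myFrame->copyAnswerChain() : nullptr;
    return new Layer(myName, lowerCopy);
  }

  const char* name() const { return myName; }
  const Layer* below() const { return myFrame; }

private:
  const char* myName;
  Layer* myFrame;
};

// Counts the layers in a chain, starting from the top layer.
std::size_t chainDepth(const Layer* top)
{
  std::size_t depth = 0;
  for (const Layer* l = top; l != nullptr; l = l->below()) ++depth;
  return depth;
}
```

A chain built as IP over LLC over ethernet has depth three, and so does its copy, which remains valid after the original chain has been deleted. Whoever keeps the copy is responsible for deleting it, which matters for the memory leak requirement.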

The pseudo checksum

The checksum for each TCP segment to be transmitted must be calculated. A pseudo header is added to the ordinary TCP header in the checksum calculation. The description in [Stevens96] on p.144 and p.227 is not detailed enough, whereas RFC793 provides all the necessary information with respect to the pseudo header.

Source IP address (32 bit)
Destination IP address (32 bit)
0x00 (8 bit)	Protocol (8 bit)	Total length (16 bit)

The protocol field is set to the protocol number of TCP, which is 6. The total length field must be set to the TCP header length plus the data length in bytes; it does not include the 12 bytes of the pseudo header.
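
The calculation over the pseudo header and the segment can be sketched as follows. This is a workstation illustration in standard C++11, not the calculateChecksum function of the project; the names addWords, finish and tcpChecksum are hypothetical. The sketch returns the checksum as an ordinary integer and leaves it to the caller to store it in the header with the high byte first.

```cpp
#include <cstddef>
#include <cstdint>

// Adds a byte sequence to a running sum, treating the bytes as big endian
// 16 bit words. An odd trailing byte is padded with a zero byte.
std::uint32_t addWords(std::uint32_t sum, const std::uint8_t* data,
                       std::size_t length)
{
  for (std::size_t i = 0; i + 1 < length; i += 2)
    sum += (std::uint32_t(data[i]) << 8) | data[i + 1];
  if (length % 2 != 0)
    sum += std::uint32_t(data[length - 1]) << 8;
  return sum;
}

// Folds the carries back into the low 16 bits and returns the ones'
// complement of the result.
std::uint16_t finish(std::uint32_t sum)
{
  while (sum >> 16) sum = (sum & 0xFFFF) + (sum >> 16);
  return std::uint16_t(~sum & 0xFFFF);
}

// Checksum over the 12 byte pseudo header followed by the TCP segment.
// The addresses are given as four bytes each in network byte order.
std::uint16_t tcpChecksum(const std::uint8_t source[4],
                          const std::uint8_t destination[4],
                          const std::uint8_t* segment,
                          std::uint16_t segmentLength)
{
  const std::uint8_t pseudoTail[4] = {
    0x00, 0x06, // zero byte and the protocol number of TCP
    std::uint8_t(segmentLength >> 8), std::uint8_t(segmentLength & 0xFF)
  };
  std::uint32_t sum = 0;
  sum = addWords(sum, source, 4);
  sum = addWords(sum, destination, 4);
  sum = addWords(sum, pseudoTail, 4);
  sum = addWords(sum, segment, segmentLength);
  return finish(sum);
}
```

A useful check when testing: if the checksum field is zero while the checksum is calculated and the result is then written into the field, a second calculation over the same segment yields zero.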

Sending data segments

A TCP segment which contains data must always be transmitted with both the PSH and ACK flags set.

Memory leaks

A check of memory leaks (dynamically allocated memory not returned to the system) may be accomplished by a printout in the frontpanel class for example. Every time a LED blinks the total amount of memory left in the system may be printed with the statement

cout << "Core " << ax_coreleft_total() << endl;

Big and little endian revisited

Remember, all protocols in TCP/IP are big endian whereas the ETRAX architecture is little endian. Thus, the bytes in an integer are stored in the opposite order. The macro HILO translates between the two endian formats. For example,

// 16 bit integers
uword realPacketLength = HILO(anIPHeader->totalLength);
anIPHeader->totalLength = HILO(realPacketLength);

Only integer fields are affected, since only integers are interpreted in the sense of big or little endian. Other fields, e.g. the IP address and the ethernet address, are interpreted without regard to the byte order. The macro HILO handles 16 bit integers and the corresponding macro handling 32 bit integers is called LHILO. For example,

// 32 bit integers
udword sequence = LHILO(aTCPHeader->sequenceNumber);
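
What the two macros do can be sketched with ordinary functions. The functions swap16 and swap32 below are hypothetical stand-ins written in standard C++11; the actual definitions of HILO and LHILO in the project may look different.

```cpp
#include <cstdint>

// Swaps the two bytes of a 16 bit integer, as HILO is described to do.
constexpr std::uint16_t swap16(std::uint16_t v)
{
  return std::uint16_t((v << 8) | (v >> 8));
}

// Reverses the four bytes of a 32 bit integer, as LHILO is described to do.
constexpr std::uint32_t swap32(std::uint32_t v)
{
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
```

Applying the same swap twice gives back the original value, which is why one macro serves for both reading a field from a header and writing a field into it.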

No rule without an exception. The function calculateChecksum returns a result which is big endian, according to the network byte order, and should not be altered.

aTCPHeader->checksum =
   calculateChecksum((byte*)aTCPHeader,
                     totalSegmentLength,
                     pseudosum);

Source code, compilation, linking and loading

Make sure your present working directory is ~/kurs/src. Remove the subfolder lab4 if it exists in ~/kurs/src. Copy all files from your solution in project 3 with the command

cp -r lab3 lab4

and change your present working directory into ~/kurs/src/lab4. This project will be an extension of your solution in project 3. Add the skeleton of project 4 to your previous files with the command

cp -r ~inin/kurs/src/lab4/* .

There should be two files in addition to the ones from project 3 in the directory lab4,

tcp.hh, a declaration of the classes in the TCP design, and
tcp.cc, a framework for the partial implementation of the TCP layer.

The target of the make process is defined in the file /kurs/make/lab4/modules. Make sure your present working directory is ~/kurs/make and type the commands:

genmake -no-lint-files lab4, which creates a new makefile for the new target source code,
axmake -clean lab4, which removes all old object files created in previous compilations (very important as the make application could use outdated object files),
axmake lab4, which compiles and links the new target source code.

Testing the solution

The tools in this project are

the program telnet, which is to be executed on linus and on one of the Windows machines, and
the network monitor executing on the Windows workstation.

The telnet program is used in order to verify the solution with the command

telnet <ETRAX IP-address> 7

where 7 corresponds to the ECHO port number.