The specifications (as we know them today) for the Internet Protocol (known as IP) were set down in January 1980 by Jon Postel and several others in the document known as RFC760. IP was based upon earlier protocols for the ARPANET and drew heavily from their successes.
The motivation for IP comes from the creation of interconnected packet-switching communications networks, which were called at the time a "catenet", a term used in a July 1978 paper from the Defense Advanced Research Projects Agency. With IP, blocks of data (known as datagrams) are sent from one machine to another, and each machine is identified by an IP address. One of the key points here is that the IP address is a fixed size (4 bytes, properly called octets). Another significant advance was the ability for datagrams to be fragmented so that they can be sent through networks that only allow small packets.
At the IP level, things are simple - IP implements two features:
- Addressing
- Within the Internet, there are three concepts: names (what is sought),
addresses (where what is sought is found), and routes (how to get to where
what is sought is). Internet Protocol deals with the addresses. These
addresses are a fixed length of 32 bits. A single host may have
multiple addresses (which is quite common today).
- Fragmentation
- Fragmentation occurs when datagrams are sent from a network that allows
large packet sizes to a network that requires smaller sizes. A datagram can be marked "don't fragment", in which case it will never be fragmented. However, if it reaches a network whose packet size it exceeds, the unfragmentable datagram will be discarded.
Fragmentation allows a datagram to be broken into an arbitrary number
of smaller datagrams that can be reassembled at a later point. To do this,
the identification field of the header is used to mark the fragments and
make certain that they are not reassembled with the wrong datagram.
When fragmentation occurs, new Internet datagrams are created, each with a copy of the original Internet header. The data is split on 8-octet boundaries and attached to the new headers. Several values are set for each fragment:
- NFB (Number of Fragment Blocks) - how many 8-octet blocks of data does this fragment carry? (This is not a header field; it is used to work out the offset of the next fragment.)
- more-fragments - are there any fragments after this?
- length - the total length of this fragment (header plus data)
- offset - how far from the start is this data, in 8-octet units (the first fragment is '0')
For fragments to be reassembled into one datagram, they must have the same identification,
source, destination, and protocol.
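The splitting and reassembly described above can be sketched in a few lines of Python. This is a minimal illustration, not a real IP implementation: the `Fragment` class and the `fragment` and `reassemble` functions are hypothetical names, the header is reduced to the values that matter here, and reassembly matches on the identification alone (a real module would also compare source, destination, and protocol).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    ident: int            # identification, copied from the original datagram
    offset: int           # fragment offset, counted in 8-octet units
    more_fragments: bool  # True on every fragment except the last
    data: bytes


def fragment(ident: int, data: bytes, max_data: int) -> List[Fragment]:
    """Split data into fragments carrying at most max_data octets each."""
    # Every fragment except the last must carry a multiple of 8 octets,
    # because offsets are counted in 8-octet blocks.
    block = (max_data // 8) * 8
    if block == 0:
        raise ValueError("max_data must be at least 8 octets")
    frags = []
    for start in range(0, len(data), block):
        chunk = data[start:start + block]
        is_last = start + block >= len(data)
        frags.append(Fragment(ident, start // 8, not is_last, chunk))
    return frags


def reassemble(frags: List[Fragment]) -> bytes:
    """Rebuild the original data from fragments sharing one identification."""
    if len({f.ident for f in frags}) != 1:
        raise ValueError("fragments belong to different datagrams")
    # Fragments may arrive in any order; the offset puts them back in place.
    ordered = sorted(frags, key=lambda f: f.offset)
    return b"".join(f.data for f in ordered)
```

With 100 octets of data and room for 28 octets per fragment, `fragment` carries 24 octets in each fragment (the largest multiple of 8 that fits), giving five fragments at offsets 0, 3, 6, 9 and 12.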
The IP level does not deal with data reliability, flow control, sequencing and other such features. These are left to other protocols that are implemented on top of IP (such as TCP; UDP, also built on IP, leaves them to the application).
Each packet within the Internet Protocol is independent and unrelated
to any other packet that may be out there on the network or yet to
be sent.
There are 4 parts to the IP service:
- Type of Service
- Time to Live
- Options
- Header Checksum
- Type of Service
- This is a selector of the quality of service. The example given
in the RFC provides "Interactive", "Bulk", and "Real Time". This feature
is used by the gateways and routers from network to network to select
the parameters for transmission on the network. More on this below.
- Time to Live
- Often abbreviated TTL, the time to live is set by the sender. As
the datagram proceeds through the network, each place it is processed
re-evaluates the time to live. If the time to live reaches zero before
the packet gets to the destination, it is destroyed.
While this may come
as a surprise to some, it makes perfect sense. Consider playing a network
game, such as Diablo. If for some reason some packets get delayed
on the way from the server to your machine, it would be unfortunate for them to
catch up later, when they are no longer needed. While the program could
probably deal with them (I'm on update #100, this packet is update #70,
thus I will discard it), data that only makes sense if it arrives within
5 seconds just clutters the network once those 5 seconds have gone by,
so it should be thrown away.
- Options
- These are control functions that are useful in some situations but are
not needed for most communication. The options include
- Timestamps
- Error reports
- Special Routing
- Header Checksum
- This checksum acts as a verification that the header of the datagram has been
transmitted correctly. The checksum is a simple "Yes/No" indicator of whether any errors have
occurred in the header during transmission. It does not provide for correcting those
errors, and it does not cover the data. At any point where the checksum fails to match the header, the
packet is discarded.
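The Time to Live rule can be illustrated with a short Python sketch. It assumes, for simplicity, that every gateway subtracts exactly one unit; how much is actually subtracted depends on how long a gateway holds the datagram. The function name `gateways_passed` and the gateway labels are hypothetical.

```python
def gateways_passed(ttl, path):
    """Return the gateways that forward a datagram before its TTL runs out."""
    passed = []
    for gateway in path:
        ttl -= 1          # assumption: each gateway subtracts one unit
        if ttl <= 0:
            break         # time to live has run out: the datagram is destroyed here
        passed.append(gateway)
    return passed
```

A datagram sent with a time to live of 3 along five gateways is forwarded by the first two and destroyed at the third.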
Here is an example of how IP works. Picture two hosts, each on a separate
Local Area Network, connected via the Internet.
- The sender application prepares its data and calls on its network
code to send a datagram to the receiver.
- The network library looks at the destination for the datagram. It
prepares an IP header and attaches the data to it.
- Because these machines are on separate networks, this IP packet must
be sent to the gateway via the LAN (this could be Novell, Token Ring,
AppleTalk, what have you...). The IP datagram is attached as data to a
local network header. The resulting packet is sent out on the
local network.
- The datagram arrives at the gateway. The gateway unwraps the local
network header from the IP datagram. The gateway then once again
determines the Internet address that it should be sent to and sends
the datagram to the second gateway.
- Gateway to gateway, this is repeated until the datagram reaches
the last gateway which is connected to the LAN of the destination machine.
- The gateway examines the header, determines the local network address for the destination machine, creates the local network header and sends the datagram onto the local network.
- The destination machine receives the datagram, unwraps the local net header, examines the IP header and passes the data to the appropriate application along with other information such as the source address.
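The wrapping and unwrapping in the steps above can be modelled with a small Python sketch. All names here (`IPDatagram`, `LocalFrame`, `deliver`, and the addresses in the usage note) are hypothetical and chosen for illustration only; real local network headers carry far more than a destination.

```python
from dataclasses import dataclass


@dataclass
class IPDatagram:
    src: str     # Internet address of the sender
    dst: str     # Internet address of the receiver
    data: bytes


@dataclass
class LocalFrame:
    local_dst: str       # local network address of the next stop
    payload: IPDatagram  # the IP datagram rides along as plain data


def send_on_lan(datagram, local_dst):
    """Attach the IP datagram as data to a local network header."""
    return LocalFrame(local_dst, datagram)


def receive_from_lan(frame):
    """Unwrap the local network header, leaving the IP datagram."""
    return frame.payload


def deliver(datagram, path):
    """Carry a datagram across each local network stop listed in path.

    Returns the datagram as finally received and the local addresses used.
    """
    used = []
    for local_dst in path:
        frame = send_on_lan(datagram, local_dst)  # wrap for this network
        used.append(frame.local_dst)
        datagram = receive_from_lan(frame)        # unwrap at the next stop
    return datagram, used
```

For example, `deliver(IPDatagram("10.0.0.1", "192.0.2.7", b"hello"), ["gw-a", "gw-b", "host-b"])` hands back the same datagram untouched: the local headers change at every stop, while the IP addresses stay the same from end to end.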
The Internet header is thus:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- Version - 4 bits
- What version of the Internet Protocol is this? The above describes IPv4 (thus the number is 4). There is work being done on IPv6, though it is not accepted universally yet.
- IHL - 4 bits
- Internet Header Length in 32-bit words. In the above diagram, each
32-bit word is one line. For IPv4, the correct minimum value is '5'.
- Type of Service - 8 bits
- This information shows the quality of service and type of service provided. While this is not standardized, some networks do regard different types of service in different ways (voice data being more important than routine data for example). This field can be expanded to:
   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|   PRECEDENCE    | STRM|RELIABILITY| S/R |SPEED|
+-----+-----+-----+-----+-----+-----+-----+-----+
- Precedence
- 111 - Flash Override
- 110 - Flash
- 10X - Immediate
- 01X - Priority
- 00X - Routine
- STRM (Streaming) - This is used to indicate if the gateways should
expect more packets from this source to this destination at regular
and frequent intervals.
- 1 - Streaming Data
- 0 - Datagram
- Reliability
- 11 - Highest
- 10 - Higher
- 01 - Lower
- 00 - Lowest
- Speed or Reliability? If there is a choice between speed and
reliability and a conflict between them - which is more important?
- 1 - Speed
- 0 - Reliability
- Speed
- 1 - High
- 0 - Low
Some examples include:
- Telnet - Streaming, Normal Reliability, S/R: speed, Fast
- FTP - Streaming, Normal Reliability, S/R: reliable, Normal
- Speech - Streaming, Least Reliability, S/R: speed, ASAP
- Total Length - 16 bits
- The total length of the datagram in octets (bytes), including the header. Because the field is 16 bits wide, the maximum length of a datagram is 65,535 octets (just under 64 kB). It should be noted that packets of maximum length are rather impractical for most networks. The recommendation (from 1980) is to send packets no larger than 576 octets unless there is a guarantee that the destination network and host are able to handle larger packets. The number 576 comes from a 512-byte block of data plus 64 bytes for the header.
- Identification - 16 bits
- This is set by the sender to allow the fragments of
a datagram to be reassembled.
- Flags - 3 bits
- Bit 0: Reserved, must be 0.
- Bit 1: Don't Fragment
- Bit 2: More Fragments
- Fragment Offset - 13 bits
- This indicates where this fragment belongs in the original
datagram. It is measured in units of 8 octets. The first
fragment of a fragmented datagram has an offset of zero.
- Time to Live (TTL) - 8 bits
- The time to live for a datagram in the network. This value is
measured in seconds. This provides a means for discarding undeliverable
packets, along with those carrying time-sensitive information
that has lost its meaning. (This field worries me if Star Trek-like
transporters ever work via IP.)
- Protocol - 8 bits
- This identifies the next-level protocol, that is, the protocol carried
in the data portion of the datagram. It would be
a waste of space to list all 256 possibilities. The two most important
ones are TCP (decimal 6) and UDP (decimal 17). The current
listing of the assigned protocols can be found in RFC1700.
- Header Checksum - 16 bits
- A checksum of the header only. Because some fields may change
(such as the Time to Live) as the datagram moves from gateway to gateway, this checksum needs to be recomputed at each point where the header is processed. For the purposes of calculating the checksum, the checksum field itself is considered to be zero.
- Source Address - 32 bits
- The IP address of the host that sent the datagram
- Destination Address - 32 bits
- The IP address of the host that the datagram is intended for
- Options - variable
- The options field allows many kinds of information to be added. The most amusing of these is "Security", which lets the DoD mark information as "top secret" as it passes through the network, in the belief that other machines will honor the marking. Of more practical use are the ability to suggest and record routing information, identification of streaming data, returning of error codes, and time stamps.
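To make the field layout concrete, here is a hedged Python sketch that packs and unpacks a 20-octet header with no options (IHL of 5) using the standard `struct` module. The function names `build_header`, `parse_header`, and `checksum` are hypothetical, and the default values for time to live and protocol are arbitrary. The checksum rule used (the 16-bit one's complement of the one's complement sum of the header's 16-bit words) comes from the RFC rather than from the text above.

```python
import socket
import struct

# Version/IHL, Type of Service, Total Length, Identification,
# Flags/Fragment Offset, Time to Live, Protocol, Header Checksum,
# Source Address, Destination Address (network byte order, 20 octets).
HEADER_FORMAT = "!BBHHHBBH4s4s"


def checksum(header):
    """One's complement of the one's complement sum of the 16-bit words."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                      # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src, dst, payload_len, ttl=64, protocol=6, ident=0):
    """Pack a header without options; the checksum field starts as zero."""
    fields = [(4 << 4) | 5, 0, 20 + payload_len, ident, 0, ttl, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)


def parse_header(raw):
    """Unpack the first 20 octets of a datagram into named fields."""
    (version_ihl, tos, total_length, ident, flags_offset,
     ttl, protocol, _header_checksum, src, dst) = struct.unpack(
        HEADER_FORMAT, raw[:20])
    return {
        "version": version_ihl >> 4,
        "ihl": version_ihl & 0x0F,
        "tos": tos,
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_offset & 0x4000),   # flag bit 1
        "more_fragments": bool(flags_offset & 0x2000),  # flag bit 2
        "fragment_offset": flags_offset & 0x1FFF,       # 8-octet units
        "ttl": ttl,
        "protocol": protocol,
        # Summing a header that already holds a valid checksum yields zero.
        "checksum_ok": checksum(raw[:20]) == 0,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }
```

Changing any field after the header is built, for instance lowering the time to live without recomputing the checksum, makes `checksum_ok` come back false, which is why each gateway has to recompute it.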
For more information on IP see http://www.faqs.org/rfcs/rfc760.html
and any other RFCs that may follow it. Much of the information above can be found stated in more concise and technical terms within these documents. Diagrams were copied from the above URL.