Category: System Operations and Maintenance
2012-05-08 11:15:54
IP is the workhorse protocol of the TCP/IP protocol suite. All TCP, UDP,
ICMP, and IGMP data gets transmitted as IP datagrams (Figure 1.4). A
fact that amazes many newcomers to TCP/IP, especially those from an
X.25 or SNA background, is that IP provides an unreliable,
connectionless datagram delivery service.
By unreliable we mean
there are no guarantees that an IP datagram successfully gets to its
destination. IP provides a best effort service. When something goes
wrong, such as a router temporarily running out of buffers, IP has a
simple error handling algorithm: throw away the datagram and try to send
an ICMP message back to the source. Any required reliability must be
provided by the upper layers (e.g., TCP).
The term connectionless means that IP does not maintain any state information about successive datagrams. Each
datagram is handled independently from all other datagrams. This also means that IP datagrams can get delivered out of
order. If a source sends two consecutive datagrams (first A, then B) to the same destination, each is routed independently
and can take different routes, with B arriving before A.
IP Header
The format of an IP datagram:
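 0                                 15 16                                   31
+---------+---------+----------------+--------------------------------------+
| 4-bit   | 4-bit   | 8-bit type of  | 16-bit total length (in bytes)       |
| version | header  | service (TOS)  |                                      |
|         | length  |                |                                      |
+---------+---------+----------------+----------+---------------------------+
| 16-bit identification              | 3-bit    | 13-bit fragment offset    |
|                                    | flags    |                           |
+-------------------+----------------+----------+---------------------------+
| 8-bit time to     | 8-bit protocol | 16-bit header checksum               |
| live (TTL)        |                |                                      |
+-------------------+----------------+--------------------------------------+
| 32-bit source IP address                                                  |
+---------------------------------------------------------------------------+
| 32-bit destination IP address                                             |
+---------------------------------------------------------------------------+
| options (if any)                                                          |
+---------------------------------------------------------------------------+
| data                                                                      |
+---------------------------------------------------------------------------+
The first five 32-bit rows, through the destination IP address, make up the 20-byte header.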
The normal size of the IP header is 20 bytes, unless options are present.
The
4 bytes in the 32-bit value are transmitted in the order: bits 0-7
first, then bits 8-15, then 16-23, and bits 24-31 last. This is called
big endian byte ordering, which is the byte ordering required for all
binary integers in the TCP/IP headers as they traverse a network. This
is called the network byte order. Machines that store binary integers in
other formats, such as the little endian format, must convert the
header values into the network byte order before transmitting the data.
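The byte-ordering rule above can be illustrated with a short sketch. It assumes Python's standard struct module, and the 32-bit value is an arbitrary example, not something taken from a real header.

```python
import struct
import sys

value = 0x0A0B0C0D  # arbitrary 32-bit example value (an assumption)

# "!" asks struct for network (big endian) byte order,
# "<" asks for little endian byte order.
network_bytes = struct.pack("!I", value)
little_bytes = struct.pack("<I", value)

print("network order:", network_bytes.hex())  # most significant byte first
print("little endian:", little_bytes.hex())   # least significant byte first
print("this machine stores integers as:", sys.byteorder)

# A receiver converts back by unpacking with the same "!" prefix.
assert struct.unpack("!I", network_bytes)[0] == value
```

Using the explicit "!" prefix means the same code produces the same bytes on the wire regardless of the host's native format.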
The current protocol version is 4, so IP is sometimes called IPv4.
The
header length is the number of 32-bit words in the header, including
any options. Since this is a 4-bit field, it limits the header to 60
bytes. The normal value of this field (when no options are present) is
5.
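As a small illustration of the two 4-bit fields, the following sketch splits the first header byte into the version and the header length. The function name is made up for this example.

```python
from typing import Tuple


def split_version_and_header_length(first_byte: int) -> Tuple[int, int, int]:
    """Return (version, header length in 32-bit words, header length in bytes)."""
    version = first_byte >> 4        # upper 4 bits
    length_words = first_byte & 0x0F  # lower 4 bits
    return version, length_words, length_words * 4


# 0x45 is the usual first byte: version 4, 5 words, no options.
print(split_version_and_header_length(0x45))
# The largest value a 4-bit field can hold is 15 words, i.e. 60 bytes.
print(split_version_and_header_length(0x4F))
```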
The type-of-service field (TOS) is composed of a 3-bit
precedence field (which is ignored today), 4 TOS bits, and an unused bit
that must be 0. The 4 TOS bits are: minimize delay, maximize
throughput, maximize reliability, and minimize monetary cost. Only 1 of
these 4 bits can be turned on. If all 4 bits are 0 it implies normal
service.
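A rough sketch of decoding the TOS byte follows. It assumes the fields are laid out in the order just listed (precedence in the top 3 bits, then the 4 TOS bits, then the unused low bit), which gives the masks 0x10, 0x08, 0x04, and 0x02. The function and table names are hypothetical.

```python
from typing import Tuple

# Masks for the 4 TOS bits, assuming precedence occupies the top 3 bits
# and the unused bit is the lowest bit of the byte.
TOS_NAMES = {
    0x10: "minimize delay",
    0x08: "maximize throughput",
    0x04: "maximize reliability",
    0x02: "minimize monetary cost",
}


def decode_tos(tos_byte: int) -> Tuple[int, str]:
    """Return (precedence, description of the requested service)."""
    precedence = tos_byte >> 5
    tos_bits = tos_byte & 0x1E
    if tos_bits == 0:
        return precedence, "normal service"
    if tos_bits in TOS_NAMES:
        return precedence, TOS_NAMES[tos_bits]
    # Only one of the four bits may be turned on.
    return precedence, "invalid (more than one TOS bit set)"


print(decode_tos(0x10))
print(decode_tos(0x00))
```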
The
interactive login applications, Telnet and Rlogin, want a minimum delay
since they're used interactively by a human for small amounts of data
transfer. File transfer by FTP, on the other hand, wants maximum
throughput. Maximum reliability is specified for network management
(SNMP) and the routing protocols. Usenet news (NNTP) is the only one
of these applications that wants to minimize monetary cost.
The TOS feature is
not supported by most TCP/IP implementations today, though newer systems
starting with 4.3BSD Reno are setting it. Additionally, new routing
protocols such as OSPF and IS-IS are capable of making routing decisions
based on this field.
The total length field is the total length of the IP datagram in bytes. Using this field and the header length field, we know where the data portion of the IP datagram starts, and its length. Since this is a 16-bit field, the maximum size of an IP datagram is 65535 bytes. This field also changes when a datagram is fragmented.
Although
it's possible to send a 65535-byte IP datagram, most link layers will
fragment this. Furthermore, a host is not required to receive a datagram
larger than 576 bytes. TCP divides the user's data into pieces, so this
limit normally doesn't affect TCP. With UDP numerous applications (RIP,
TFTP, BOOTP, the DNS, and SNMP) limit themselves to 512 bytes of user
data, to stay below this 576-byte limit. Realistically, however, most
implementations today (especially those that support the Network File
System, NFS) allow for just over 8192-byte IP datagrams.
The
total length field is required in the IP header since some data links
(e.g., Ethernet) pad small frames to be a minimum length. Even though
the minimum Ethernet frame payload is 46 bytes, an IP datagram can be
smaller. If the total length field weren't provided, the IP layer
wouldn't know how much of a 46-byte Ethernet payload was really an IP
datagram.
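The following sketch shows how the header length and total length fields together locate the data and strip link-layer padding. The 28-byte datagram is a made-up example (a zero-filled header apart from the two length fields), and the function name is hypothetical.

```python
import struct
from typing import Tuple


def split_padded_payload(payload: bytes) -> Tuple[bytes, bytes]:
    """Return (IP header, IP data), ignoring any trailing link-layer padding."""
    header_len = (payload[0] & 0x0F) * 4                   # header length in bytes
    total_length = struct.unpack("!H", payload[2:4])[0]    # bytes 2-3 of the header
    datagram = payload[:total_length]                      # drop the padding
    return datagram[:header_len], datagram[header_len:]


# Hypothetical 28-byte datagram: 20-byte header plus 8 bytes of data.
header = bytes([0x45, 0x00]) + struct.pack("!H", 28) + bytes(16)
data = b"ABCDEFGH"
padded = header + data + bytes(46 - 28)  # padded up to a 46-byte payload

ip_header, ip_data = split_padded_payload(padded)
print(len(padded), len(ip_header), ip_data)
```

Without the total length field, the 18 pad bytes would be indistinguishable from data.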
The identification field uniquely identifies each datagram sent by a host. It normally increments by one each time a datagram is sent.
The time-to-live
field, or TTL, sets an upper limit on the number of routers through
which a datagram can pass. It limits the lifetime of the datagram. It is
initialized by the sender to some value (often 32 or 64) and
decremented by one by every router that handles the datagram. When this
field reaches 0, the datagram is thrown away, and the sender is notified
with an ICMP message. This prevents packets from getting caught in
routing loops forever.
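A toy simulation of the TTL rule, under the simplifying assumption that every router only decrements the field by one and that the ICMP notification is not modeled. The function name and return labels are invented for this sketch.

```python
from typing import Tuple


def traverse(initial_ttl: int, routers_on_path: int) -> Tuple[str, int]:
    """Walk a datagram through a number of routers, decrementing TTL at each."""
    ttl = initial_ttl
    for router in range(1, routers_on_path + 1):
        ttl -= 1  # each router that handles the datagram decrements TTL
        if ttl == 0:
            # The datagram is thrown away here; a real router would also
            # try to notify the sender with an ICMP message.
            return ("discarded at router", router)
    return ("delivered with ttl", ttl)


print(traverse(64, 5))  # enough TTL to cross 5 routers
print(traverse(3, 5))   # TTL runs out partway along the path
```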
The protocol field identifies which protocol gave the data for IP to send.
The header checksum
is calculated over the IP header only. It does not cover any data that
follows the header. ICMP, IGMP, UDP, and TCP all have a checksum in
their own headers to cover their header and data.
To compute the
IP checksum for an outgoing datagram, the value of the checksum field is
first set to 0. Then the 16-bit one's complement sum of the header is
calculated (i.e., the entire header is considered a sequence of 16-bit
words). The 16-bit one's complement of this sum is stored in the
checksum field. When an IP datagram is received, the 16-bit one's
complement sum of the header is calculated. Since the receiver's
calculated checksum contains the checksum stored by the sender, the
receiver's checksum is all one bits if nothing in the header was
modified. If the result is not all one bits (a checksum error), IP
discards the received datagram. No error message is generated. It is up
to the higher layers to somehow detect the missing datagram and
retransmit.
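The checksum procedure described above can be sketched as follows. The 20-byte sample header is an illustrative example rather than a captured packet, and the function names are assumptions of this sketch.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of data, treated as 16-bit big endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input (IP headers are always even)
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ip_checksum(header_with_zero_checksum: bytes) -> int:
    """One's complement of the one's complement sum."""
    return ~ones_complement_sum(header_with_zero_checksum) & 0xFFFF


# Sample header with the checksum field (bytes 10-11) set to 0.
header = bytearray.fromhex("4500007300004000" "40110000c0a80001" "c0a800c7")
checksum = ip_checksum(bytes(header))
header[10:12] = checksum.to_bytes(2, "big")

print(hex(checksum))
# Receiver side: the sum over the full header is all one bits if intact.
print(hex(ones_complement_sum(bytes(header))))
```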
ICMP, IGMP, UDP, and TCP all use the same
checksum algorithm, although TCP and UDP include various fields from the
IP header, in addition to their own header and data. Since a router
often changes only the TTL field (decrementing it by 1), a router can
incrementally update the checksum when it forwards a received datagram,
instead of calculating the checksum over the entire IP header again.
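One way to do the incremental update is the formula HC' = ~(~HC + ~m + m') from RFC 1624, where m is the old 16-bit word and m' the new one. The text above does not say which formula a router uses, so treat this as one possible sketch. The sample header and function names are assumptions.

```python
def add16(a: int, b: int) -> int:
    """One's complement addition of two 16-bit values."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def incremental_update(old_checksum: int, old_word: int, new_word: int) -> int:
    """HC' = ~(~HC + ~m + m'), all in 16-bit one's complement arithmetic."""
    acc = add16(~old_checksum & 0xFFFF, ~old_word & 0xFFFF)
    acc = add16(acc, new_word)
    return ~acc & 0xFFFF


def full_checksum(header: bytes) -> int:
    """Reference: recompute over the whole header (checksum field must be 0)."""
    total = 0
    for i in range(0, len(header), 2):
        total = add16(total, (header[i] << 8) | header[i + 1])
    return ~total & 0xFFFF


# Sample header: TTL 0x40 and protocol 0x11 share the 16-bit word at bytes 8-9.
header = bytearray.fromhex("4500007300004000" "4011b861c0a80001" "c0a800c7")
old_word = (header[8] << 8) | header[9]
new_word = old_word - 0x0100  # TTL decremented by 1
updated = incremental_update(0xB861, old_word, new_word)

# Cross-check against a full recomputation.
header[8] -= 1
header[10:12] = b"\x00\x00"
print(hex(updated), hex(full_checksum(bytes(header))))
```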
Every IP datagram contains the source IP address and the destination IP address.
The final field, the options, is a variable-length list of optional information for the datagram. The options currently defined
are:
1 security and handling restrictions (for military applications; refer to RFC 1108 [Kent 1991] for details),
2 record route (have each router record its IP address),
3 timestamp (have each router record its IP address and time),
4 loose source routing (specifying a list of IP addresses that must be traversed by the datagram), and
5 strict source routing (similar to loose source routing, but here only the addresses in the list can be traversed).
These options are rarely used, and not all hosts and routers support all the options.
The options field always ends on a 32-bit boundary. Pad bytes with a value of 0 are added if necessary. This assures that the IP header is always a multiple of 32 bits (as required for the header length field).
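Putting the header fields described above together, a minimal parsing sketch might look like this. It assumes Python's struct and socket modules, and the sample bytes are an illustrative header rather than a real capture. The function name and dictionary keys are made up.

```python
import socket
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Unpack the fixed 20-byte part of an IP header, plus any option bytes."""
    (ver_ihl, tos, total_length, identification, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    header_len = (ver_ihl & 0x0F) * 4  # always a multiple of 4 bytes
    return {
        "version": ver_ihl >> 4,
        "header_length": header_len,
        "tos": tos,
        "total_length": total_length,
        "identification": identification,
        "flags": flags_frag >> 13,             # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,  # low 13 bits
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        "options": packet[20:header_len],      # empty when header_len is 20
    }


SAMPLE = bytes.fromhex("4500007300004000" "4011b861c0a80001" "c0a800c7")
fields = parse_ipv4_header(SAMPLE)
for name, value in fields.items():
    print(name, value)
```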
IP Routing
Conceptually,
IP routing is simple, especially for a host. If the destination is
directly connected to the host (e.g., a point-to-point link) or on a
shared network (e.g., Ethernet or token ring), then the IP datagram is
sent directly to the destination. Otherwise the host sends the datagram
to a default router, and lets the router deliver the datagram to its
destination. This simple scheme handles most host configurations.
Most
multiuser systems today, including almost every Unix system, can be
configured to act as a router. We can then specify a single routing
algorithm that both hosts and routers can use. The fundamental
difference is that a host never forwards datagrams from one of its
interfaces to another, while a router forwards datagrams. A host that
contains embedded router functionality should never forward a datagram
unless it has been specifically configured to do so.
In our
general scheme, IP can receive a datagram from TCP, UDP, ICMP, or IGMP
(that is, a locally generated datagram) to send, or one that has been
received from a network interface (a datagram to forward). The IP layer
has a routing table in memory that it searches each time it receives a
datagram to send. When a datagram is received from a network interface,
IP first checks if the destination IP address is one of its own IP
addresses or an IP broadcast address. If so, the datagram is delivered
to the protocol module specified by the protocol field in the IP header.
If the datagram is not destined for this IP layer, then (1) if the IP
layer was configured to act as a router the packet is forwarded (that
is, handled as an outgoing datagram as described below), else (2) the
datagram is silently discarded.
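The decision made for a datagram arriving from a network interface can be sketched as below. The function name, the string results, and the example addresses are all hypothetical; a real IP layer works on binary addresses and does far more checking.

```python
from typing import Set


def handle_incoming(dst: str, local_addrs: Set[str],
                    broadcast_addrs: Set[str], is_router: bool) -> str:
    """Decide what to do with a datagram received from a network interface."""
    if dst in local_addrs or dst in broadcast_addrs:
        return "deliver"   # hand to the module named by the protocol field
    if is_router:
        return "forward"   # treat as an outgoing datagram
    return "discard"       # a host silently drops it


local = {"140.252.13.34"}
broadcast = {"140.252.13.63"}
print(handle_incoming("140.252.13.34", local, broadcast, is_router=False))
print(handle_incoming("140.252.1.4", local, broadcast, is_router=True))
print(handle_incoming("140.252.1.4", local, broadcast, is_router=False))
```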
Each entry in the routing table contains the following information:
1 Destination IP address. This can be either a complete host address or a network address, as specified by the flag field (described below) for this entry. A host address has a nonzero host ID (Figure 1.5) and identifies one particular host, while a network address has a host ID of 0 and identifies all the hosts on that network (e.g., Ethernet, token ring).
2 IP address of a next-hop router, or the IP address of a directly connected network. A next-hop router is one that is on a directly connected network to which we can send datagrams for delivery. The next-hop router is not the final destination, but it takes the datagrams we send it and forwards them to the final destination.
3 Flags. One flag
specifies whether the destination IP address is the address of a network
or the address of a host. Another flag says whether the next-hop router
field is really a next-hop router or a directly connected interface.
4 Specification of which network interface the datagram should be passed to for transmission.
IP
routing is done on a hop-by-hop basis. As we can see from this routing
table information, IP does not know the complete route to any
destination (except, of course, those destinations that are directly
connected to the sending host). All that IP routing provides is the IP
address of the next-hop router to which the datagram is sent. It is
assumed that the next-hop router is really "closer" to the destination
than the sending host is, and that the next-hop router is directly
connected to the sending host.
IP routing performs the following actions:
1. Search the routing table for an entry that matches the complete destination IP address (matching network ID and host ID). If found, send the packet to the indicated next-hop router or to the directly connected interface (depending on the flags field). Point-to-point links are found here, for example, since the other end of such a link is the other host's complete IP address.
2. Search the routing table for an entry that matches just the destination network ID. If found, send the packet to the indicated next-hop router or to the directly connected interface (depending on the flags field). All the hosts on the destination network can be handled with this single routing table entry. All the hosts on a local Ethernet, for example, are handled with a routing table entry of this type. This check for a network match must take into account a possible subnet mask, which we describe in the next section.
3. Search the routing table for an entry labeled "default." If found, send the packet to the indicated next-hop router.
If
none of the steps works, the datagram is undeliverable. If the
undeliverable datagram was generated on this host, a "host unreachable"
or "network unreachable" error is normally returned to the application
that generated the datagram.
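The three search steps can be sketched with Python's ipaddress module. The Route class, the flag letters ("H" for a host route, "G" for a next-hop router), the interface name, and the table contents are assumptions made for illustration. Network entries are written in prefix notation so the mask travels with the entry.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    destination: str  # host address, network in prefix notation, or "default"
    next_hop: str     # next-hop router, or the address of our own interface
    flags: str        # "H" = host route, "G" = next hop is a router
    interface: str


def lookup(table: List[Route], dst: str) -> Optional[Route]:
    addr = ipaddress.ip_address(dst)
    # Step 1: an entry matching the complete destination address.
    for route in table:
        if "H" in route.flags and ipaddress.ip_address(route.destination) == addr:
            return route
    # Step 2: an entry matching the destination network (mask included).
    for route in table:
        if "H" not in route.flags and route.destination != "default":
            if addr in ipaddress.ip_network(route.destination):
                return route
    # Step 3: the default entry.
    for route in table:
        if route.destination == "default":
            return route
    return None  # undeliverable: host or network unreachable


table = [
    Route("140.252.13.65", "140.252.13.35", "GH", "eth0"),
    Route("140.252.13.32/27", "140.252.13.34", "", "eth0"),
    Route("default", "140.252.13.33", "G", "eth0"),
]
for target in ("140.252.13.65", "140.252.13.40", "192.0.2.1"):
    print(target, "->", lookup(table, target))
```

This sketch returns the first network entry that matches, which mirrors the steps as written; it does not attempt any ordering among overlapping network entries.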
The ability to specify a route to a
network, and not have to specify a route to every host, is another
fundamental feature of IP routing. Doing this allows the routers on the
Internet, for example, to have a routing table with thousands of
entries, instead of a routing table with more than one million entries.
Subnet Addressing
All
hosts are now required to support subnet addressing (RFC 950 [Mogul and
Postel 1985]). Instead of considering an IP address as just a network
ID and host ID, the host ID portion is divided into a subnet ID and a
host ID.
This makes sense because class A and class B addresses
have too many bits allocated for the host ID: enough for 2^24-2 and 2^16-2
hosts, respectively. People don't attach that many hosts to a single network.
We subtract 2 in these expressions because host IDs of all zero bits or
all one bits are invalid.
After obtaining an IP network ID of a
certain class from the InterNIC, it is up to the local system
administrator whether to subnet or not, and if so, how many bits to
allocate to the subnet ID and host ID. For example, the internet used in
this text has a class B network address (140.252) and of the remaining
16 bits, 8 are for the subnet ID and 8 for the host ID.
This division allows 254 subnets, with 254 hosts per subnet.
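The counts quoted in this section all come from the same expression, 2^n - 2, as this small check shows. The function name is made up.

```python
def usable_ids(bits: int) -> int:
    """Number of usable IDs in a field of the given width.

    Two values are subtracted because all zero bits and all one bits
    are invalid.
    """
    return 2 ** bits - 2


print("class A host IDs:", usable_ids(24))
print("class B host IDs:", usable_ids(16))
print("8-bit subnet IDs:", usable_ids(8))
print("8-bit host IDs per subnet:", usable_ids(8))
```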
Many
administrators use the natural 8-bit boundary in the 16 bits of a class
B host ID as the subnet boundary. This makes it easier to determine the
subnet ID from a dotted-decimal number, but there is no requirement
that the subnet boundary for a class A or class B address be on a byte
boundary.
Subnetting hides the details of internal network organization (within a company or campus) from external routers.
The
advantage to using a single class B address with 30 subnets, compared
to 30 class C addresses, is that subnetting reduces the size of the
Internet's routing tables. The fact that the class B address 140.252 is
subnetted is transparent to all Internet routers other than the ones
within the 140.252 network. To reach any host whose IP address begins
with 140.252, the external routers only need to know the path to the IP
address 140.252.104.1. This means that only one routing table entry is
needed for all the 140.252 networks, instead of 30 entries if 30 class C
addresses were used. Subnetting, therefore, reduces the size of
routing tables.
Subnet Mask
Part
of the configuration of any host that takes place at bootstrap time is
the specification of the host's IP address. Most systems have this
stored in a disk file that's read at bootstrap time.
In addition
to the IP address, a host also needs to know how many bits are to be
used for the subnet ID and how many bits are for the host ID. This is
also specified at bootstrap time using a subnet mask. This mask is a
32-bit value containing one bits for the network ID and subnet ID, and
zero bits for the host ID.
Although IP addresses are normally
written in dotted-decimal notation, subnet masks are often written in
hexadecimal, especially if the boundary is not a byte boundary, since
the subnet mask is a bit mask.
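To see why hexadecimal is convenient for masks that do not fall on a byte boundary, here is a small conversion sketch using Python's ipaddress module. The two masks are examples chosen for illustration.

```python
import ipaddress


def mask_to_hex(mask: str) -> str:
    """Dotted-decimal subnet mask as a 32-bit hexadecimal value."""
    return "0x%08x" % int(ipaddress.IPv4Address(mask))


def mask_to_bits(mask: str) -> str:
    """Dotted-decimal subnet mask as 32 binary digits."""
    return format(int(ipaddress.IPv4Address(mask)), "032b")


for mask in ("255.255.255.0", "255.255.255.192"):
    print(mask, mask_to_hex(mask), mask_to_bits(mask))
```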
Given its own
IP address and its subnet mask, a host can determine if an IP datagram
is destined for (1) a host on its own subnet, (2) a host on a different
subnet on its own network, or (3) a host on a different network. Knowing
your own IP address tells you whether you have a class A, B, or C
address (from the high-order bits), which tells you where the boundary
is between the network ID and the subnet ID. The subnet mask then tells
you where the boundary is between the subnet ID and the host ID.
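The three-way decision can be sketched as follows. The sketch derives the class boundary from the high-order bits of the host's own address and then applies the subnet mask. The function names and the example addresses are assumptions.

```python
import ipaddress


def class_mask(addr: int) -> int:
    """Network mask implied by the address class (A, B, or C)."""
    first_byte = addr >> 24
    if first_byte < 128:   # class A: high-order bit 0
        return 0xFF000000
    if first_byte < 192:   # class B: high-order bits 10
        return 0xFFFF0000
    if first_byte < 224:   # class C: high-order bits 110
        return 0xFFFFFF00
    raise ValueError("not a class A, B, or C address")


def classify(own: str, subnet_mask: str, dst: str) -> str:
    own_i = int(ipaddress.IPv4Address(own))
    dst_i = int(ipaddress.IPv4Address(dst))
    mask_i = int(ipaddress.IPv4Address(subnet_mask))
    net_mask = class_mask(own_i)
    if (own_i & net_mask) != (dst_i & net_mask):
        return "host on a different network"
    if (own_i & mask_i) != (dst_i & mask_i):
        return "host on a different subnet on our own network"
    return "host on our own subnet"


for dst in ("140.252.1.5", "140.252.4.5", "192.43.235.6"):
    print(dst, "->", classify("140.252.1.1", "255.255.255.0", dst))
```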
ifconfig Command
The ifconfig command
is used to configure or query a network interface for use by TCP/IP
and is normally run at bootstrap time to configure each interface on a
host.
The ifconfig command normally supports other protocol
families (other than TCP/IP) and has numerous additional options. Check
your system's manual for these details.
netstat Command
The netstat(1) command also provides information about the interfaces on a system. The -i flag prints the interface
information, and the -n flag prints IP addresses instead of hostnames.
IP Futures
There are three problems with IP. They are a result of the phenomenal growth of the Internet over the past few years.
1. Over half of all class B addresses have already been allocated. Current estimates predict exhaustion of the class B address space around 1995, if they continue to be allocated as they have been in the past.
2. 32-bit IP addresses in general are inadequate for the predicted long-term growth of the Internet.
3. The current routing structure is not hierarchical, but flat, requiring
one routing table entry per network. As the number of networks grows,
amplified by the allocation of multiple class C addresses to a site with
multiple networks, instead of a single class B address, the size of the
routing tables grows.
CIDR (Classless Interdomain Routing) proposes a fix to the third problem that will extend the usefulness of the current
version of IP (IP version 4) into the next century.
Four
proposals have been made for a new version of IP, often called IPng,
for the next generation of IP. The May 1993 issue of IEEE Network (vol.
7, no. 3) contains overviews of the first three proposals, along with an
article on CIDR. RFC 1454 [Dixon 1993] also compares the first three
proposals.
1. SIP, the Simple Internet Protocol. It proposes a minimal set of changes
to IP that uses 64-bit addresses and a different header format. (The
first 4 bits of the header still contain the version number, with a
value other than 4.)
2. PIP. This proposal also uses larger, variable-length, hierarchical addresses with a different header format.
3. TUBA, which stands for "TCP and UDP with Bigger Addresses," is based on the OSI CLNP (Connectionless Network Protocol), an OSI protocol similar to IP. It provides much larger addresses: variable length, up to 20 bytes. Since CLNP is an existing protocol, whereas SIP and PIP are just proposals, documentation already exists on CLNP. RFC 1347 [Callon 1992] provides details on TUBA. Chapter 7 of [Perlman 1992] contains a comparison of IPv4 and CLNP. Many routers already support CLNP, but few hosts do.
4. TP/IX, which is described in RFC 1475 [Ullmann 1993]. As with SIP, it
uses 64 bits for IP addresses, but it also changes the TCP and UDP
headers: 32-bit port numbers for both protocols, along with 64-bit
sequence numbers, 64-bit acknowledgment numbers, and 32-bit windows for
TCP.
The first three proposals use basically the same versions of TCP and UDP as the transport layers.
Since then, IPv6 has been chosen as the successor to IPv4.