Category: System Operations

2012-06-15 13:16:52

The Domain Name System, or DNS, is a distributed database that is used by TCP/IP applications to map between hostnames and IP addresses, and to provide electronic mail routing information. We use the term distributed because no single site on the Internet knows all the information. Each site (university department, campus, company, or department within a company, for example) maintains its own database of information and runs a server program that other systems across the Internet (clients) can query. The DNS provides the protocol that allows clients and servers to communicate with each other.

From an application's point of view, access to the DNS is through a resolver. On Unix hosts the resolver is accessed primarily through two library functions, gethostbyname(3) and gethostbyaddr(3), which are linked with the application when the application is built. The first takes a hostname and returns an IP address, and the second takes an IP address and looks up a hostname. The resolver contacts one or more name servers to do the mapping. The resolver is normally part of the application. It is not part of the operating system kernel as are the TCP/IP protocols. Another fundamental point is that an application must convert a hostname to an IP address before it can ask TCP to open a connection or send a datagram using UDP. The TCP/IP protocols within the kernel know nothing about the DNS.
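As a rough illustration of the two resolver calls, Python's standard socket module exposes wrappers with the same names. The sketch below assumes Python 3; the helper names forward_lookup and reverse_lookup are made up for this example and are not part of any library.

```python
import socket

def forward_lookup(hostname):
    """Map a hostname to an IPv4 address string via the system resolver."""
    return socket.gethostbyname(hostname)

def reverse_lookup(ip_address):
    """Map an IP address to its primary hostname via the system resolver."""
    name, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    return name

# A dotted-decimal string is returned unchanged, so this call needs no name server.
print(forward_lookup("127.0.0.1"))
```

Calling forward_lookup with a real hostname, or reverse_lookup with a real address, goes through whatever name servers the host is configured to use, so the result depends on the local setup.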

DNS Basics

The DNS name space is hierarchical, similar to the Unix filesystem.

Every node has a label of up to 63 characters. The root of the tree is a special node with a null label. Any comparison of labels considers uppercase and lowercase characters the same. The domain name of any node in the tree is the list of labels, starting at that node, working up to the root, using a period ("dot") to separate the labels. (Note that this is different from the Unix filesystem, which forms a pathname by starting at the top and going down the tree.) Every node in the tree must have a unique domain name, but the same label can be used at different points in the tree.
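The naming rule can be sketched in a few lines of Python. This is only an illustration under simple assumptions: labels are plain strings, the root's null label is the empty string, and the function names are invented for the example.

```python
def domain_name(labels_from_node_to_root):
    """Join labels, listed from the node up to the root, with periods.

    The root's null label is the empty string, so an absolute name
    ends with a trailing period.
    """
    for label in labels_from_node_to_root:
        if len(label) > 63:
            raise ValueError("label longer than 63 characters: %r" % label)
    return ".".join(labels_from_node_to_root)

def same_name(a, b):
    """Compare two domain names, treating upper and lower case as equal."""
    return a.lower() == b.lower()

name = domain_name(["sun", "tuc", "noao", "edu", ""])
print(name)                                   # sun.tuc.noao.edu.
print(same_name(name, "SUN.TUC.NOAO.EDU."))   # True
```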

A domain name that ends with a period is called an absolute domain name or a fully qualified domain name (FQDN). An example is sun.tuc.noao.edu.. If the domain name does not end with a period, it is assumed that the name needs to be completed. How the name is completed depends on the DNS software being used. If the uncompleted name consists of two or more labels, it might be considered to be complete; otherwise a local addition might be added to the right of the name. For example, the name sun might be completed by adding the local suffix .tuc.noao.edu..

The top-level domains are divided into three areas:

1. arpa is a special domain used for address-to-name mappings.

2. The seven 3-character domains are called the generic domains. Some texts call these the organizational domains.

3. All the 2-character domains are based on the country codes found in ISO 3166. These are called the country domains, or the geographical domains.

The normal classification of the seven generic domains is as follows:
com: commercial organizations
edu: educational institutions
gov: other U.S. governmental organizations
int: international organizations
mil: U.S. military
net: networks
org: other organizations

DNS folklore says that the 3-character generic domains are only for U.S. organizations, and the 2-character country domains for everyone else, but this is false. There are many non-U.S. organizations in the generic domains, and many U.S. organizations in the .us country domain. (RFC 1480 [Cooper and Postel 1993] describes the .us domain in more detail.) The only generic domains that are restricted to the United States are .gov and .mil.

Many countries form second-level domains beneath their 2-character country code similar to the generic domains: .ac.uk, for example, is for academic institutions in the United Kingdom and .co.uk is for commercial organizations in the United Kingdom.

No single entity manages every label in the tree. Instead, one entity (the NIC) maintains a portion of the tree (the top-level domains) and delegates responsibility to others for specific zones.

A zone is a subtree of the DNS tree that is administered separately. A common zone is a second-level domain, noao.edu, for example. Many second-level domains then divide their zone into smaller zones. For example, a university might divide itself into zones based on departments, and a company might divide itself into zones based on branch offices or internal divisions.

If you are familiar with the Unix filesystem, notice that the division of the DNS tree into zones is similar to the division of a logical Unix filesystem into physical disk partitions. Just as we can't tell from a picture of the DNS tree where the zones of authority lie, we can't tell from a similar picture of a Unix filesystem which directories are on which disk partitions.

Once the authority for a zone is delegated, it is up to the person responsible for the zone to provide multiple name servers for that zone. Whenever a new system is installed in a zone, the DNS administrator for the zone allocates a name and an IP address for the new system and enters these into the name server's database. This is where the need for delegation becomes obvious. At a small university, for example, one person could do this each time a new system was added, but in a large university the responsibility would have to be delegated (probably by departments), since one person couldn't keep up with the work.

A name server is said to have authority for one zone or multiple zones. The person responsible for a zone must provide a primary name server for that zone and one or more secondary name servers. The primary and secondaries must be independent and redundant servers so that availability of name service for the zone isn't affected by a single point of failure.

The main difference between a primary and secondary is that the primary loads all the information for the zone from disk files, while the secondaries obtain all the information from the primary. When a secondary obtains the information from its primary we call this a zone transfer.

When a new host is added to a zone, the administrator adds the appropriate information (name and IP address minimally) to a disk file on the system running the primary. The primary name server is then notified to reread its configuration files. The secondaries query the primary on a regular basis (normally every 3 hours) and if the primary contains newer data, the secondary obtains the new data using a zone transfer.

What does a name server do when it doesn't contain the information requested? It must contact another name server. (This is the distributed nature of the DNS.) Not every name server, however, knows how to contact every other name server. Instead every name server must know how to contact the root name servers. As of April 1993 there were eight root servers and all the primary servers must know the IP address of each root server. (These IP addresses are contained in the primary's configuration files. The primary servers must know the IP addresses of the root servers, not their DNS names.) The root servers then know the name and location (i.e., the IP address) of each authoritative name server for all the second-level domains. This implies an iterative process: the requesting name server must contact a root server. The root server tells the requesting server to contact another server, and so on.
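The iterative process can be simulated with a toy model in Python. Everything here is a sketch: the server names "root" and "ns.noao" and the table layout are invented for illustration, and a real name server exchanges DNS messages rather than reading a dictionary.

```python
# Hypothetical data: the server names and table layout are invented for this
# sketch and do not describe real servers.
SERVERS = {
    "root": {"referrals": {"noao.edu.": "ns.noao"}, "answers": {}},
    "ns.noao": {"referrals": {}, "answers": {"sun.tuc.noao.edu.": "140.252.13.33"}},
}

def ask(server, name):
    """Return ("answer", address), ("referral", next_server), or ("error", None)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return ("answer", data["answers"][name])
    for zone, next_server in data["referrals"].items():
        if name == zone or name.endswith("." + zone):
            return ("referral", next_server)
    return ("error", None)

def resolve_iteratively(name, start="root", max_steps=10):
    """Follow referrals, starting at a root server, until some server answers."""
    server = start
    path = [server]
    for _ in range(max_steps):
        kind, value = ask(server, name)
        if kind == "answer":
            return value, path
        if kind == "error":
            return None, path
        server = value          # contact the server named in the referral
        path.append(server)
    return None, path

address, path = resolve_iteratively("sun.tuc.noao.edu.")
print(address, path)
```

The returned path records which servers were contacted, which makes the chain of referrals visible.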

You can fetch the current list of root servers using anonymous FTP. Obtain the file netinfo/root-servers.txt from either ftp.rs.internic.net or nic.ddn.mil.

A fundamental property of the DNS is caching. That is, when a name server receives information about a mapping (say, the IP address of a hostname) it caches that information so that a later query for the same mapping can use the cached result and not result in additional queries to other servers.

DNS Message Format


There is one DNS message defined for both queries and responses.

0~15 | 16~31
identification | flags
number of questions | number of answer RRs
number of authority RRs | number of additional RRs
questions
answers (variable number of resource records)
authority (variable number of resource records)
additional information (variable number of resource records)

The message has a fixed 12-byte header followed by four variable-length fields.

The identification is set by the client and returned by the server. It lets the client match responses to requests.

The 16-bit flags field is divided into numerous pieces:

Field: QR | opcode | AA | TC | RD | RA | (zero) | rcode
Bits: 1 | 4 | 1 | 1 | 1 | 1 | 3 | 4

We'll start at the leftmost bit and describe each field.

1. QR is a 1-bit field: 0 means the message is a query, 1 means it's a response.

2. opcode is a 4-bit field. The normal value is 0 (a standard query). Other values are 1 (an inverse query) and 2 (server status request).

3. AA is a 1-bit flag that means "authoritative answer." The name server is authoritative for the domain in the question section.

4. TC is a 1-bit field that means "truncated." With UDP this means the total size of the reply exceeded 512 bytes, and only the first 512 bytes of the reply was returned.

5. RD is a 1-bit field that means "recursion desired." This bit can be set in a query and is then returned in the response. This flag tells the name server to handle the query itself, called a recursive query. If the bit is not set, and the requested name server doesn't have an authoritative answer, the requested name server returns a list of other name servers to contact for the answer. This is called an iterative query.

6. RA is a 1-bit field that means "recursion available." This bit is set to 1 in the response if the server supports recursion. Most name servers provide recursion, except for some root servers.

7. There is a 3-bit field that must be 0.

8. rcode is a 4-bit field with the return code. The common values are 0 (no error) and 3 (name error). A name error is returned only from an authoritative name server and means the domain name specified in the query does not exist.

The next four 16-bit fields specify the number of entries in the four variable-length fields that complete the record. For a query, the number of questions is normally 1 and the other three counts are 0. Similarly, for a reply the number of answers is at least 1, and the remaining two counts can be 0 or nonzero.
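The header layout described above maps naturally onto Python's struct module. The following is a minimal sketch, not a complete DNS implementation; the function names and the identification value 0x1234 are arbitrary choices for the example.

```python
import struct

HEADER = struct.Struct("!HHHHHH")  # six 16-bit fields in network byte order

def pack_flags(qr=0, opcode=0, aa=0, tc=0, rd=0, ra=0, rcode=0):
    """Assemble the 16-bit flags word, leftmost (QR) bit first."""
    return ((qr << 15) | (opcode << 11) | (aa << 10) | (tc << 9)
            | (rd << 8) | (ra << 7) | rcode)

def unpack_flags(flags):
    """Split the 16-bit flags word into its named fields."""
    return {
        "qr": (flags >> 15) & 0x1,
        "opcode": (flags >> 11) & 0xF,
        "aa": (flags >> 10) & 0x1,
        "tc": (flags >> 9) & 0x1,
        "rd": (flags >> 8) & 0x1,
        "ra": (flags >> 7) & 0x1,
        "zero": (flags >> 4) & 0x7,
        "rcode": flags & 0xF,
    }

# A standard query with recursion desired: one question, no other records.
header = HEADER.pack(0x1234, pack_flags(rd=1), 1, 0, 0, 0)
ident, flags, nquestions, nanswers, nauthority, nadditional = HEADER.unpack(header)
print(len(header), unpack_flags(flags))
```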

Question Portion of DNS Query Message

The format of each question in the question section is:

0~15 | 16~31
query name
query type | query class

The query name is the name being looked up. It is a sequence of one or more labels. Each label begins with a 1-byte count that specifies the number of bytes that follow. The name is terminated with a byte of 0, which is a label with a length of 0, which is the label of the root. Each count byte must be in the range of 0 to 63, since labels are limited to 63 bytes.

Unlike many other message formats that we've encountered, this field is allowed to end on a boundary other than a 32-bit boundary.
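The label encoding can be sketched as follows. The function names are invented for this example, and decode_name assumes every count byte is a plain length in the range 0 to 63, as described above.

```python
def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    if name in ("", "."):
        return b"\x00"  # the root by itself is just the zero-length label
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("label must be 1 to 63 bytes: %r" % label)
        out.append(len(raw))
        out += raw
    out.append(0)  # zero-length label of the root terminates the name
    return bytes(out)

def decode_name(data, offset=0):
    """Decode an encoded name; return (name, offset just past the name)."""
    labels = []
    while True:
        count = data[offset]
        offset += 1
        if count == 0:
            break
        labels.append(data[offset:offset + count].decode("ascii"))
        offset += count
    return ".".join(labels) + ".", offset

encoded = encode_name("sun.tuc.noao.edu.")
print(encoded, decode_name(encoded))
```

Because each label contributes its own length plus one count byte, the encoded name can be any number of bytes long, which is why the field need not end on a 32-bit boundary.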


Each question has a query type and each response (called a resource record) has a type. There are about 20 different values, some of which are now obsolete. The query type is a superset of the type: two of the values we show can be used only in questions.

Name | Numeric value | Description | type? | query type?
A | 1 | IP address | * | *
NS | 2 | name server | * | *
CNAME | 5 | canonical name | * | *
PTR | 12 | pointer record | * | *
HINFO | 13 | host info | * | *
MX | 15 | mail exchange record | * | *
AXFR | 252 | request for zone transfer | | *
* or ANY | 255 | request for all records | | *

The most common query type is an A type, which means an IP address is desired for the query name. A PTR query requests the names corresponding to an IP address.

The query class is normally 1, meaning Internet address. (Some other non-IP values are also supported at some locations.)
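Putting the header and a single question together gives a complete query message. The sketch below is one possible way to assemble it in Python; build_query and its parameters are hypothetical names, and the type codes come from the table above.

```python
import struct

QUERY_TYPES = {"A": 1, "NS": 2, "CNAME": 5, "PTR": 12, "HINFO": 13,
               "MX": 15, "AXFR": 252, "ANY": 255}
CLASS_INTERNET = 1

def build_query(name, qtype="A", ident=1, recursion_desired=True):
    """Return the bytes of a DNS query carrying a single question."""
    flags = 0x0100 if recursion_desired else 0   # only the RD bit may be set
    header = struct.pack("!HHHHHH", ident, flags, 1, 0, 0, 0)
    # Query name: each label preceded by its length, then a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", QUERY_TYPES[qtype], CLASS_INTERNET)
    return header + question

message = build_query("sun.tuc.noao.edu.", "A", ident=0x1234)
print(len(message), message.hex())
```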


Resource Record Portion of DNS Response Message

The final three fields in the DNS message, the answers, authority, and additional information fields, share a common format called a resource record or RR.

0~15 | 16~31
domain name
type | class
time to live
resource data length | resource data
resource data (continued)

The domain name is the name to which the following resource data corresponds. It is in the same format as we described earlier for the query name field.

The type specifies one of the RR type codes. These are the same as the query type values that we described earlier. The class is normally 1 for Internet data.

The time-to-live field is the number of seconds that the RR can be cached by the client. RRs often have a TTL of 2 days.

The resource data length specifies the amount of resource data. The format of this data depends on the type. For a type of 1 (an A record) the resource data is a 4-byte IP address.
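A resource record in this layout can be decoded with a short Python function. This sketch assumes the domain name is stored as plain length-prefixed labels; real responses frequently use a name compression scheme that is not described here, so treat this as an illustration of the field layout only. The sample record is built by hand from values used in this text.

```python
import socket
import struct

def parse_rr(data, offset=0):
    """Parse one resource record whose name is stored as plain labels."""
    labels = []
    while data[offset] != 0:
        count = data[offset]
        labels.append(data[offset + 1:offset + 1 + count].decode("ascii"))
        offset += 1 + count
    offset += 1  # skip the zero byte that terminates the name
    # type (16 bits), class (16 bits), time to live (32 bits), data length (16 bits)
    rr_type, rr_class, ttl, rdlength = struct.unpack_from("!HHIH", data, offset)
    offset += 10
    rdata = data[offset:offset + rdlength]
    record = {"name": ".".join(labels) + ".", "type": rr_type,
              "class": rr_class, "ttl": ttl, "rdata": rdata}
    if rr_type == 1 and rdlength == 4:
        record["address"] = socket.inet_ntoa(rdata)  # A record: 4-byte IP address
    return record, offset + rdlength

# Hand-built sample: an A record for sun.tuc.noao.edu. with a TTL of 2 days.
sample = (b"\x03sun\x03tuc\x04noao\x03edu\x00"
          + struct.pack("!HHIH", 1, 1, 2 * 24 * 60 * 60, 4)
          + socket.inet_aton("140.252.13.33"))
record, end = parse_rr(sample)
print(record, end)
```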


The file /etc/resolv.conf contains info like:

nameserver 140.252.1.54
domain tuc.noao.edu

The first line gives the IP address of the name server - the host noao.edu. Up to three nameserver lines can be specified, to provide backup in case one is down or unreachable. The domain line specifies the default domain. If the name being looked up is not a fully qualified domain name (it doesn't end with a period) then the default domain .tuc.noao.edu is appended to the name.
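The way a resolver might read this file and complete a name can be sketched in Python. The function names are invented, and the completion rule is the simple one just described (append the default domain unless the name ends with a period); actual resolver libraries may apply additional rules.

```python
def parse_resolv_conf(text):
    """Collect nameserver addresses and the default domain from the file text."""
    nameservers, domain = [], None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "nameserver" and len(nameservers) < 3:
            nameservers.append(fields[1])   # at most three are used
        elif fields[0] == "domain":
            domain = fields[1]
    return nameservers, domain

def complete_name(name, default_domain):
    """Append the default domain unless the name already ends with a period."""
    if name.endswith(".") or not default_domain:
        return name
    return name + "." + default_domain.strip(".") + "."

text = "nameserver 140.252.1.54\ndomain tuc.noao.edu\n"
nameservers, domain = parse_resolv_conf(text)
print(nameservers, domain, complete_name("sun", domain))
```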


Pointer Queries


A perpetual stumbling block in understanding the DNS is how pointer queries are handled - given an IP address, return the name (or names) corresponding to that address.

When an organization joins the Internet and obtains authority for a portion of the DNS name space, such as noao.edu, they also obtain authority for a portion of the in-addr.arpa name space corresponding to their IP address on the Internet. In the case of noao.edu it is the class B network ID 140.252. The level of the DNS tree beneath in-addr.arpa must be the first byte of the IP address (140 in this example), the next level is the next byte of the IP address (252), and so on. But remember that names are written starting at the bottom of the DNS tree, working upward. This means the DNS name for the host sun, with an IP address of 140.252.13.33, is 33.13.252.140.in-addr.arpa.

We have to write the 4 bytes of the IP address backward because authority is delegated based on network IDs: the first byte of a class A address, the first and second bytes of a class B address, and the first, second, and third bytes of a class C address.
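Building the name used in a pointer query is a matter of reversing the four bytes and appending the special domain. The helper below is a sketch with an invented name; the final line checks it against the reverse_pointer attribute of Python's standard ipaddress module, which produces the same name without the trailing period.

```python
import ipaddress

def reverse_name(ip):
    """Build the in-addr.arpa name for a dotted-decimal IPv4 address."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected four dotted-decimal bytes: %r" % ip)
    return ".".join(reversed(octets)) + ".in-addr.arpa."

name = reverse_name("140.252.13.33")
print(name)  # 33.13.252.140.in-addr.arpa.
# The standard library agrees, apart from the trailing period.
assert name == ipaddress.ip_address("140.252.13.33").reverse_pointer + "."
```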

If there was not a separate branch of the DNS tree for handling this address-to-name translation, there would be no way to do the reverse translation other than starting at the root of the tree and trying every top-level domain. This could literally take days or weeks, given the current size of the Internet. The in-addr.arpa solution is a clever one, although the reversed bytes of the IP address and the special domain are confusing.

Resource Records

We've seen a few different types of resource records (RRs) so far: an IP address has a type of A, and PTR means a pointer query. We've also seen that RRs are what a name server returns: answer RRs, authority RRs, and additional information RRs. There are about 20 different types of resource records. Also, more RR types are being added over time.

Caching

To reduce the DNS traffic on the Internet, all name servers employ a cache. With the standard Unix implementation, the cache is maintained in the server, not the resolver. Since the resolver is part of each application, and applications come and go, putting the cache into the program that lives the entire time the system is up (the name server) makes sense. This makes the cache available to any applications that use the server. Any other hosts at the site that use this name server also share the server's cache.
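A server-side cache that honors each record's time-to-live could look roughly like this in Python. The class name and its interface are assumptions made for the example, not a description of how any particular name server stores its cache.

```python
import time

class TtlCache:
    """Cache resource data until its time-to-live has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # (name, rr_type) -> (expiry time, data)

    def put(self, name, rr_type, data, ttl):
        """Store data for ttl seconds; names compare case-insensitively."""
        self._entries[(name.lower(), rr_type)] = (self._clock() + ttl, data)

    def get(self, name, rr_type):
        """Return the cached data, or None if it is absent or has expired."""
        key = (name.lower(), rr_type)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires, data = entry
        if self._clock() >= expires:
            del self._entries[key]
            return None
        return data

cache = TtlCache()
cache.put("sun.tuc.noao.edu.", "A", "140.252.13.33", ttl=2 * 24 * 60 * 60)
print(cache.get("SUN.tuc.noao.edu.", "A"))
```

Every application that queries this server benefits from an entry stored on behalf of any one of them, which is the point of keeping the cache in the server rather than in the resolver.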

UDP or TCP

We've mentioned that the well-known port numbers for DNS name servers are UDP port 53 and TCP port 53. This implies that the DNS supports both UDP and TCP. But all the examples that we've watched with tcpdump have used UDP. When is each protocol used and why?

When the resolver issues a query and the response comes back with the TC bit set ("truncated") it means the size of the response exceeded 512 bytes, so only the first 512 bytes were returned by the server. The resolver normally issues the request again, using TCP. This allows more than 512 bytes to be returned.

Since TCP breaks up a stream of user data into what it calls segments, it can transfer any amount of user data, using multiple segments.
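The fallback from UDP to TCP can be sketched without any real networking by passing the two transports in as functions. In this Python example send_udp and send_tcp are hypothetical caller-supplied callables; only the test of the TC bit follows the message format described earlier.

```python
import struct

TC_BIT = 0x0200  # "truncated" flag within the 16-bit flags field

def is_truncated(response):
    """Return True if the TC bit is set in a DNS response header."""
    _ident, flags = struct.unpack_from("!HH", response, 0)
    return bool(flags & TC_BIT)

def resolve(query, send_udp, send_tcp):
    """Try UDP first and reissue the query over TCP if the reply is truncated.

    send_udp and send_tcp are caller-supplied functions (hypothetical here)
    that take the query bytes and return the response bytes.
    """
    response = send_udp(query)
    if is_truncated(response):
        response = send_tcp(query)
    return response
```

Keeping the transports as parameters is a design choice for the sketch: it lets the decision logic be exercised with canned responses before any socket code exists.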

Also, when a secondary name server for a domain starts up it performs a zone transfer from the primary name server for the domain. We also said that the secondary queries the primary on a regular basis (often every 3 hours) to see if the primary has had its tables updated, and if so, a zone transfer is performed.

Zone transfers are done using TCP, since there is much more data to transfer than a single query or response.

Since the DNS primarily uses UDP, both the resolver and the name server must perform their own timeout and retransmission. Also, unlike many other Internet applications that use UDP (TFTP, BOOTP, and SNMP), which operate mostly on local area networks, DNS queries and responses often traverse wide area networks. The packet loss rate and variability in round-trip times are normally higher on a WAN than a LAN, increasing the importance of a good retransmission and timeout algorithm for DNS clients.
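One simple retransmission policy is to double the timeout after every unanswered attempt. The Python sketch below is an assumption for illustration only: the starting timeout, the number of tries, and the doubling rule are not taken from any particular resolver, and send_once is a hypothetical caller-supplied function.

```python
def query_with_retry(send_once, query, first_timeout=2.0, max_tries=4):
    """Send a query, doubling the timeout after each unanswered attempt.

    send_once(query, timeout) is a caller-supplied function (hypothetical)
    that returns the response bytes or raises TimeoutError.
    """
    timeout = first_timeout
    for _attempt in range(max_tries):
        try:
            return send_once(query, timeout)
        except TimeoutError:
            timeout *= 2   # exponential backoff, an assumed policy
    raise TimeoutError("no response after %d tries" % max_tries)
```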


Rlogin

We start an Rlogin client, connecting to an Rlogin server in some other domain.

The following 11 steps take place, assuming none of the information is already cached by the client or server:

1. The client starts and calls its resolver function to convert the hostname that we typed into an IP address. A query of type A is sent to a root server.

2. The root server's response contains the name servers for the server's domain.

3. The client's resolver reissues the query of type A to the server's name server. This query normally has the recursion-desired flag set.

4. The response comes back with the IP address of the server host.

5. The Rlogin client establishes a TCP connection with the Rlogin server. Three packets are exchanged between the client and server TCP modules.

6. The Rlogin server receives the connection from the client and calls its resolver to obtain the name of the client host, given the IP address that the server receives from its TCP. This is a PTR query issued to a root name server. This root server can be different from the root server used by the client in step 1.

7. The root server's response contains the name servers for the client's in-addr.arpa domain.

8. The server's resolver reissues the PTR query to the client's name server.

9. The PTR response contains the FQDN of the client host.

10. The server's resolver issues a query of type A to the client's name server, asking for the IP addresses corresponding to the name returned in the previous step. This may be done automatically by the server's gethostbyaddr function, otherwise the Rlogin server does this step explicitly. Also, the client's name server is often the same as the client's in-addr.arpa name server, but this isn't required.

11. The response from the client's name server contains the A records for the client host. The Rlogin server compares the A records with the IP address from the client's TCP connection request.
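The server's check in steps 6 through 11 amounts to a reverse lookup followed by a forward lookup and a comparison. A minimal Python sketch follows; ptr_lookup and a_lookup are hypothetical stand-ins for the resolver calls, supplied by the caller, and the client name used in the usage lines is made up.

```python
def verify_client(peer_ip, ptr_lookup, a_lookup):
    """Return the client's name if its A records include peer_ip, else None.

    ptr_lookup(ip) returns a fully qualified name or None; a_lookup(name)
    returns a list of address strings. Both are caller-supplied stand-ins
    (hypothetical) for the resolver calls in steps 6 to 11.
    """
    name = ptr_lookup(peer_ip)          # steps 6 to 9: PTR query
    if name is None:
        return None
    if peer_ip in a_lookup(name):       # steps 10 and 11: A query and compare
        return name
    return None

# Usage with made-up tables in place of real name servers.
ptr_table = {"140.252.13.33": "sun.tuc.noao.edu."}
a_table = {"sun.tuc.noao.edu.": ["140.252.13.33"]}
print(verify_client("140.252.13.33", ptr_table.get, lambda n: a_table.get(n, [])))
```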
