Category: System Operations and Maintenance
2012-06-20 17:43:13
Electronic mail (e-mail) is undoubtedly one of
the most popular applications. [Caceres et al. 1991] shows that about
one-half of all TCP connections are for the Simple Mail Transfer
Protocol, SMTP. (On a byte count basis, FTP connections carry more
data.) [Paxson 1993] found that the average mail message contains around
1500 bytes of data, but some messages contain megabytes of data,
because electronic mail is sometimes used to send files.
Users
deal with a user agent, of which there are a multitude to choose from.
Popular user agents for Unix include MH, Berkeley Mail, Elm, and Mush.
The
exchange of mail using TCP is performed by a message transfer agent
(MTA). The most common MTA for Unix systems is Sendmail. Users normally
don't deal with the MTA. It is the responsibility of the system
administrator to set up the local MTA. Users often have a choice,
however, for their user agent.
SMTP Protocol
The
communication between the two MTAs uses NVT ASCII. Commands are sent by
the client to the server, and the server responds with numeric reply
codes and optional human-readable strings. This is similar to what we
saw with FTP.
There are a small number of commands that the
client can send to the server: fewer than a dozen. (By comparison, FTP
has more than 40 commands.)
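The command and reply exchange described above can be sketched in a few lines of Python. This is only an illustration, not part of any real MTA: the helper names format_command and parse_reply are made up for this example, and the parser assumes a single-line reply of the form "code text".

```python
CRLF = "\r\n"  # NVT ASCII lines end with carriage return + line feed


def format_command(verb, argument=""):
    """Build one command line as the client would send it."""
    line = f"{verb} {argument}" if argument else verb
    return line + CRLF


def parse_reply(line):
    """Split a single-line server reply into (numeric code, text)."""
    line = line.rstrip(CRLF)
    # The numeric reply code comes first; the human-readable string is optional.
    code, _, text = line.partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not a reply line: {line!r}")
    return int(code), text


if __name__ == "__main__":
    print(repr(format_command("HELO", "sun.tuc.noao.edu")))
    print(parse_reply("250 OK\r\n"))
```

A client normally acts on the numeric code alone; the text is there for a person reading the exchange.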
SMTP Commands
The client identifies itself with the HELO command. The argument must be the fully qualified domain name of the client host like sun.tuc.noao.edu.
The MAIL command identifies the originator of the message. E.g. MAIL From:
The RCPT command identifies the recipient. More than one RCPT command can be issued if
there are multiple recipients. E.g. RCPT To:
The
contents of the mail message are sent by the client using the DATA
command. The end of the message is specified by the client sending a
line containing just a period.
QUIT terminates the mail exchange.
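To see how the five commands just described fit together, here is a minimal sketch that lists the lines a client would send for one message. The function name mail_commands and the example.com addresses are hypothetical. The sketch does not wait for or check server replies, and it does not handle a message line that itself consists of a single period.

```python
def mail_commands(client_host, sender, recipients, message_lines):
    """Return, in order, the lines a client sends to deliver one message."""
    commands = [f"HELO {client_host}"]         # client identifies itself
    commands.append(f"MAIL From:<{sender}>")   # originator of the message
    for recipient in recipients:               # one RCPT per recipient
        commands.append(f"RCPT To:<{recipient}>")
    commands.append("DATA")                    # message contents follow
    commands.extend(message_lines)
    commands.append(".")                       # a line with just a period ends the message
    commands.append("QUIT")                    # terminate the mail exchange
    return commands


if __name__ == "__main__":
    for line in mail_commands(
        "sun.tuc.noao.edu",
        "alice@example.com",                       # hypothetical sender
        ["bob@example.com", "carol@example.com"],  # hypothetical recipients
        ["Subject: testing", "", "1, 2, 3."],
    ):
        print(line)
```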
The
RSET command aborts the current mail transaction and causes both ends
to reset. Any stored information about sender, recipients, or mail data
is discarded.
The VRFY command lets the client ask the server to
verify a recipient address, without sending mail to the recipient. It's
often used by a system administrator, by hand, for debugging mail
delivery problems.
The NOOP command does nothing besides force the server to respond with an OK reply code (250).
There
are additional, optional commands. EXPN expands a mailing list, and is
often used by the system administrator, similar to VRFY. Indeed, most
versions of Sendmail handle the two identically.
The TURN command
lets the client and server switch roles, to send mail in the reverse
direction, without having to take down the TCP connection and create a
new one. (Sendmail does not support this command.) There are three other
commands (SEND, SOML, and SAML), which are rarely implemented, that
replace the MAIL command. These three allow combinations of the mail
being delivered directly to the user's terminal (if logged in), or sent
to the recipient's mailbox.
Envelopes, Headers, and Body
Electronic mail is composed of three pieces.
1. The envelope is used by the MTAs for delivery. In our example the envelope was specified by the two SMTP commands:
MAIL From:
RCPT To:
RFC
821 specifies the contents and interpretation of the envelope, and the
protocol used to exchange mail across a TCP connection.
2. Headers are used by the user agents. We saw nine header fields in our example: Received, Message-Id, From, Date, Reply-To, X-Phone, X-Mailer, To, and Subject. Each header field contains a name, followed by a colon, followed by the field value. RFC 822 specifies the format and interpretation of the header fields. (Headers beginning with an X- are user-defined fields. The others are defined by RFC 822.) Long header fields, such as Received in the example, are folded onto multiple lines, with the additional lines starting with white space.
3. The body is the content of the message from the sending user
to the receiving user. RFC 822 specifies the body as lines of NVT ASCII
text. When transferred using the DATA command, the headers are sent
first, followed by a blank line, followed by the body. Each line
transferred using the DATA command must be less than 1000 bytes.
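The layout described in items 2 and 3 (header fields, a blank line, then the body, with long header fields folded onto continuation lines) can be taken apart with a short sketch like the following. The function name parse_content and the sample header values are made up for illustration, and folded lines are joined with a single space, which is a simplification.

```python
def parse_content(content):
    """Split message content into (headers, body_lines), unfolding headers."""
    lines = content.split("\r\n")
    headers = []
    i = 0
    # Header lines run up to the first blank line.
    while i < len(lines) and lines[i] != "":
        line = lines[i]
        if line[0] in " \t" and headers:
            # A line starting with white space continues the previous field.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            # Field name, then a colon, then the field value.
            name, _, value = line.partition(":")
            headers.append((name, value.strip()))
        i += 1
    body_lines = lines[i + 1:]  # skip the blank separator line
    return headers, body_lines


if __name__ == "__main__":
    sample = (
        "Received: by relay.example.com\r\n"   # hypothetical folded header
        "\tid AA00001\r\n"
        "Subject: testing\r\n"
        "\r\n"
        "1, 2, 3."
    )
    print(parse_content(sample))
```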
The
user agent takes what we specify as the body, adds some headers, and
passes the result to the MTA. The MTA adds a few headers, adds the
envelope, and sends the result to another MTA.
The term content is often used to describe the combination of headers
and the body. The content is sent by the client with the DATA command.
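Going the other way, the content can be assembled from header fields and body lines. The sketch below is an assumption-laden illustration: build_content is a hypothetical helper, and it applies the 1000-byte limit to each line without its trailing CR LF, since the text above does not say whether the line terminator counts.

```python
MAX_LINE_BYTES = 1000  # each line sent with DATA must be shorter than this


def build_content(headers, body_lines):
    """Join headers, a blank line, and the body into NVT ASCII content."""
    lines = [f"{name}: {value}" for name, value in headers]
    lines.append("")            # blank line separates headers from body
    lines.extend(body_lines)
    for line in lines:
        # encode("ascii") raises UnicodeEncodeError (a ValueError) on non-ASCII text.
        if len(line.encode("ascii")) >= MAX_LINE_BYTES:
            raise ValueError("line too long for the DATA command")
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    content = build_content(
        [("From", "alice@example.com"), ("Subject", "testing")],  # hypothetical values
        ["1, 2, 3."],
    )
    print(repr(content))
```

The envelope (the MAIL and RCPT arguments) is deliberately absent here: it travels in separate commands, not in the content.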
Relay Agents
The
first line of informational output by a local MTA may be "Connecting to
mailhost via ether." This is because the system may have been
configured to send all nonlocal outgoing mail to a relay machine for
delivery.
This is done for two reasons. First, it simplifies the configuration
of all MTAs other than the relay system's MTA. (Configuring an MTA is
not simple, as anyone who has ever worked with Sendmail can attest to.)
Second, it allows one system at an organization to act as the mail hub,
possibly hiding all the individual systems.
If the host used as
the relay changes in the future, only its DNS name needs to change; the mail
configuration of the individual systems stays the same.
Most organizations are using relay systems today.
The
local MTA on the sender's host just delivers the mail to its relay MTA.
(This relay MTA could have a hostname of mailhost in the organization's
domain.) This communication uses SMTP across the organization's local
internet. The relay MTA in the sender's organization then sends the mail
to the receiving organization's relay MTA across the Internet. This
other relay MTA then delivers the mail to the receiver's host, by
communication with the local MTA on the receiver's host.
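The decision a local MTA makes under this arrangement is small enough to sketch. This is a simplified illustration, not how Sendmail is configured: the function next_hop is hypothetical, the relay name mailhost follows the example above, and an address without an @ is assumed to be local.

```python
def next_hop(recipient, local_domain, relay_host="mailhost"):
    """Decide where a local MTA hands a message: local delivery or the relay."""
    if "@" not in recipient:
        return "local"  # assumption: a bare user name is a local user
    domain = recipient.rpartition("@")[2].lower()
    if domain == local_domain.lower():
        return "local"
    # All nonlocal mail goes to the relay, which handles delivery from there.
    return relay_host


if __name__ == "__main__":
    print(next_hop("alice@example.com", "example.com"))
    print(next_hop("bob@example.org", "example.com"))
```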
NVT ASCII
One feature of SMTP is that it uses NVT ASCII for everything: the envelope, the headers, and the body.
Retry Intervals
When
a user agent passes a new mail message to its MTA, delivery is normally
attempted immediately. If the delivery fails, the MTA must queue the
message and try again later.
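A queued message needs a schedule of later attempts. The following sketch computes one under assumed parameters. The text above does not give a retry interval or a give-up period, so both values in the example are hypothetical, as is the function name retry_times.

```python
from datetime import datetime, timedelta


def retry_times(first_attempt, interval, give_up_after):
    """List the times at which a queued message would be retried."""
    times = []
    attempt = first_attempt + interval
    # Keep retrying at a fixed interval until the give-up period has passed.
    while attempt - first_attempt <= give_up_after:
        times.append(attempt)
        attempt += interval
    return times


if __name__ == "__main__":
    first = datetime(2012, 6, 20, 17, 0)
    # Hypothetical policy: retry every 30 minutes, give up after 2 hours.
    for when in retry_times(first, timedelta(minutes=30), timedelta(hours=2)):
        print(when)
```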
MX Records: Hosts Not Directly Connected to the Internet
In Chapter 14 we mentioned that one type of resource record in the DNS is
the mail exchange record, called an MX record. One use of MX records is to send
mail to a host that is not directly connected to the Internet but has an MX
record that points to a mail forwarder that is on the Internet.
MX Records: Hosts That Are Down
Another use of MX records is to provide an alternative mail receiver when the destination host is down.
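Both uses of MX records come down to the same selection step: try the mail exchangers in order of preference (lowest value first) and use the first one that can be reached. The sketch below takes the records as already-fetched (preference, host) pairs and takes reachability as a caller-supplied function; pick_mail_host, is_reachable, and the example.com hosts are all hypothetical, and no DNS lookup is performed.

```python
def pick_mail_host(mx_records, is_reachable):
    """Return the most preferred reachable mail exchanger, or None."""
    # Lower preference values are tried first.
    for _preference, host in sorted(mx_records):
        if is_reachable(host):
            return host
    return None  # nothing reachable: the message stays queued


if __name__ == "__main__":
    records = [(10, "backup.example.com"), (0, "mail.example.com")]
    up = {"backup.example.com"}  # pretend the primary host is down
    print(pick_mail_host(records, lambda host: host in up))
```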
VRFY and EXPN Commands
The VRFY command verifies that a recipient address is OK, without actually
sending mail. EXPN is intended to expand a mailing list, without sending
mail to the list. Many SMTP implementations (such as older versions of
Sendmail) consider the two the same, but newer versions of Sendmail do
differentiate between them.
SMTP Futures
Changes
are taking place with Internet mail. Recall the three pieces that
comprise Internet mail: the envelope, headers, and body. New SMTP
commands are being added that affect the envelope, non-ASCII characters
can be used in the headers, and structure is being added to the body
(MIME).
Envelope Changes: Extended SMTP
RFC
1425 [Klensin et al. 1993a] defines the framework for adding extensions
to SMTP. The result is called extended SMTP (ESMTP). As with other new
features that we've described in the text, these changes are being added
in a backward compatible manner, so that existing implementations
aren't affected.
A client that wishes to use the new features
initiates the session with the server by issuing an EHLO command, instead
of HELO. A compatible server responds with a 250 reply code. This reply
is normally multiline, with each line containing a keyword and an
optional argument. These keywords specify the SMTP extensions supported
by the server. New extensions will be described in an RFC and will be
registered with the IANA. (In a multiline reply all lines except the
last have a hyphen after the numeric reply code. The last line has a
space after the numeric reply code.)
A 250 reply line specifying that the SIZE keyword is supported can contain an optional argument, as in the last line of the following exchange:
ehlo sun.tuc.noao.edu
250-ymir.claremont.edu
250-8BITMIME
250-EXPN
250-HELP
250-XADR
250 SIZE 461544960
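A client can turn a multiline reply like this one into a table of supported extensions. The following is a minimal sketch under the rules given above (hyphen after the code on every line but the last, keyword plus optional argument on each line). The function name parse_ehlo_reply is made up, and the first line is assumed to carry the server's name, as it does in this example.

```python
def parse_ehlo_reply(lines):
    """Parse a multiline 250 reply to EHLO into (greeting, extensions)."""
    greeting = None
    extensions = {}
    for index, line in enumerate(lines):
        code, separator, rest = line[:3], line[3:4], line[4:]
        if code != "250":
            raise ValueError(f"unexpected reply: {line!r}")
        if index == 0:
            greeting = rest                     # first line names the server
        else:
            keyword, _, argument = rest.partition(" ")
            extensions[keyword.upper()] = argument or None
        if separator != "-":                    # a space marks the last line
            break
    return greeting, extensions


if __name__ == "__main__":
    reply = [
        "250-ymir.claremont.edu",
        "250-8BITMIME",
        "250-EXPN",
        "250-HELP",
        "250-XADR",
        "250 SIZE 461544960",
    ]
    print(parse_ehlo_reply(reply))
```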
Header Changes: Non-ASCII Characters
RFC
1522 [Moore 1993] specifies a way to send non-ASCII characters in RFC
822 message headers. The main use of this is to allow additional
characters in the sender and receiver names, and in the subject. The
header fields can contain encoded words. They have the following format:
=? charset ? encoding ? encoded-text ?=
charset
is the character set specification. Valid values are the two strings
us-ascii and iso-8859-X, where X is a single digit, as in iso-8859-1.
encoding is a single character to specify the encoding method. Two values are supported.
1. Q encoding means quoted-printable, and is intended for Latin character
sets. Most characters are sent as NVT ASCII (with the high-order bit set
to 0, of course). Any character to be sent whose eighth bit is set is
sent instead as three characters: first the character =, followed by two
hexadecimal digits. For example, the character é (whose 8-bit
value is 0xe9) is sent as the three characters =E9. Spaces are always
sent as either an underscore or the three characters =20. This encoding
is intended for text that is mostly ASCII, with a few special
characters.
2. B means base-64 encoding. Three consecutive bytes
of text (24 bits) are encoded as four 6-bit values. When the number of
characters to encode is not a multiple of three, equal signs are used as
the pad characters.
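The two encodings can be sketched as follows. This is an illustration rather than a complete implementation: the function names are made up, iso-8859-1 is simply the default chosen for the example, and the Q encoder also escapes the characters =, ? and _ (and control characters), which the description above does not mention but which would otherwise be ambiguous inside an encoded word. Line-length limits on encoded words are not handled.

```python
import base64


def encode_word_q(text, charset="iso-8859-1"):
    """Encode text as an encoded word using the Q encoding."""
    out = []
    for byte in text.encode(charset):
        ch = chr(byte)
        if ch == " ":
            out.append("_")                  # spaces are sent as an underscore
        elif byte < 33 or byte > 126 or ch in "=?_":
            out.append(f"={byte:02X}")       # = followed by two hex digits
        else:
            out.append(ch)                   # plain ASCII passes through
    return f"=?{charset}?Q?{''.join(out)}?="


def encode_word_b(text, charset="iso-8859-1"):
    """Encode text as an encoded word using the B (base-64) encoding."""
    encoded = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{encoded}?="


if __name__ == "__main__":
    print(encode_word_q("\u00e9"))   # the character with value 0xe9
    print(encode_word_b("\u00e9"))
```

Q keeps mostly-ASCII text readable in its encoded form, while B costs a fixed four characters per three bytes regardless of content.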
Body Changes: Multipurpose Internet Mail Extensions (MIME)
We've
said that RFC 822 specifies the body as lines of NVT ASCII text, with
no structure. RFC 1521 [Borenstein and Freed 1993] defines extensions
that allow structure in the body. This is called MIME, for Multipurpose
Internet Mail Extensions.
MIME does not require any of the extensions that we've described
previously in this section (extended SMTP or non-ASCII headers). MIME
just adds some new headers (in accordance with RFC 822) that tell the
recipient the structure of the body. The body can still be transmitted
using NVT ASCII, regardless of the mail contents. Some of the
extensions we've just described might be nice to have along with MIME
(the extended SMTP SIZE extension, since MIME messages can become
large, and non-ASCII headers), but they are not required by MIME.
All that's required to exchange MIME messages with another party is for
both ends to have a user agent that understands MIME. No changes are
required in any of the MTAs.
MIME adds five new header fields:
Mime-Version:
Content-Type:
Content-Transfer-Encoding:
Content-ID:
Content-Description:
As an example, the following two header lines can appear in an Internet mail message:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
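A MIME-aware user agent reads the Content-Type value to learn the structure of the body. The sketch below splits such a value into its type, subtype, and parameters. The function name parse_content_type is hypothetical, and the parser is deliberately simple: it does not handle a quoted parameter value that itself contains a semicolon.

```python
def parse_content_type(value):
    """Split a Content-Type value into (type, subtype, parameters)."""
    parts = [part.strip() for part in value.split(";")]
    main_type, _, subtype = parts[0].partition("/")
    params = {}
    for part in parts[1:]:
        if not part:
            continue
        name, _, param_value = part.partition("=")
        params[name.strip().lower()] = param_value.strip().strip('"')
    # Type and subtype names are not case sensitive, so normalize them.
    return main_type.lower(), subtype.lower(), params


if __name__ == "__main__":
    print(parse_content_type("TEXT/PLAIN; charset=US-ASCII"))
```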