TCP/IP Protocols (28) -- SMTP: Simple Mail Transfer Protocol - yourtommy - ChinaUnix blog

Category: System Operations

2012-06-20 17:43:13

Electronic mail (e-mail) is undoubtedly one of the most popular applications. [Caceres et al. 1991] shows that about one-half of all TCP connections are for the Simple Mail Transfer Protocol, SMTP. (On a byte count basis, FTP connections carry more data.) [Paxson 1993] found that the average mail message contains around 1500 bytes of data, but some messages contain megabytes of data, because electronic mail is sometimes used to send files.

Users deal with a user agent, of which there are a multitude to choose from. Popular user agents for Unix include MH, Berkeley Mail, Elm, and Mush.

The exchange of mail using TCP is performed by a message transfer agent (MTA). The most common MTA for Unix systems is Sendmail. Users normally don't deal with the MTA. It is the responsibility of the system administrator to set up the local MTA. Users often have a choice, however, for their user agent.

SMTP Protocol

The communication between the two MTAs uses NVT ASCII. Commands are sent by the client to the server, and the server responds with numeric reply codes and optional human-readable strings. This is similar to what we saw with FTP.

The client can send only a small number of commands to the server: fewer than a dozen. (By comparison, FTP has more than 40 commands.)

SMTP Commands

The client identifies itself with the HELO command. The argument must be the fully qualified domain name of the client host like sun.tuc.noao.edu.

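The MAIL command identifies the originator of the message. Its argument is the sender's address in angle brackets, e.g. MAIL From:<sender-address>.

The RCPT command identifies the recipient, e.g. RCPT To:<recipient-address>. More than one RCPT command can be issued if there are multiple recipients.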

The contents of the mail message are sent by the client using the DATA command. The end of the message is specified by the client sending a line containing just a period.

QUIT terminates the mail exchange.

The RSET command aborts the current mail transaction and causes both ends to reset. Any stored information about sender, recipients, or mail data is discarded.

The VRFY command lets the client ask the server to verify a recipient address, without sending mail to the recipient. It's often used by a system administrator, by hand, for debugging mail delivery problems.

The NOOP command does nothing besides force the server to respond with an OK reply code (250 in RFC 821).

There are additional, optional commands. EXPN expands a mailing list, and is often used by the system administrator, similar to VRFY. Indeed, most versions of Sendmail handle the two identically.

The TURN command lets the client and server switch roles, to send mail in the reverse direction, without having to take down the TCP connection and create a new one. (Sendmail does not support this command.) There are three other commands (SEND, SOML, and SAML), which are rarely implemented, that replace the MAIL command. These three allow combinations of the mail being delivered directly to the user's terminal (if logged in), or sent to the recipient's mailbox.
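The basic commands above (HELO, MAIL, RCPT, DATA, QUIT) can be put together as a rough sketch of what a client sends for one mail transaction. The function name smtp_commands and the example.com addresses are made up for illustration. A real client also waits for the server's reply after each command, terminates every line with CR LF, and has to deal with content lines that begin with a period; none of that is handled here.

```python
def smtp_commands(client_host, sender, recipients, content_lines):
    """Return the lines a client would send for one mail transaction.

    Illustrative only: server replies between commands are ignored,
    and content lines beginning with a period are not handled.
    """
    if not recipients:
        raise ValueError("at least one recipient is required")
    lines = [
        f"HELO {client_host}",        # client identifies itself
        f"MAIL From:<{sender}>",      # originator of the message
    ]
    # One RCPT command per recipient.
    lines += [f"RCPT To:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")              # the content follows
    lines += list(content_lines)
    lines.append(".")                 # a line with just a period ends the message
    lines.append("QUIT")              # terminate the mail exchange
    return lines


if __name__ == "__main__":
    for line in smtp_commands(
        "client.example.com",
        "alice@example.com",
        ["bob@example.org", "carol@example.org"],
        ["Subject: test", "", "Hello."],
    ):
        print(line)
```

With two recipients the sketch emits two RCPT commands between MAIL and DATA.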

Envelopes, Headers, and Body

Electronic mail is composed of three pieces.
1. The envelope is used by the MTAs for delivery. In our example the envelope was specified by the two SMTP commands:
MAIL From:
RCPT To:
RFC 821 specifies the contents and interpretation of the envelope, and the protocol used to exchange mail across a TCP connection.

2. Headers are used by the user agents. We saw nine header fields in our example: Received, Message-Id, From, Date, Reply-To, X-Phone, X-Mailer, To, and Subject. Each header field contains a name, followed by a colon, followed by the field value. RFC 822 specifies the format and interpretation of the header fields. (Headers beginning with an X- are user-defined fields. The others are defined by RFC 822.) Long header fields, such as Received in the example, are folded onto multiple lines, with the additional lines starting with white space.

3. The body is the content of the message from the sending user to the receiving user. RFC 822 specifies the body as lines of NVT ASCII text. When transferred using the DATA command, the headers are sent first, followed by a blank line, followed by the body. Each line transferred using the DATA command must be less than 1000 bytes.

The user agent takes what we specify as the body, adds some headers, and passes the result to the MTA. The MTA adds a few headers, adds the envelope, and sends the result to another MTA.

The term content is often used to describe the combination of headers and the body. The content is sent by the client with the DATA command.
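As a small sketch of how the content is laid out, the function below joins header fields and body lines in the order described above: headers first, then a blank line, then the body. The name build_content and the sample header values are hypothetical. The length check counts the trailing CR LF toward the 1000-byte limit, which is an assumption about how the limit is measured.

```python
def build_content(headers, body_lines, max_line_bytes=1000):
    """Return the lines sent after DATA: headers, blank line, body."""
    lines = [f"{name}: {value}" for name, value in headers]
    lines.append("")                  # blank line separates headers from body
    lines.extend(body_lines)
    for line in lines:
        # NVT ASCII only: encode() raises an error on non-ASCII characters.
        size = len(line.encode("ascii")) + 2   # assumed: +2 for the CR LF
        if size > max_line_bytes:
            raise ValueError(f"line of {size} bytes exceeds the limit")
    return lines


if __name__ == "__main__":
    content = build_content(
        [("From", "alice@example.com"), ("Subject", "Test")],
        ["Hello."],
    )
    print("\n".join(content))
```

The envelope is not part of this output; it travels separately in the MAIL and RCPT commands.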

Relay Agents

The first line of informational output by a local MTA may be "Connecting to mailhost via ether." This is because the system may have been configured to send all nonlocal outgoing mail to a relay machine for delivery.

This is done for two reasons. First, it simplifies the configuration of all MTAs other than the relay system's MTA. (Configuring an MTA is not simple, as anyone who has ever worked with Sendmail can attest to.) Second, it allows one system at an organization to act as the mail hub, possibly hiding all the individual systems.

If the host used as the relay changes in the future, only its DNS name needs to change; the mail configuration of the individual systems stays the same.

Most organizations are using relay systems today.

The local MTA on the sender's host just delivers the mail to its relay MTA. (This relay MTA could have a hostname of mailhost in the organization's domain.) This communication uses SMTP across the organization's local internet. The relay MTA in the sender's organization then sends the mail to the receiving organization's relay MTA across the Internet. This other relay MTA then delivers the mail to the receiver's host, by communication with the local MTA on the receiver's host.

NVT ASCII

One feature of SMTP is that it uses NVT ASCII for everything: the envelope, the headers, and the body.

Retry Intervals

When a user agent passes a new mail message to its MTA, delivery is normally attempted immediately. If the delivery fails, the MTA must queue the message and try again later.

MX Records: Hosts Not Directly Connected to the Internet

In Chapter 14 we mentioned that one type of resource record in the DNS is the mail exchange record, called an MX record. One use of MX records is to send mail to a host that is not directly connected to the Internet but has an MX record pointing to a mail forwarder that is on the Internet.

MX Records: Hosts That Are Down

Another use of MX records is to provide an alternative mail receiver when the destination host is down.
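The fallback behavior can be sketched as follows. This relies on a DNS detail not spelled out above: each MX record carries a preference value, and the sending MTA tries the hosts in order of increasing preference. The function names, the host names, and the try_host callback (standing in for a real SMTP delivery attempt) are all hypothetical.

```python
def delivery_order(mx_records):
    """Return mail hosts sorted by MX preference, lowest value first.

    mx_records is a list of (preference, hostname) tuples.
    """
    return [host for _, host in sorted(mx_records)]


def deliver(mx_records, try_host):
    """Try each host in preference order; return the first that accepts.

    try_host(hostname) is a stand-in for a real delivery attempt and
    returns True on success. Returns None if every host fails, in
    which case the MTA would queue the message and retry later.
    """
    for host in delivery_order(mx_records):
        if try_host(host):
            return host
    return None


if __name__ == "__main__":
    records = [(20, "backup.example.net"), (10, "mail.example.com")]
    # Pretend the primary host is down and only the backup answers.
    print(deliver(records, lambda host: host == "backup.example.net"))
```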

VRFY and EXPN Commands

The VRFY command verifies that a recipient address is OK, without actually sending mail. EXPN is intended to expand a mailing list, without sending mail to the list. Many SMTP implementations (such as Sendmail) consider the two the same, but newer versions of Sendmail do differentiate between the two.

SMTP Futures

Changes are taking place with Internet mail. Recall the three pieces that comprise Internet mail: the envelope, headers, and body. New SMTP commands are being added that affect the envelope, non-ASCII characters can be used in the headers, and structure is being added to the body (MIME).

Envelope Changes: Extended SMTP

RFC 1425 [Klensin et al. 1993a] defines the framework for adding extensions to SMTP. The result is called extended SMTP (ESMTP). As with other new features that we've described in the text, these changes are being added in a backward compatible manner, so that existing implementations aren't affected.

A client that wishes to use the new features initiates the session with the server by issuing an EHLO command, instead of HELO. A compatible server responds with a 250 reply code. This reply is normally multiline, with each line containing a keyword and an optional argument. These keywords specify the SMTP extensions supported by the server. New extensions will be described in an RFC and will be registered with the IANA. (In a multiline reply all lines except the last have a hyphen after the numeric reply code. The last line has a space after the numeric reply code.)

A 250 reply line announcing the SIZE keyword carries an optional argument: the maximum message size, in bytes, that the server accepts. For example:

ehlo sun.tuc.noao.edu
250-ymir.claremont.edu
250-8BITMIME
250-EXPN
250-HELP
250-XADR
250 SIZE 461544960
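A client could turn such a reply into a table of supported extensions along these lines. This is only a sketch: the function name parse_ehlo_reply is made up, and it assumes the reply has already been split into lines with the CR LF removed. It relies on the hyphen/space rule described above to check that the reply is well formed.

```python
def parse_ehlo_reply(lines):
    """Parse a 250 reply to EHLO into (greeting, {keyword: argument}).

    Keywords without an argument map to None.
    """
    greeting = None
    extensions = {}
    for i, line in enumerate(lines):
        code, sep, text = line[:3], line[3:4], line[4:]
        if code != "250":
            raise ValueError(f"unexpected reply code in {line!r}")
        # Every line but the last has a hyphen after the code;
        # the last line has a space.
        expected = " " if i == len(lines) - 1 else "-"
        if sep != expected:
            raise ValueError(f"malformed reply line {line!r}")
        if i == 0:
            greeting = text           # first line carries the server's name
            continue
        keyword, _, argument = text.partition(" ")
        extensions[keyword.upper()] = argument or None
    return greeting, extensions


if __name__ == "__main__":
    reply = [
        "250-ymir.claremont.edu",
        "250-8BITMIME",
        "250-EXPN",
        "250-HELP",
        "250-XADR",
        "250 SIZE 461544960",
    ]
    print(parse_ehlo_reply(reply))
```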

Header Changes: Non-ASCII Characters

RFC 1522 [Moore 1993] specifies a way to send non-ASCII characters in RFC 822 message headers. The main use of this is to allow additional characters in the sender and receiver names, and in the subject. The header fields can contain encoded words. They have the following format:
=? charset ? encoding ? encoded-text ?=
charset is the character set specification. Valid values are the two strings us-ascii and iso-8859-X, where X is a single digit, as in iso-8859-1.

encoding is a single character to specify the encoding method. Two values are supported.
1. Q encoding means quoted-printable, and is intended for Latin character sets. Most characters are sent as NVT ASCII (with the high-order bit set to 0, of course). Any character to be sent whose eighth bit is set is sent instead as three characters: first the character =, followed by two hexadecimal digits. For example, the character é (whose 8-bit value is 0xE9) is sent as the three characters =E9. Spaces are always sent as either an underscore or the three characters =20. This encoding is intended for text that is mostly ASCII, with a few special characters.

2. B means base-64 encoding. Three consecutive bytes of text (24 bits) are encoded as four 6-bit values. When the number of characters to encode is not a multiple of three, equal signs are used as the pad characters.
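Both encodings can be sketched in a few lines. The function names q_encode and b_encode are made up, and the Q encoder is deliberately conservative: it passes only ASCII letters and digits through unchanged and hex-encodes everything else (including punctuation such as = and ?), which is stricter than necessary but always safe. A real header would also have to respect length limits on encoded words, which this sketch ignores.

```python
import base64


def q_encode(text, charset="iso-8859-1"):
    """Build an encoded word using the Q encoding."""
    out = []
    for byte in text.encode(charset):
        ch = chr(byte)
        if ch == " ":
            out.append("_")                  # space becomes an underscore
        elif byte < 0x80 and ch.isalnum():
            out.append(ch)                   # plain ASCII letter or digit
        else:
            out.append(f"={byte:02X}")       # '=' plus two hex digits
    return f"=?{charset}?Q?{''.join(out)}?="


def b_encode(text, charset="iso-8859-1"):
    """Build an encoded word using the B (base-64) encoding."""
    # b64encode maps each 3 input bytes to 4 characters and pads with '='.
    encoded = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{encoded}?="


if __name__ == "__main__":
    print(q_encode("Caf\xe9 menu"))
    print(b_encode("Caf\xe9 menu"))
```

Q keeps mostly-ASCII text readable in the raw header, while B is more compact when many characters need encoding.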

Body Changes: Multipurpose Internet Mail Extensions (MIME)

We've said that RFC 822 specifies the body as lines of NVT ASCII text, with no structure. RFC 1521 [Borenstein and Freed 1993] defines extensions that allow structure in the body. This is called MIME, for Multipurpose Internet Mail Extensions.

MIME does not require any of the extensions that we've described previously in this section (extended SMTP or non-ASCII headers). MIME just adds some new headers (in accordance with RFC 822) that tell the recipient the structure of the body. The body can still be transmitted using NVT ASCII, regardless of the mail contents. Some of the extensions we've just described might be nice to have along with MIME (the extended SMTP SIZE keyword, since MIME messages can become large, and non-ASCII headers), but they are not required by MIME. All that's required to exchange MIME messages with another party is for both ends to have a user agent that understands MIME. No changes are required in any of the MTAs.

RFC 1521 defines five new header fields:
Mime-Version:
Content-Type:
Content-Transfer-Encoding:
Content-ID:
Content-Description:

As an example, the following two header lines can appear in an Internet mail message:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
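A receiving user agent reads these headers to decide how to present the body. As a sketch, Python's standard email package can parse the two header lines above; the helper name describe_body and the one-line body are made up for the example. Note that the library reports the content type and charset in lowercase.

```python
from email import message_from_string


def describe_body(raw_message):
    """Return (MIME version, content type, charset) of a raw message."""
    msg = message_from_string(raw_message)
    return (
        msg["Mime-Version"],          # header lookup is case-insensitive
        msg.get_content_type(),       # lowercased type/subtype
        msg.get_content_charset(),    # lowercased charset parameter
    )


if __name__ == "__main__":
    raw = (
        "Mime-Version: 1.0\n"
        "Content-Type: TEXT/PLAIN; charset=US-ASCII\n"
        "\n"
        "Hello.\n"
    )
    print(describe_body(raw))
```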
