Category: System Operations and Maintenance

2012-06-20 16:51:01

FTP is another commonly used application. It is the Internet standard for file transfer. We must be careful to differentiate between file transfer, which is what FTP provides, and file access, which is provided by applications such as NFS (Sun's Network File System). The file transfer provided by FTP copies a complete file from one system to another system. To use FTP we need an account to log in to on the server, or we need to use it with a server that allows anonymous FTP.

Like Telnet, FTP was designed from the start to work between different hosts, running different operating systems, using different file structures, and perhaps different character sets. Telnet, however, achieved heterogeneity by forcing both ends to deal with a single standard: the NVT using 7-bit ASCII. FTP handles all the differences between different systems using a different approach. FTP supports a limited number of file types (ASCII, binary, etc.) and file structures (byte stream or record oriented).

RFC 959 [Postel and Reynolds 1985] is the official specification for FTP. This RFC contains a history of the evolution of file transfer over the years.


FTP Protocol

FTP differs from the other applications that we've described because it uses two TCP connections to transfer a file.

1. The control connection is established in the normal client-server fashion. The server does a passive open on the well-known port for FTP (21) and waits for a client connection. The client does an active open to TCP port 21 to establish the control connection. The control connection stays up for the entire time that the client communicates with this server. This connection is used for commands from the client to the server and for the server's replies.

The IP type-of-service for the control connection should be "minimize delay" since the commands are normally typed by a human user.

2. A data connection is created each time a file is transferred between the client and server. (It is also created at other times, for example to send a directory listing.) The IP type-of-service for the data connection should be "maximize throughput" since this connection is for file transfer.


The interactive user normally doesn't deal with the commands and replies that are exchanged across the control connection. Those details are left to the two protocol interpreters. It is the two protocol interpreters that invoke the two data transfer functions, when necessary.

Data Representation

Numerous choices are provided in the FTP protocol specification to govern the way the file is transferred and stored. A choice must be made in each of four dimensions.

1. File type.

a. ASCII file type.

(Default) The text file is transferred across the data connection in NVT ASCII. This requires the sender to convert the local text file into NVT ASCII, and the receiver to convert NVT ASCII to the local text file type. The end of each line is transferred using the NVT ASCII representation of a carriage return, followed by a linefeed. This means the receiver must scan every byte, looking for the CR, LF pair.

b. EBCDIC file type.

An alternative way of transferring text files when both ends are EBCDIC systems.

c. Image file type. (Also called binary.)

The data is sent as a contiguous stream of bits. Normally used to transfer binary files.

d. Local file type.

A way of transferring binary files between hosts with different byte sizes. The number of bits per byte is specified by the sender. For systems using 8-bit bytes, a local file type with a byte size of 8 is equivalent to the image file type.


2. Format control. This choice is available only for ASCII and EBCDIC file types.

a. Nonprint.

(Default) The file contains no vertical format information.

b. Telnet format control.

The file contains Telnet vertical format controls for a printer to interpret.

c. Fortran carriage control.

The first character of each line is the Fortran format control character.


3. Structure.

a. File structure.

(Default) The file is considered as a contiguous stream of bytes. There is no internal file structure.

b. Record structure.

This structure is only used with text files (ASCII or EBCDIC).

c. Page structure.

Each page is transmitted with a page number to let the receiver store the pages in a random order. Provided by the TOPS-20 operating system. (The Host Requirements RFC recommends against implementing this structure.)


4. Transmission mode. This specifies how the file is transferred across the data connection.

a. Stream mode.
(Default) The file is transferred as a stream of bytes. For a file structure, the end-of-file is indicated by the sender closing the data connection. For a record structure, a special 2-byte sequence indicates the end-of-record and end-of-file.

b. Block mode.

The file is transferred as a series of blocks, each preceded by one or more header bytes.

c. Compressed mode.

A simple run-length encoding compresses consecutive appearances of the same byte. In a text file this would commonly compress strings of blanks, and in a binary file this would commonly compress strings of 0 bytes. (This is rarely used or supported. There are better ways to compress files for FTP.)
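The idea behind compressed mode can be sketched with a generic run-length encoder. This is only an illustration of the technique: the (count, byte) pairs and the function names `rle_encode` and `rle_decode` are assumptions made for the example, not the byte layout that the FTP specification defines for this mode.

```python
def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse runs of identical bytes into (run_length, byte_value) pairs."""
    runs: list[tuple[int, int]] = []
    for value in data:
        if runs and runs[-1][1] == value:
            # Same byte as the previous run: extend that run by one.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # A different byte starts a new run of length 1.
            runs.append((1, value))
    return runs


def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand (run_length, byte_value) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for count, value in runs)
```

A line such as `b"a    b"` (four blanks) becomes three runs, which shows why strings of blanks or 0 bytes benefit most.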


If we calculate the number of combinations of all these choices, there could be 72 different ways to transfer and store a file. Fortunately we can ignore many of the options, because they are either antiquated or not supported by most implementations.
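The figure of 72 can be reproduced by enumerating the choices listed above. The sketch below assumes the four dimensions are counted independently, that format control applies only to the ASCII and EBCDIC types, and that the local type counts as a single option regardless of byte size; the constant names are made up for the example.

```python
from itertools import product

FORMAT_CONTROLS = ["nonprint", "telnet", "fortran"]

# ASCII and EBCDIC each combine with three format controls (6 options);
# image and local take no format control (2 more options), giving 8.
FILE_TYPES = list(product(["ascii", "ebcdic"], FORMAT_CONTROLS)) + [
    ("image", None),
    ("local", None),
]
STRUCTURES = ["file", "record", "page"]
MODES = ["stream", "block", "compressed"]

ALL_COMBINATIONS = list(product(FILE_TYPES, STRUCTURES, MODES))
print(len(ALL_COMBINATIONS))  # 8 * 3 * 3 = 72
```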

Common Unix implementations of the FTP client and server restrict us to the following choices:

1. Type: ASCII or image.

2. Format control: nonprint only.

3. Structure: file structure only.

4. Transmission mode: stream mode only.


This limits us to one of two modes: ASCII or image (binary).

This implementation meets the minimum requirements of the Host Requirements RFC. (This RFC also requires support for the record structure, but only if the operating system supports it, which Unix doesn't.)


FTP Commands

The commands and replies sent across the control connection between the client and server are in NVT ASCII. This requires a CR, LF pair at the end of each line (i.e., each command or each reply).

The only Telnet commands (those that begin with IAC) that can be sent by the client to the server are interrupt process (<IAC, IP>) and the Telnet synch signal (<IAC, DM> in urgent mode). These two Telnet commands are used to abort a file transfer that is in progress, or to query the server while a transfer is in progress. Additionally, if the server receives a Telnet option command from the client (WILL, WONT, DO, or DONT) it responds with either DONT or WONT.

The commands are 3 or 4 bytes of uppercase ASCII characters, some with optional arguments. More than 30 different FTP commands can be sent by the client to the server.

Some of the commonly used commands:

Command                    Description
ABOR                       abort previous FTP command and any data transfer
LIST filelist              list files or directories
PASS password              password on server
PORT n1,n2,n3,n4,n5,n6     client IP address (n1.n2.n3.n4) and port (n5 x 256 + n6)
QUIT                       logoff from server
RETR filename              retrieve (get) a file
STOR filename              store (put) a file
SYST                       server returns system type
TYPE type                  specify file type: A for ASCII, I for image
USER username              username on server
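As a rough sketch of the wire format described above, the helpers below build one command line for the control connection and decode the six numbers of a PORT argument. The function names `format_command` and `parse_port_argument` are hypothetical, chosen only for this example.

```python
def format_command(verb: str, argument: str = "") -> bytes:
    """Build one control-connection line: uppercase verb, optional argument, CR LF."""
    line = verb.upper() if not argument else f"{verb.upper()} {argument}"
    return line.encode("ascii") + b"\r\n"


def parse_port_argument(argument: str) -> tuple[str, int]:
    """Decode 'n1,n2,n3,n4,n5,n6' into (dotted IP address, port number)."""
    fields = [int(n) for n in argument.split(",")]
    if len(fields) != 6 or any(not 0 <= n <= 255 for n in fields):
        raise ValueError(f"bad PORT argument: {argument!r}")
    address = ".".join(str(n) for n in fields[:4])
    port = fields[4] * 256 + fields[5]  # n5 x 256 + n6
    return address, port
```

For instance, `parse_port_argument("140,252,13,34,4,150")` gives the address 140.252.13.34 and port 4 x 256 + 150 = 1174.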

Sometimes there is a one-to-one correspondence between what the interactive user types and the FTP command sent across the control connection, but for some operations a single user command results in multiple FTP commands across the control connection.

FTP Replies

The replies are 3-digit numbers in ASCII, with an optional message following the number. The intent is that the software needs to look only at the number to determine how to process the reply, and the optional string is for human consumption. Since the clients normally output both the numeric reply and the message string, an interactive user can determine what the reply says by just reading the string (and not have to memorize what all the numeric reply codes mean).

Each of the three digits in the reply code has a different meaning. (Simple Mail Transfer Protocol, SMTP, uses the same conventions for commands and replies.)

Reply    Description
1yz      Positive preliminary reply. The action is being started but expect another reply before sending another command.
2yz      Positive completion reply. A new command can be sent.
3yz      Positive intermediate reply. The command has been accepted but another command must be sent.
4yz      Transient negative completion reply. The requested action did not take place, but the error condition is temporary so the command can be reissued later.
5yz      Permanent negative completion reply. The command was not accepted and should not be retried.

x0z      Syntax errors.
x1z      Information.
x2z      Connections. Replies referring to the control or data connections.
x3z      Authentication and accounting. Replies for the login or accounting commands.
x4z      Unspecified.
x5z      Filesystem status.

The third digit gives additional meaning to the error message. For example, here are some typical replies, along with a possible message string.
1. 125 Data connection already open; transfer starting.
2. 200 Command OK.
3. 214 Help message (for human user).
4. 331 Username OK, password required.
5. 425 Can't open data connection.
6. 452 Error writing file.
7. 500 Syntax error (unrecognized command).
8. 501 Syntax error (invalid arguments).
9. 502 Unimplemented MODE type.

Normally each FTP command generates a one-line reply.
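Since software only needs the number, a client can classify a reply from its first two digits. The following is a minimal sketch that handles one-line replies only; the name `classify_reply` and the dictionary layout are assumptions for illustration.

```python
FIRST_DIGIT = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}
SECOND_DIGIT = {
    "0": "syntax",
    "1": "information",
    "2": "connections",
    "3": "authentication and accounting",
    "4": "unspecified",
    "5": "filesystem status",
}


def classify_reply(line: str) -> tuple[int, str, str, str]:
    """Split a one-line reply into (code, first-digit class, second-digit group, message)."""
    code, message = line[:3], line[3:].strip()
    if len(code) != 3 or code[0] not in FIRST_DIGIT or code[1] not in SECOND_DIGIT:
        raise ValueError(f"not a recognized FTP reply: {line!r}")
    if not code[2].isdigit():
        raise ValueError(f"not a recognized FTP reply: {line!r}")
    return int(code), FIRST_DIGIT[code[0]], SECOND_DIGIT[code[1]], message
```

With this, "331 Username OK, password required." is reported as a positive intermediate reply in the authentication and accounting group, which tells the client to send the password next.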

Connection Management

There are three uses for the data connection.

1. Sending a file from the client to the server.
2. Sending a file from the server to the client.
3. Sending a listing of files or directories from the server to the client.

The FTP server sends file listings back across the data connection, rather than as multiline replies across the control connection. This avoids any line limits that restrict the size of a directory listing and makes it easier for the client to save the output of a directory listing into a file, instead of printing the listing to the terminal.

We've said that the control connection stays up for the duration of the client-server connection, but the data connection can come and go, as required. How are the port numbers chosen for the data connection, and who does the active open and passive open?

First, we said earlier that the common transmission mode (under Unix the only transmission mode) is the stream mode, and that the end-of-file is denoted by closing the data connection. This implies that a brand new data connection is required for every file transfer or directory listing. The normal procedure is as follows:

1. The creation of the data connection is under control of the client, because it's the client that issues the command that requires the data connection (get a file, put a file, or list a directory).

2. The client normally chooses an ephemeral port number on the client host for its end of the data connection. The client issues a passive open from this port.

3. The client sends this port number to the server across the control connection using the PORT command.

4. The server receives the port number on the control connection, and issues an active open to that port on the client host. The server's end of the data connection always uses port 20.
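Steps 2 and 3 of this procedure can be sketched from the client side as follows. This is an illustrative sketch, not a complete client: the function name `open_data_listener` is hypothetical, and the loopback address is used only so the example runs without a server. A real client would bind to the local address of its control connection.

```python
import socket


def open_data_listener(local_ip: str = "127.0.0.1") -> tuple[socket.socket, str]:
    """Passive open on an ephemeral port; return the socket and the PORT line to send."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((local_ip, 0))  # port 0 asks the kernel for an ephemeral port
    listener.listen(1)            # passive open: wait for the server's active open
    ip, port = listener.getsockname()
    # Encode the address and port as n1,n2,n3,n4,n5,n6 with port = n5 x 256 + n6.
    fields = ip.split(".") + [str(port // 256), str(port % 256)]
    return listener, "PORT " + ",".join(fields)


listener, port_command = open_data_listener()
print(port_command)  # the last two numbers vary with the port the kernel picks
listener.close()
```

After sending this line on the control connection, the client would call `accept()` on the listening socket to receive the server's connection.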


Connection Management: Default Data Port

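If the client does not send a PORT command to the server to specify the port number for the client's end of the data connection, the server defaults to the client port number that is being used for the control connection. This can cause problems for clients that use the stream mode (which the Unix FTP clients and server always use).

In stream mode the data connection is closed at the end of every transfer, and the end that performs the active close goes through the TCP 2MSL wait. Without a PORT command, every data connection would use the same pair of addresses and port numbers, so a new data connection could not be opened until that wait had expired. The reason the Host Requirements RFC recommends using the PORT command is to avoid this 2MSL wait between successive uses of a data connection.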

Text File Transfer: NVT ASCII Representation or Image?

When both hosts store text files in the same way, text files can be transferred in binary (image file type) instead of ASCII. This helps in two ways.

1. The sender and receiver don't have to look at every byte (a big savings).

2. Fewer bytes are transferred if the host operating system uses fewer bytes for the end-of-line than the 2-byte NVT ASCII sequence (a smaller savings).
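Both savings are visible in a small sketch of the conversion that the ASCII file type requires. It assumes a Unix-style local format in which each line ends with a single LF, and it ignores other details of NVT ASCII; the function names are made up for the example.

```python
def to_nvt_ascii(local_text: bytes) -> bytes:
    """Sender side: replace each local LF line ending with the CR, LF pair."""
    return local_text.replace(b"\n", b"\r\n")


def from_nvt_ascii(wire_text: bytes) -> bytes:
    """Receiver side: scan for CR, LF pairs and store each as a single LF."""
    return wire_text.replace(b"\r\n", b"\n")


local = b"line one\nline two\n"
wire = to_nvt_ascii(local)
print(len(local), len(wire))  # the wire form is one byte longer per line
```

In image mode neither function is called, so no byte is examined and the extra byte per line is never sent.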


Newer Unix clients automatically send a command to see if the server is an 8-bit byte Unix host, and if so, use binary mode for all file transfers, which is more efficient.
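A client-side check of this kind might look like the sketch below. It assumes the command in question is SYST and that an 8-bit byte Unix server answers with a reply of the form "215 UNIX Type: L8"; the function names are hypothetical, and a real client would also need to be such a host itself.

```python
def prefers_binary(syst_reply: str) -> bool:
    """Return True if a SYST reply looks like '215 UNIX Type: L8'."""
    code, _, text = syst_reply.strip().partition(" ")
    return code == "215" and text.upper().startswith("UNIX TYPE: L8")


def transfer_type_command(syst_reply: str) -> str:
    """Choose image type for a matching server, otherwise stay with ASCII."""
    return "TYPE I" if prefers_binary(syst_reply) else "TYPE A"
```

Falling back to "TYPE A" keeps the default behavior whenever the reply is an error or names a different system.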


Anonymous FTP

Anonymous FTP, when supported by a server, allows anyone to log in and use FTP to transfer files.
