Category: Servers and Storage
2009-04-14 16:41:39
The visible records that make up the RP66 logical format may be recorded on any number of media, such as magnetic tape, random access files, communication I/O streams, and so forth. Each medium type has a basic storage unit that is identifiable and manageable by people who use the medium. For non-partitionable magnetic tape it is the tape cassette or tape reel. For partitionable magnetic tape it is the tape partition. For random access files it is the file. For communication streams it is the communications session.
The door is left open here on how medium types are identified. A database may be considered distinct from a random access file, allowing a storage unit to be a database (and its visible records the records of the database). The key requirement is that a medium type be readily identifiable by a person using an application that reads RP66 data so that the type can be provided to the application before the storage unit is opened by the application.
Almost always the capacity of a storage unit and the size of a logical file are different. When the storage unit is larger, it is possible to record one or more logical files on a single storage unit. When the logical file is larger, however, more than one storage unit is needed. The one or more storage units a logical file spans constitute part of a storage set.
To identify a storage unit, including its membership and position in a storage set, and to identify the logical format edition of the data recorded in the storage unit, a storage unit label is recorded at the beginning of every storage unit. The storage unit label consists of fixed-format textual information.
A storage set may consist of one storage unit or many storage units. It is particularly important to have a way to identify and sequentially order multiple storage units spanned by a single logical file, and this is one of the purposes of the storage unit label. Another purpose is to provide human-readable information about the storage unit that can be presented using common utilities having no knowledge of RP66.
Each common medium type is governed by industry standards designed to support one or a few specific application data access models that can be presented by the I/O subsystems of different computers. The descriptions of physical bindings provided in this part are given in terms of the data access models rather than in terms of low-level physical implementation details. For each common medium type, a description of the selected data access model is given as medium characteristics, followed by a description of what a storage unit is and how visible records and storage unit labels are recorded on storage units.
The physical binding requirements that are independent of any medium type are specified in 4 and 5. These requirements use the definitions provided in 2.
An access method is said to be record-oriented if each read operation returns a physical record, namely a sequence of data bytes beginning at the current position on the medium, where the number of bytes is determined by the I/O subsystem from an examination of the medium. The reader discovers how many bytes have been read after the read operation completes. Correspondingly, each write operation writes a single physical record of data. Variable-length record access occurs when the writer may specify a different number of bytes for each physical record. Fixed-length record access occurs when the writer specifies a fixed record size once, and every physical record written has this same number of bytes. Some medium standards predefine the available size or sizes of the fixed record.
An access method is byte-oriented if each read operation returns the number of bytes specified by the reader beginning at the current position on the medium. Correspondingly, writers specify how many bytes to write for each write operation.
Some I/O subsystems support multiple access methods for the same medium type. The intent of this standard is that the medium type of the storage unit shall uniquely determine its physical binding. Consequently, only one physical binding per medium type shall be specified here, and a physical binding must use exactly one access method.
Of the data presented to an application by the I/O subsystem, the first data shall be a storage unit label.
For record-oriented access, the first physical record shall contain the storage unit label at its beginning. This record may be longer than the storage unit label; however, any data in this record past the storage unit label has no meaning and shall be ignored.
The remaining data presented to an application consists of a sequence of one or more visible records.
For record-oriented access, each visible record coincides with one physical record so that the length of the visible record is the same as the length of the physical record.
All visible records must be complete except for the last, which may be incomplete. If incomplete, it shall be assumed to be a failed attempt to write a complete visible record and may be ignored.
An incomplete visible record is detected by the fact that its actual size is smaller than the size declared in its header. An incomplete visible record may occur, for example, because a transmission link failed while writing a record. This is detectable by writers, and the failed visible record should be recorded in full on the next storage unit in the storage set (e.g., in another session) if one is used. On the other hand, the actual size of a visible record may differ from the size in its header because of media defects (i.e., the header is corrupted). If not at the end of the storage unit, this is detectable as a media defect, and the application must try to recover in the best way it can. If a media defect occurs at the end, an ambiguous situation arises that the standard does not address.
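The rules above for complete, incomplete, and defective visible records can be sketched as a small decision function. This is only an illustration: the function name and the returned strings are hypothetical, and the caller is assumed to have already extracted the declared size from the visible record header (whose layout is defined in Part 2 and is not reproduced here).

```python
def classify_visible_record(declared_size, actual_size, is_last):
    """Classify one visible record from its declared and actual sizes.

    declared_size: size taken from the visible record header (by the caller).
    actual_size:   number of bytes actually available for this record.
    is_last:       True if this is the last record in the storage unit.
    """
    if actual_size == declared_size:
        return "complete"
    if is_last and actual_size < declared_size:
        # Treated as a failed write that may be ignored. A media defect at
        # the end of the unit looks the same; the standard leaves that case
        # unaddressed.
        return "incomplete"
    # A size mismatch anywhere else points to a media defect.
    return "media defect"
```

A reader would typically skip an "incomplete" record and attempt application-specific recovery for a "media defect".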
A physical binding shall specify a mechanism to indicate no more visible records in the storage unit.
For example, on magnetic tape the "no more visible records" mechanism is EOD.
Additional storage set requirements are stated in 6.
A storage set shall contain one or more logical files and consist of one or more storage units.
The first logical record segment in a storage set (i.e., in the first visible record of the first storage unit) shall be the first segment of a logical file (see Part 2).
That is, although a storage unit may begin in the middle of a logical file, a storage set may not.
A logical file may be contained in one or more storage units in a storage set. The part of a logical file contained in a single storage unit is called a logical file section. The storage units in a storage set are numbered sequentially (see Table 1). The sequential order of logical file sections shall correspond to the sequential order of the storage units containing them.
That is, if section 1 starts in storage unit 3, then section 2 is in storage unit 4, section 3 in storage unit 5, and so on. If two logical files have sections in the same storage unit, then for one logical file it must be the last section and for the other logical file it must be the first section. Note there is no restriction on mixing medium types in a storage set. Although most storage sets will consist of the same medium types (e.g., all standard tapes or all disk files), it is legal to mix and match, provided the rules on structure hold.
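The ordering rule for logical file sections can be checked mechanically. The sketch below assumes a hypothetical catalogue of `(logical_file, section_number, storage_unit)` tuples built by the application; neither the tuple layout nor the function name comes from the standard.

```python
from collections import defaultdict


def sections_are_ordered(sections):
    """Return True if every logical file has sections 1, 2, 3, ... stored
    in consecutive storage units, one section per storage unit.

    sections: iterable of (logical_file, section_number, storage_unit).
    """
    by_file = defaultdict(list)
    for logical_file, section, unit in sections:
        by_file[logical_file].append((section, unit))

    for entries in by_file.values():
        entries.sort()
        first_unit = entries[0][1]
        for offset, (section, unit) in enumerate(entries):
            # Section k must be numbered 1 + offset and must sit exactly
            # offset units after the unit holding section 1.
            if section != 1 + offset or unit != first_unit + offset:
                return False
    return True
```

For example, sections 1 to 3 of one file in storage units 3, 4, and 5 pass the check, while sections in units 3 and 5 do not.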
The storage unit label consists of 128 bytes encoded using characters of the ISO 8859-1 character set. Table 1 describes the fields of the storage unit label.
Table 1 - Storage Unit Label Fields
Note | Field | Size in Bytes |
1 | Storage unit sequence number | 4 |
2 | RP66 version and format edition | 5 |
3 | Storage unit structure | 6 |
4 | Binding edition | 4 |
5 | Maximum visible record length | 10 |
6 | Producer organization code | 10 |
7 | Creation date | 11 |
8 | Serial number | 12 |
9 | reserved | 6 |
10 | Storage set identifier | 60 |
Storage unit sequence number is an integer in the range 1 to 9999 that indicates the order in which the current storage unit occurs in a storage set. The first storage unit of a storage set has sequence number 1, the second 2, and so on. This number is represented using the characters 0 to 9, right justified with leading blanks if needed to fill out the field (no leading zeros). The rightmost character is in byte 4 of the label. A valid value shall be recorded.
RP66 version and format edition consists of the three characters 'V2.' representing the current version of this standard, followed by the edition of the logical format (see Part 2) in the range 01 to 99. The logical format edition is represented using the characters 0 to 9, right justified with a leading zero for numbers less than 10. The character V is in byte 5 of the label. All logical files in the storage unit adhere to RP66 version 2 and to the same or earlier logical format edition. A valid value shall be recorded.
The logical format edition is recorded using the form 01, 02, etc. for consistency with its RP66 version 1 form. Otherwise, leading zeros in numeric values are generally avoided.
Storage unit structure is a name indicating the visible record structure of the storage unit. This name is left-justified with trailing blanks if needed to fill out the field. The leftmost character is in byte 10 of the label. All storage units in the same storage set shall have the same storage unit structure. Options are listed in Table 2. A valid value shall be recorded.
Binding edition is the character B in byte 16 of the label followed by a positive integer in the range 1 to 999 (no leading zeros), left justified with trailing blanks if needed to fill out the field. The integer value corresponds to the edition of this part (Part 3) of the document that describes the physical binding of the logical format to the storage unit. A valid value shall be recorded.
Maximum visible record length is an integer in the range 0 to 4 294 967 294 (2^32 - 2) indicating the maximum visible record length for the storage unit, or 0 (zero) if undeclared. This number is represented using the characters 0 to 9, right justified, with leading blanks if necessary to fill out the field (no leading zeros). The rightmost character is byte 29 of the label. A valid value or 0 (zero) shall be recorded.
Producer organization code is an integer in the range 0 to 4 294 967 295 (2^32 - 1) indicating the organization code (see Appendix A) of the storage unit producer. This number is represented using the characters 0 to 9, right justified, with leading blanks if necessary to fill out the field (no leading zeros). The rightmost character is byte 39 of the label. This field may be empty, i.e., may contain all blanks, in which case no storage unit producer is specified.
Creation date is the earliest date that any current information was recorded on the storage unit. The date is represented in the form dd-MMM-yyyy, where yyyy is the year (e.g., 1994), MMM is one of {JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC}, and dd is the day of the month in the range 1 to 31. Days 1 to 9 have one leading blank. The separator is a hyphen (code 45 decimal). This field may be empty, i.e., may contain all blanks, in which case no creation date is specified.
Serial number is an ID used to distinguish the storage unit from other storage units in an archive of an enterprise. The specification and management of serial numbers is delegated to organizations using this standard. This field may be empty, i.e., may contain all blanks, in which case no serial number is specified.
This field is reserved and should be recorded as all blanks (code 32 decimal).
Storage set identifier is a descriptive name for the storage set. Every storage unit in the same storage set shall have the same value for the storage set identifier in its storage unit label. A value may have embedded blanks and is non-blank if at least one character is different from blank (code 32 decimal). This field is intended to distinguish the storage set from other storage sets, but is not required to be unique. A non-blank value shall be recorded.
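The field layout in Table 1 lends itself to a simple fixed-width encoder and decoder. The following is a minimal sketch rather than a reference implementation: the names `build_label`, `parse_label`, and `Label` are hypothetical, range checks on individual values are omitted, and left-justifying the serial number and storage set identifier is an assumption, since the text does not state their justification.

```python
from collections import namedtuple
from datetime import date

# (field name, size in bytes) in label order, taken from Table 1.
FIELDS = [
    ("sequence_number", 4), ("version_edition", 5), ("structure", 6),
    ("binding_edition", 4), ("max_record_length", 10), ("producer_code", 10),
    ("creation_date", 11), ("serial_number", 12), ("reserved", 6),
    ("storage_set_id", 60),
]
LABEL_SIZE = sum(size for _, size in FIELDS)  # 128 bytes
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
Label = namedtuple("Label", [name for name, _ in FIELDS])


def build_label(sequence_number, edition, structure, binding_edition,
                max_record_length, storage_set_id,
                producer_code=None, creation_date=None, serial_number=""):
    """Return the 128-byte storage unit label encoded as ISO 8859-1."""
    date_text = ""
    if creation_date is not None:
        # dd-MMM-yyyy, with a leading blank for days 1 to 9.
        date_text = "{:>2d}-{}-{:04d}".format(
            creation_date.day, MONTHS[creation_date.month - 1],
            creation_date.year)
    producer_text = "" if producer_code is None else str(producer_code)
    text = "".join([
        "{:>4d}".format(sequence_number),           # right justified
        "V2.{:02d}".format(edition),                # leading zero kept
        "{:<6}".format(structure),                  # left justified
        "{:<4}".format("B" + str(binding_edition)),
        "{:>10d}".format(max_record_length),
        "{:>10}".format(producer_text),             # all blanks if unknown
        "{:<11}".format(date_text),
        "{:<12}".format(serial_number),             # justification assumed
        " " * 6,                                    # reserved field
        "{:<60}".format(storage_set_id),            # justification assumed
    ])
    if len(text) != LABEL_SIZE or not storage_set_id.strip(" "):
        raise ValueError("field value does not fit the label layout")
    return text.encode("latin-1")


def parse_label(data):
    """Split the first 128 bytes into fields and convert the numeric ones."""
    if len(data) < LABEL_SIZE:
        raise ValueError("need at least 128 bytes")
    text = data[:LABEL_SIZE].decode("latin-1")
    values, pos = [], 0
    for _, size in FIELDS:
        values.append(text[pos:pos + size])
        pos += size
    raw = Label(*values)
    if not raw.version_edition.startswith("V2."):
        raise ValueError("not an RP66 version 2 label")
    producer = raw.producer_code.strip(" ")
    return raw._replace(
        sequence_number=int(raw.sequence_number),
        structure=raw.structure.rstrip(" "),
        binding_edition=int(raw.binding_edition[1:]),
        max_record_length=int(raw.max_record_length),
        producer_code=int(producer) if producer else None,
        storage_set_id=raw.storage_set_id.rstrip(" "),
    )


label = build_label(1, 1, "RECORD", 1, 8192, "Example storage set",
                    creation_date=date(1994, 3, 5))
fields = parse_label(label)
```

Because the label is plain ISO 8859-1 text, the same 128 bytes can also be inspected with any text viewer, which matches the stated goal of readability by utilities that know nothing about RP66.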
Although storage unit structure describes visible records, it is considered to be a storage unit attribute (hence is in the storage unit label) since it describes a feature of all visible records in the storage unit. Table 2 describes the storage unit structure options currently defined.
Table 2 - Storage Unit Structure Options
Note | Option |
1 | RECORD |
2 | FIXREC |
3 | RECSTM |
4 | FIXSTM |
RECORD: Visible records may be of variable length, ranging from the minimum length required to contain one logical record segment to the length specified in the maximum visible record length field of the storage unit label (if not zero). If the maximum visible record length specified is zero, then visible record maximum length is constrained only by the medium's physical binding.
FIXREC: All visible records in the storage unit have the same length, namely that specified in the maximum visible record length field of the storage unit label. Although all storage units in the same storage set must have a FIXREC structure, the maximum visible record length may be different in different storage units. When the FIXREC option is used, then the maximum visible record length field shall not be 0 (zero).
RECSTM: Same as RECORD except with sparse EOFs (tape marks). Applies only to magnetic tape bindings. See "Binding To Magnetic Tapes" for how this is used.
FIXSTM: Same as FIXREC except with sparse EOFs (tape marks). Applies only to magnetic tape bindings. See "Binding To Magnetic Tapes" for how this is used.
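The length constraints implied by the four structure options can be expressed as a short check. This is a sketch under stated assumptions: the function name is hypothetical, the minimum length needed to hold one logical record segment is not checked, and applying the non-zero rule to FIXSTM follows from reading "same as FIXREC" literally.

```python
VARIABLE_LENGTH = {"RECORD", "RECSTM"}
FIXED_LENGTH = {"FIXREC", "FIXSTM"}


def record_length_ok(structure, max_record_length, record_length):
    """Return True if record_length is allowed for the given structure.

    max_record_length is the value from the storage unit label, where
    0 means "undeclared".
    """
    if structure in VARIABLE_LENGTH:
        # With an undeclared maximum, only the physical binding limits size.
        return max_record_length == 0 or record_length <= max_record_length
    if structure in FIXED_LENGTH:
        # Fixed-length structures need a declared, non-zero length,
        # and every visible record must have exactly that length.
        return max_record_length != 0 and record_length == max_record_length
    return False  # unknown structure name
```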
Modern tape systems use two basic technologies to achieve very high capacities: helical scan and linear serpentine. In helical scan systems a rotating head scans across the tape at a small angle to achieve high tape-to-head speeds and close track spacing. Examples of helical scan systems include DAT 4 mm, Exabyte 8 mm, Metrum VHS, STK Redwood 12.5 mm, and DD-1 and DD-2 18 mm.
In a longitudinal tape system data is recorded on tracks that parallel the edge of the tape using "fixed" heads. Many newer systems allow the heads to move at right angles to the tape to follow narrow tracks.
In a linear serpentine system a small group of heads records data and is then indexed to record tracks in the reverse direction between those already recorded. Examples of linear serpentine systems include QIC, DLT, IBM 3490, and IBM 3590.
These technologies can be used to store several gigabytes of data per cartridge and also support data transfer rates of many megabytes per second.
Some systems employ some form of tape labelling, file system directories, and bad block identification stored in system areas on tape. Some systems, particularly helical scan systems, support absolute block location. Most systems support some form of high-speed absolute tape file location. Some systems employ a physical beginning of tape mark (PBOT), typically a reflective strip near the beginning of the tape, that indicates the start of the region where data may be recorded. System information, when supported, is recorded after PBOT.
All tape media covered under this binding shall have the following characteristics:
Some tape media covered under this binding may have one or more of the following characteristics:
Typically density can be set by the writing application and is automatically sensed by the I/O subsystem when reading.
No requirements or restrictions are given here for system areas when supported.
Each partition, or each tape if not partitioned, constitutes one storage unit.
The physical record containing the storage unit label shall be followed immediately by EOF.
No EOF shall be written before the storage unit label.
This requirement is a change from binding edition 1, which requires the writing of EOF before the label. On most newer systems an EOF before the first physical block would be the equivalent of an EOD, which would invalidate all subsequent data on the medium. Even on standard 9-track tapes, absence of a leading EOF does not hinder reading or writing. Notice that the storage unit label does not belong to a tape file, since it has no EOF at its beginning.
Other than the physical record containing the storage unit label, every physical record preceding EOD shall contain a visible record.
Since a visible record must coincide with a physical record, this could also be stated, "every physical record shall be a visible record".
A logical file section shall be contained in one tape file.
For RECORD or FIXREC storage unit structure, each non-empty tape file shall contain exactly one logical file section.
For RECSTM or FIXSTM each non-empty tape file shall contain one or more logical file sections.
A storage unit shall be terminated by EOD immediately following the last visible record in the storage unit. Storage unit termination may occur either before or after ETW. However, no tape file shall be started after ETW. There shall be no empty tape file in a storage unit unless it is used as and considered part of the representation of EOD.
Since the physical record containing the storage unit label is not preceded by EOF, it does not belong to a tape file. Only the first and last tape files may contain partial logical files. The last logical file may be continued onto the next storage unit, and the first logical file section may be the continuation of a logical file started on the previous storage unit.
The STM modes allow some added flexibility for tape management of logical files, particularly where many small logical files are stored on a high-capacity tape. Frequently it is useful to organize logical files into groups and use EOFs to do high-speed searching from group to group rather than from file to file. To insist on EOF for every logical file could make sequential searching slow and potentially wasteful, since EOF typically closes out a physical block.
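The tape layout rules stated above (label first, EOF immediately after the label, no empty tape files, EOD immediately after the last visible record) can be illustrated with a simplified token model. The token names "LABEL", "EOF", "REC", and "EOD" and the function name are hypothetical, EOD is treated as a single token regardless of how a drive represents it, and the ETW rule is not modelled.

```python
def tape_layout_ok(items):
    """Check a storage unit given as a list of tokens.

    "LABEL" is the physical record holding the storage unit label,
    "REC" a physical record holding one visible record,
    "EOF" a tape mark, and "EOD" the end-of-data indication.
    """
    # Shortest valid unit: LABEL, EOF, one visible record, EOD.
    if len(items) < 4:
        return False
    # No EOF before the label, and EOF immediately after it.
    if items[0] != "LABEL" or items[1] != "EOF":
        return False
    if items[-1] != "EOD":
        return False

    previous = "EOF"
    for item in items[2:-1]:
        if item not in ("REC", "EOF"):
            return False
        if item == "EOF" and previous == "EOF":
            return False  # two tape marks in a row: an empty tape file
        previous = item
    # EOD must immediately follow the last visible record.
    return previous == "REC"
```

In this model `["LABEL", "EOF", "REC", "REC", "EOF", "REC", "EOD"]` is a valid storage unit holding two tape files.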
Because of the variety of ways physical records are implemented in relation to physical blocks, the sizing of visible records (hence of physical records) is left as an implementation decision. For efficiency and optimal recovery it is desirable to write physical records so they align with physical blocks. How this is done depends on the specific tape medium. For example, with DD-2 tape a physical block of size 1,199,840 bytes can be subdivided exactly into ten physical records of size 119,984 bytes provided checksums are disabled. If checksums are enabled, an extra 2 bytes per physical record are needed, in which case only nine physical records will fit into a physical block with 119,966 physical block bytes wasted.
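The DD-2 figures quoted above can be reproduced with a few lines of arithmetic. The helper name `packing` is hypothetical; the block size, record size, and 2-byte checksum overhead are the values given in the text.

```python
def packing(block_size, record_size, overhead_per_record):
    """Return (records per block, wasted bytes per block)."""
    stored_size = record_size + overhead_per_record
    count = block_size // stored_size
    wasted = block_size - count * stored_size
    return count, wasted


BLOCK_SIZE = 1_199_840   # DD-2 physical block, in bytes
RECORD_SIZE = 119_984    # physical record, in bytes

without_checksums = packing(BLOCK_SIZE, RECORD_SIZE, 0)
with_checksums = packing(BLOCK_SIZE, RECORD_SIZE, 2)
print(without_checksums)  # (10, 0)
print(with_checksums)     # (9, 119966)
```

Two extra bytes per record push the tenth record 20 bytes past the block boundary, which is why almost a full record's worth of space is lost.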
A random access file is a named entity for which random byte access is possible. Bytes are ordered sequentially, and an application may position to any byte in the file by specifying its position and then read or write n bytes. All the bytes presented to an application by the I/O subsystem are data bytes. This view of a file corresponds to an implementation of ANSI C I/O accessing the file in binary mode. The maximum size of a file is implementation-dependent.
Most current implementations have a maximum file size of 2^32 - 1 bytes, the largest integer expressible as a C unsigned long. To support random access, absolute byte position in the file must be representable using a native datatype of the language.
A storage unit is a single complete file.
The first 128 bytes of a storage unit constitute a storage unit label.
The remaining bytes constitute a sequence of visible records. End of storage unit is indicated when there are no more bytes to read from the file.
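For a file with FIXREC structure, this binding can be read without knowing the visible record header layout, because the record length comes from the label. The sketch below assumes that case only; the function name is hypothetical, the label bytes are sliced at the positions given in the label field descriptions, and a short final chunk is treated as an incomplete visible record and ignored.

```python
import io

LABEL_SIZE = 128


def read_fixrec_unit(stream):
    """Read a FIXREC storage unit from a binary stream.

    Returns (label bytes, list of complete visible records).
    """
    label = stream.read(LABEL_SIZE)
    if len(label) < LABEL_SIZE:
        raise ValueError("storage unit is shorter than its label")
    # Bytes 10-15 hold the structure name, bytes 20-29 the record length.
    if label[9:15] != b"FIXREC":
        raise ValueError("this sketch only handles FIXREC structure")
    record_length = int(label[19:29].decode("latin-1"))
    if record_length == 0:
        raise ValueError("FIXREC requires a non-zero record length")

    records = []
    while True:
        chunk = stream.read(record_length)
        if len(chunk) < record_length:
            break  # end of file, or an incomplete last record to ignore
        records.append(chunk)
    return label, records


# In-memory example: a label declaring 4-byte records, two complete
# records, and a truncated third one.
example_label = ("   1" + "V2.01" + "FIXREC" + "B1  " + "{:>10d}".format(4)
                 + " " * 39 + "{:<60}".format("Example set")).encode("latin-1")
_, example_records = read_fixrec_unit(
    io.BytesIO(example_label + b"AAAABBBBCC"))
```

With a regular file opened in binary mode, `stream.read(n)` returns fewer than `n` bytes only at end of file, which is what makes the short-chunk test a usable end-of-unit signal here.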
A peer-to-peer communication stream is a FIFO (first in first out) queue for which sequential access to an ordered sequence of bytes is possible. An application may either write to the FIFO or read from the FIFO, but may not do both.
An application using full duplex transmission facilities to read data from its remote peer and write data back to the same remote peer is reading from one FIFO and writing to a different FIFO over the same communications channel.
A stream is created by a communications session established between two applications and ends when the session ends. A session is defined by the data exchange occurring between the open and close operations provided by the communications protocol.
A storage unit consists of the stream of data bytes exchanged in one direction between applications during one session.
The first 128 bytes of a storage unit constitute a storage unit label.
The remaining bytes constitute a sequence of visible records. End of storage unit is indicated when there are no more bytes to read from the stream.
Some distinction should be made between "file-less" transfer of data in a peer-to-peer exchange and "remote file access" or "remote tape access" using communication channels. In the latter two cases, a communications session is established between a file or tape server on one end and an application on the other end. The data access services from the server should appear the same as reading a local file or a local tape. Consequently, the remote storage unit should look like one or more files or one or more tapes. This mode allows multiple storage units in one session, for example. However, a true peer-to-peer communications binding will present only one storage unit.