Category: C/C++

2008-07-19 16:49:49

DISCUSSION PAPER NO. 91

DATE: December 1, 1995
REVISED:

NAME: Machine Generation Flag in USMARC Authority Records

SOURCE: Cooperative Cataloging Council, Series Authority Record Task Group

SUMMARY: This paper discusses options for flagging USMARC authority records that have been created or modified by machine.

KEYWORDS: Authority format; Field 008/08 (Authority); Field 008/33 (Authority); Level of establishment; Machine-generation flag

STATUS/COMMENTS:

12/01/95 - Forwarded to the USMARC Advisory Group for discussion at the January 1996 MARBI meetings.

1/21/96 - Results of USMARC Advisory Group discussion - Several vendors and networks that did machine generation of authority records indicated that they marked the records with local content designation. Participants asked whether the flag is useful for internal systems but unnecessary in the communications environment, and whether the value would mean different things in different systems, since generation mechanisms vary and produce authority records with various levels of completeness and standardization. It was indicated that the PCC may establish 3 levels of machine-generated records, but the PCC did not request that the format reflect these. The USMARC Advisory Group requested more information from the PCC concerning:

  • how they define machine-generated,
  • uses for the value in the exchange record, and
  • need for permanence of the information in the record.

DISCUSSION PAPER NO. 91: Flagging Machine Generation

1. BACKGROUND

This paper discusses flagging USMARC authority records that have
been created or modified by machine. It presents the rationale
for indicating that an authority record was machine generated and
suggests several options for providing such a flag. Options
suggested include ones that make use of new and existing
authority record data elements. The need for such a flag is an
outgrowth of a national effort to increase the amount of
authority control provided in national data bases.


2. DISCUSSION

In the spring of 1994 the Cooperative Cataloging Council (CCC)
established the Series Authority Record Task Group to define the
content and functional uses of series authority records. The
creation of this group followed a two-year period (1993 to 1994)
during which the Library of Congress considered making changes to
the amount of authority work done for monographic series titles.
A September 1994 report from the Task Group made suggestions and
recommendations to the CCC about changes to the series authority
record which it said were needed to support the goals of the
Program for Cooperative Cataloging (PCC). As a result of PCC
Executive Council review of the report at the ALA Midwinter
Conference in 1995, one of the recommendations from the Task
Group was forwarded to MARBI for consideration.

The recommendation forwarded to MARBI was that a data element
should be made available in the USMARC Format for Authority Data
for indicating that an authority record (for a series title or
any other heading type) was initially generated by machine. The
Task Group suggested this because it believed that this
information was important in the context of computers being used
to generate some records in the National Authority File (NAF) so
that all headings used in access points could be under authority
control. In their proposal they suggested that a new code in an
existing fixed-length data element, 008/33 (Level of
establishment), could be used.


Current State of Machine Generation

Many library systems already provide for the automatic generation
of authority records for headings in authority controlled fields
in bibliographic records. In most systems with this
functionality, authority records are created for any heading not
already covered by an existing authority record. The content of
machine-generated authority records varies but some systems are
able to create records which contain as much information as a
human would supply when simple headings are involved. Examples
of the kinds of data elements supplied by machine include the 1XX
(Heading) field, 670 (Sources Found Note), and certain control
information. It is even possible for systems to provide some
cross references, although in most cases this is left for humans
to provide. Unfortunately, the USMARC Authority format does not
include any data element designed to indicate creation or
manipulation of a MARC record by machine.

Machine generation of authority records offers a means for
libraries to provide full authority control while reducing
individual effort. Both time and cataloging resources can be
saved. Even if an authority record is later updated by a human
to add references and other information, creation of a brief
record by machine from data already keyed in a bibliographic
record avoids rekeying and the cost connected to it. When
multiplied by thousands of headings, the savings can be
significant. Machine generation of an authority record from a
heading in a bibliographic record also guarantees a match between
the two. System validation of headings in a bibliographic file
against an authority file is often part of the process. With the
functionality of library systems expanding, machine generation
and manipulation of authority records is already widely
available.
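
The generation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular library system: the dictionary layout, the tag mapping, the function name generate_brief_authorities, and the local machine_generated marker are all assumptions made for the example, and headings are treated as plain strings without subfield coding.

```python
# Hypothetical mapping from bibliographic heading tags to authority 1XX tags.
# Only a few tags are listed; a real system would cover many more.
BIB_TO_AUTHORITY_TAG = {
    "100": "100",  # personal name main entry
    "700": "100",  # personal name added entry
    "110": "110",  # corporate name main entry
    "710": "110",  # corporate name added entry
}


def generate_brief_authorities(bib_record, known_headings):
    """Build brief authority records for headings with no existing record."""
    seen = set(known_headings)  # copy, so the caller's set is not modified
    new_records = []
    for bib_tag, heading in bib_record["headings"]:
        auth_tag = BIB_TO_AUTHORITY_TAG.get(bib_tag)
        if auth_tag is None or heading in seen:
            continue  # unsupported tag, or heading already under control
        new_records.append({
            "tag": auth_tag,             # tag of the 1XX (Heading) field
            "heading": heading,          # copied verbatim, so it matches the source
            "670": bib_record["title"],  # source in which the heading was found
            "machine_generated": True,   # local marker, not a USMARC data element
        })
        seen.add(heading)
    return new_records


# Invented sample data for illustration only.
bib = {
    "title": "An example title",
    "headings": [
        ("100", "Smith, John"),
        ("700", "Doe, Jane"),
        ("700", "Smith, John"),
    ],
}
records = generate_brief_authorities(bib, known_headings={"Doe, Jane"})
print(records)
```

Because the heading is copied rather than rekeyed, the new authority record matches the bibliographic heading by construction, which is the guarantee noted above.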


Task Group Requirements

The flagging of machine-generated records could meet several
requirements. The Task Group suggests that a machine generation
flag is needed for analytical purposes. It would facilitate the
assessment of the effects of machine generation on the overall
character of authority files. If defined adequately, it could
also help to improve software that generates authority records
automatically. A data element to signal machine generation is
essential in order to identify records that have not been
reviewed and updated by a human. In an environment where
authority data is shared, it would allow systems to prioritize
authority records, giving, perhaps, greater value to records
created by humans than to those created by machine.

The SAR Task Group is of the opinion that it is important to
identify machine-generated authority records as a distinctive
group. They believe that in the future machine-generated
authority records will reach such a level of sophistication in
production that they will coexist with human-generated records in
resource files including the National Authority File.


3. POSSIBLE OPTIONS

a) Make use of an existing fixed-length USMARC authority
data element by validating a new value. The CCC
recommended using field 008/33 (Level of establishment)
to indicate that the record was machine generated.
This would have the advantage of making use of an
existing data element that could be easily and reliably
coded by machine. The disadvantage of using 008/33 is
that the data element as currently defined relates to
the heading in a 1XX field, not necessarily the entire
record. Even though a record may be machine generated,
the heading might be "fully established" (one of the
other currently-valid codes defined for 008/33). The
use of a new code would eliminate the possibility of
also coding one of the other aspects that is handled by
008/33. Field 008/29 (Reference evaluation) might be a
more appropriate data element for which to define a new
code. It is assumed that in the case of
machine-generated records, the need/evaluation of
references is the area where catalogers would be likely
to have the most concern. In most cases, particularly
if the heading field were generated from a
bibliographic record, the 1XX field would be reliably
authoritative.

b) Make use of an existing variable length USMARC data
element. Field 042 (Authentication Code) might be
ideal for this purpose. Since the data in this field
is not often validated, it would result in the least
change needed to implementations of the USMARC
authority format. A special code or codes could be
used to identify the lack of human authentication for
the record. Field 040 (Cataloging Source) could also
be used, although since none of the currently-defined
subfields would be appropriate for a machine-generation
flag, a new subfield would be needed.

c) Define one of the available (undefined) field 008
positions (e.g., position 08) for a machine generation
flag. The advantage of this option is that it does
not confuse or eliminate the coding possible in other
fixed-length data elements or variable fields. As a
separate data element, several values could be defined
to allow the quality/complexity of the machine
generation to be specified more accurately (e.g.,
machine generated 1XX only, or 1XX and 670, or 1XX,
670, and obvious 4XX references based on computer
algorithms). If field 008/08 were undesirable for some
reason, field 008 positions 18-27 and 30 are also
currently undefined.
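
Option (c) might look like the following in Python. The sketch treats field 008 as a 40-character string (positions 00-39) and writes a one-character code at position 08. The code values "a", "b", and "c" and the function name flag_machine_generation are invented for illustration; the paper proposes no specific values.

```python
FIELD_008_LENGTH = 40  # authority field 008 has character positions 00-39

# Hypothetical code values, invented here; the paper defines none.
MACHINE_GENERATION_CODES = {
    "a": "machine generated 1XX only",
    "b": "machine generated 1XX and 670",
    "c": "machine generated 1XX, 670, and 4XX references",
}


def flag_machine_generation(field_008, code, position=8):
    """Return a copy of field 008 with the flag code stored at `position`."""
    if len(field_008) != FIELD_008_LENGTH:
        raise ValueError("authority field 008 must be 40 characters long")
    if code not in MACHINE_GENERATION_CODES:
        raise ValueError(f"unknown machine-generation code: {code!r}")
    # Only the one character at `position` changes; all other positions,
    # including 008/29 and 008/33, keep their existing values.
    return field_008[:position] + code + field_008[position + 1:]


# Sample field: a date in positions 00-05, placeholder characters elsewhere.
sample_008 = "951201" + "|" * 34
flagged = flag_machine_generation(sample_008, "b")
print(flagged)
```

Because the flag occupies its own position, the level of establishment in 008/33 can still be coded independently, which is the advantage claimed for this option over option (a).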


4. QUESTIONS

The suggestion of defining or identifying an existing USMARC
Authorities data element to flag machine generation raises
several questions.

1) What function would the flag actually serve? Would
USMARC users be likely to really use the information
about machine-generation to some end? Some people
worry that users would be doing a lot of coding that
nobody would make much use of.

2) Would the machine-generation flag be permanent? If
not, changing the flag to some other value would
further burden catalogers who must already update
authority records for other purposes.

3) What assumptions are there behind machine-generated
records? Would a flag such as the one suggested in
this paper imply certain characteristics in the record,
for example, certain fields present, others lacking?

4) Is machine-generation a concern if quality is not
affected? Some have suggested that as many as 50% of
authority records could be machine generated with equal
content and quality because references are not
involved. If this is true, would such records be better
off without the machine generated flag?

5) What is the analysis design behind the CCC request for
a flag for machine generation? What kinds of analyses
are likely to depend on it?

6) Is there a need to identify what pieces of an authority
record were generated by machine, i.e., at the
field/subfield level? (NOTE: Some cataloging agencies
use a locally defined subfield to indicate machine
manipulation of access points.)

7) What are the implications of the existence of a
machine-generated flag on existing authority files that
contain machine-generated records? None of the options
can deal with the perhaps large number of
machine-generated records that already exist.

8) How would a machine generation flag relate to other
record-level flags in the Authority format? (record
completeness in Leader/17; how heading was constructed
in field 008/10 and /11; reference evaluation in field
008/29; level of establishment in field 008/33).
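
To make question 8 concrete, the record-level flags it lists can be read side by side, as in this minimal Python sketch. It assumes the Leader is a 24-character string and field 008 a 40-character string; the helper names blank_with and record_level_flags and the sample code values are illustrative only.

```python
LEADER_LENGTH = 24
FIELD_008_LENGTH = 40


def blank_with(length, codes):
    """Build a blank fixed-length string with codes set at given positions."""
    chars = [" "] * length
    for position, code in codes.items():
        chars[position] = code
    return "".join(chars)


def record_level_flags(leader, field_008):
    """Collect the existing record-level flags named in question 8."""
    if len(leader) != LEADER_LENGTH or len(field_008) != FIELD_008_LENGTH:
        raise ValueError("unexpected Leader or field 008 length")
    return {
        "Leader/17 record completeness": leader[17],
        "008/10 heading construction": field_008[10],
        "008/11 heading construction": field_008[11],
        "008/29 reference evaluation": field_008[29],
        "008/33 level of establishment": field_008[33],
    }


# Sample values chosen only to show where each code sits in the record.
leader = blank_with(LEADER_LENGTH, {17: "n"})
field_008 = blank_with(FIELD_008_LENGTH, {10: "c", 11: "n", 29: "a", 33: "a"})
flags = record_level_flags(leader, field_008)
for name, code in flags.items():
    print(name, "=", code)
```

A machine-generation flag would add one more entry to this set, so any definition would need to say how its values interact with the five already present.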

