Authority Record Matching

After preliminary processing is completed, compressed “match keys” from bibliographic record headings are compared first against an index of match keys extracted from 1XX/4XX fields in LC/PCC authority records and, secondarily, against LTI authority records.

To increase authority record links, certain tag variations in bibliographic record headings are ignored during match key comparisons. For example, if a corporate name (110/610/710/810 field) has been improperly tagged as a personal name (100/600/700/800 field), not only will the link be made with the proper form of the heading, but the incorrect tag will also be corrected automatically.
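
The tag-insensitive comparison can be sketched as follows. This is only an illustration under stated assumptions: the index layout, the `make_key` normalization, and the function names are hypothetical, and the actual match key compression is not described here.

```python
# Hypothetical match-key index: key -> (last two tag digits, authorized form).
# "00" would mean personal name, "10" corporate name. Entries are illustrative.
NAME_INDEX = {
    "unesco": ("10", "Unesco"),
}


def make_key(heading):
    """Very rough stand-in for a compressed match key (an assumption)."""
    return heading.lower().strip(" .,")


def link_ignoring_tag(tag, heading, index=NAME_INDEX):
    """Look up a heading without regard to its name type, then repair the tag.

    Returns (tag, heading, linked). The first digit of the tag (1, 6, 7, 8)
    is kept; the last two digits are taken from the authority record.
    """
    hit = index.get(make_key(heading))
    if hit is None:
        return tag, heading, False
    name_type, authorized = hit
    return tag[0] + name_type, authorized, True


# A corporate name mistakenly tagged as a personal name added entry (700)
print(link_ignoring_tag("700", "UNESCO."))
```

Because the key carries no tag information, the lookup succeeds regardless of how the heading was tagged, and the corrected tag falls out of the authority record.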

Some headings do not link because of incorrect subfield coding, as when a chronological subdivision ($y) is miscoded as topical ($x) or vice versa. LTI scans for and corrects such problems.

A common reason headings fail to match an authority record is simply that no authority record exists for the heading. Name authority records date back only to 1977, and subject authority records are a reflection of LCSH rather than an exhaustive file of subjects and subject subdivisions appearing in MARC records. Very few name headings and only a fraction of the thousands of combinations of topical, form, chronological, and geographic subdivision headings receive their own LC subject authority records.

Other initial linkage failures are caused by typographical errors, headings constructed under earlier cataloging rules, variant treatment of names with prefixes, and direct versus indirect division of geographic names. Most of these headings are corrected and linked to an authorized heading. To achieve these links, bibliographic record headings are submitted to repeated manipulations and checks by computer.

Name Authority Matching

Many personal name headings (100/600/700/800) fail to link with an authority heading because of variations in the fullness of birth and death dates or variations in “titles and other words associated with the name” (subfield $c) information. To maximize authority record links, tests that disregard minor variations are applied to personal name headings. For example, when the match key created for the personal name heading Allingham, Helen Paterson, $c"Mrs. William Allingham,"$d1848- fails to link with an authority record, the heading is checked without subfield $c data. If the heading still does not link, another match is tried which accepts any death date in $d, if the birth date matches. This second match key refinement allows a valid link with the LC authority heading Allingham, Helen Paterson, $d1848-1926. When a link is found to an authority record, the form in the authority record replaces the form in the incoming heading.
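
The three-step fallback for personal names can be illustrated with a short sketch. The index structure, the `normalize` function, and all names below are assumptions made for the example; only the order of the tests (full key, key without $c, birth date only) comes from the description above.

```python
import re

# Hypothetical index keyed by (name, titles, dates); illustrative entry only.
AUTHORITY_INDEX = {
    ("allingham helen paterson", "", "1848-1926"):
        "Allingham, Helen Paterson, $d1848-1926",
}


def parse_heading(heading):
    """Split a heading string into {subfield code: value}; leading text is $a."""
    parts = re.split(r"\$([a-z])", heading)
    subfields = {"a": parts[0]}
    for code, value in zip(parts[1::2], parts[2::2]):
        subfields[code] = value
    return subfields


def normalize(text):
    """Lowercase and drop punctuation except hyphens (an assumed key form)."""
    return " ".join(re.sub(r"[^\w\s-]", " ", text.lower()).split())


def link_personal_name(heading, index=AUTHORITY_INDEX):
    sf = parse_heading(heading)
    name, titles, dates = (normalize(sf.get(c, "")) for c in "acd")
    # Tests 1 and 2: the full key, then the key without subfield $c data.
    for key in ((name, titles, dates), (name, "", dates)):
        if key in index:
            return index[key]
    # Test 3: accept any death date as long as the birth date matches.
    birth = dates.split("-")[0]
    if birth:
        for (auth_name, auth_titles, auth_dates), authorized in index.items():
            if (auth_name == name and auth_titles == ""
                    and auth_dates.split("-")[0] == birth):
                return authorized
    return None


incoming = 'Allingham, Helen Paterson, $c"Mrs. William Allingham,"$d1848-'
print(link_personal_name(incoming))
```

The returned authority form is what would replace the incoming heading; a heading with a different birth date is left unlinked.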

Title Authority Matching

Title portions of name/title headings (X00/X10/X11 fields containing $t or $k) and preferred title fields (130/240/630/730/830) undergo special treatment to increase matches. In each case the title portion of the field is checked for “floating” elements which are corrected and validated separately. These include dates ($f), languages ($l), medium designator ($h), and versions ($s). Music headings also have the arrangement statement ($o), key ($r), and medium of performance ($m) corrected and validated. LTI processing attempts to link the heading fully to an authority record. If the entire name/title heading cannot be matched, trailing subfields are omitted to find the most complete match possible. Often only the name portion of a name/title heading can be linked to an authority record.
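
As a rough sketch of the "most complete match" idea, the function below drops trailing subfields one at a time until some prefix of the heading is found. The index contents and key form are hypothetical, and the separate validation of floating elements is not modeled.

```python
# Hypothetical index keyed by a tuple of (code, normalized value) pairs.
TITLE_INDEX = {
    (("a", "shakespeare, william"), ("d", "1564-1616")):
        "Shakespeare, William, $d1564-1616",
    (("a", "shakespeare, william"), ("d", "1564-1616"), ("t", "hamlet")):
        "Shakespeare, William, $d1564-1616. $tHamlet",
}


def most_complete_match(subfields, index=TITLE_INDEX):
    """Return (authorized form, unmatched trailing subfields).

    `subfields` is a list of (code, value) pairs in heading order. Trailing
    subfields are omitted one by one until a prefix links, or nothing is left.
    """
    for end in range(len(subfields), 0, -1):
        key = tuple((code, value.lower().strip(" .,"))
                    for code, value in subfields[:end])
        if key in index:
            return index[key], subfields[end:]
    return None, subfields


heading = [("a", "Shakespeare, William,"), ("d", "1564-1616."),
           ("t", "Hamlet."), ("l", "French")]
print(most_complete_match(heading))
```

The unmatched remainder (here the language subfield) is what would then be checked against the separately validated floating elements.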

Untraced Series Processing

In 2006, the Library of Congress discontinued controlled access to series in bibliographic records and stopped creating series authority records. Since then, the number of series authority records has grown less rapidly, although there remains a substantial backfile of existing LC series authority records, along with new national-level records created by NACO libraries. Over the years LTI has also created many series authority records in-house, used to match incoming headings in the absence of a national-level record. While untraced series (i.e., those tagged as 490 with first indicator 0) were formerly excluded from standard authority control processing, the changed LC policy makes it important that libraries have a mechanism to continue to treat series as authorized access points.

To ensure consistency in access to series, LTI’s no-charge “untraced series” option includes 490 0 fields in authority control. If the first indicator in a 490 field is set to 0, LTI changes the indicator value to 1 and generates an appropriate 8XX (traced series) field. If the newly created 8XX field is linked to an authorized access point during authority control, both the 490 1 and authorized 8XX fields are retained.
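
A minimal sketch of the 490 0 conversion follows. The dictionary record layout and the `lti_generated` flag are invented for illustration, and the sketch assumes a title-only series, so it always generates an 830 and copies the subfields unchanged; real processing would choose the appropriate 8XX tag and adjust the data.

```python
def trace_untraced_series(fields):
    """Return a new field list in which each 490 0 becomes 490 1 plus an 830.

    Fields are plain dicts with "tag", "ind1", "ind2", and "subfields"
    (a list of (code, value) pairs). The input list is not modified.
    """
    converted, generated = [], []
    for field in fields:
        if field["tag"] == "490" and field["ind1"] == "0":
            field = {**field, "ind1": "1"}  # copy with the indicator changed
            generated.append({
                "tag": "830", "ind1": " ", "ind2": "0",
                "subfields": list(field["subfields"]),
                "lti_generated": True,  # hypothetical marker for later review
            })
        converted.append(field)
    return converted + generated


record = [
    {"tag": "245", "ind1": "0", "ind2": "0",
     "subfields": [("a", "An example title.")]},
    {"tag": "490", "ind1": "0", "ind2": " ",
     "subfields": [("a", "Studies in example history ;"), ("v", "v. 3")]},
]
for field in trace_untraced_series(record):
    print(field["tag"], field["ind1"], field["subfields"])
```

Marking the generated field makes it possible to tell it apart later from 8XX fields supplied by the library, for example if it remains unlinked.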

If the LTI-created 8XX field remains unlinked following authority control, libraries can opt either to delete the LTI-created 8XX field and change the 490 first indicator back to 0, or to retain the 490 series with a first indicator of 1 (formerly “traced differently” but now defined as “series traced in 8XX field”) together with the LTI-created (but unlinked) 8XX series heading. When the form in the 8XX field cannot be matched to an authority record, LTI recommends that the 490 1/8XX combination be retained. For libraries retaining the 490 1/8XX series, we further advise that 490 1 fields not be indexed in “title browse” indexes. The same advice would apply to 490 0 fields but, under this option, none of these fields will remain in the database. Should an authority record later be distributed, the heading is authorized, and the record is provided to the library, the next time the heading is encountered in a continuing authority control run, whether Authority Express (AEX) or Authority Update Processing (AUP).

The 2008 MARBI decision to cease use of the 440 field for a traced title series clarified the line between transcription and access points. It also redefined the 490 first indicator value 1 to “Series traced in 8XX field.” Unless otherwise instructed, LTI processing converts obsolete 440 fields into the currently defined 490 1/8XX paired fields.

Subject Authority Matching

All subject headings are first processed through a “subject fix” program that corrects common errors in subject headings and fixes incorrect and obsolete topical, chronological, and geographic subdivisions.

Using tables of commonly occurring names, jurisdictions, subjects, and subject subdivisions, abbreviations are expanded and subjects and subject subdivisions are updated. For example:

Gt. Brit. becomes Great Britain.

Women, British becomes Women$zGreat Britain.

U.S.$xRace question becomes United States$xRace relations.

The West$xBiog. becomes West (U.S.)$xBiography.
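
The table-driven corrections above might be sketched like this. The split into a whole-heading table and a per-term table is an assumption made for the example, as are the table and function names; the entries are taken from the four examples.

```python
import re

# Whole-heading replacements (checked first) and per-term replacements.
HEADING_FIXES = {
    "Women, British": "Women$zGreat Britain",
}
TERM_FIXES = {
    "Gt. Brit.": "Great Britain",
    "U.S.": "United States",
    "The West": "West (U.S.)",
    "Race question": "Race relations",
    "Biog.": "Biography",
}


def fix_subject(heading):
    """Apply the fix tables to a heading written with $-prefixed subfield codes."""
    if heading in HEADING_FIXES:
        return HEADING_FIXES[heading]
    # Split on subfield delimiters but keep them, so they can be rejoined.
    parts = re.split(r"(\$[a-z])", heading)
    return "".join(TERM_FIXES.get(part, part) for part in parts)


for old in ("Gt. Brit.", "Women, British",
            "U.S.$xRace question", "The West$xBiog."):
    print(old, "->", fix_subject(old))
```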

Headings are then matched against the LC subject authority file. Because most subject headings with subdivisions do not have separate authority records, LTI maintains tables of “free floating” topical, form, chronological, and geographic subject subdivisions. Subdivisions in these tables are used to validate thousands of headings that would otherwise not link. Headings that have not fully linked are broken into their component parts and tested against subject authority records and tables of validated floats.

Unlinked personal, corporate, conference, or geographic name (600/610/611/651 fields) headings are checked against the LC name authority file (ignoring subfields $v, $x, $y, and $z) to validate the form of name wherever possible. Later, subfields $v, $x, $y, and $z are validated against float tables. Where libraries have used double dashes in place of appropriate subject subdivision coding, LTI's software recognizes the subdivision and inserts the proper subfield code.
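
Recoding of double dashes can be pictured as a lookup of each dash-separated part in the float tables. In this sketch the table is a tiny hypothetical stand-in, and headings containing an unrecognized subdivision are simply returned unchanged.

```python
# Hypothetical float table: subdivision text -> subfield code.
FLOAT_CODES = {
    "History": "x",
    "Periodicals": "v",
    "20th century": "y",
    "France": "z",
}


def code_double_dashes(heading, floats=FLOAT_CODES):
    """Replace "--" separators with subfield codes found in the float table."""
    main, *subdivisions = [part.strip() for part in heading.split("--")]
    coded = main
    for subdivision in subdivisions:
        code = floats.get(subdivision)
        if code is None:
            return heading  # unrecognized subdivision: leave as received
        coded += f"${code}{subdivision}"
    return coded


print(code_double_dashes("Women--France--History--20th century"))
```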

Obsolete LC subject headings replaced by two or more valid headings (i.e., “split” headings) present special authority control problems. For example, the topical heading Negroes was replaced by the headings Blacks and Afro-Americans; then, in 2001, Afro-Americans was changed to African Americans. In some cases the correct division of split headings can be deduced from other parts of the heading. Thus, LTI automatically converts headings having the structure Negroes$z[State] to African Americans$z[State], e.g., the heading Negroes$zMississippi is changed to African Americans$zMississippi. The heading Negroes followed by a country other than United States is changed to Blacks.

Where there is insufficient information to make the split, LTI links all occurrences of the obsolete heading to the broader heading. For example, Crime and criminals will go to Crime. However, if Crime and criminals is followed by the subdivision Biography, the new heading is changed to Criminals.
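
The split-heading rules described above lend themselves to a small decision function. The sketch below encodes only the cases stated in the text; the state list is an abbreviated placeholder, and cases the text does not cover (for example, Negroes with no geographic subdivision) are deliberately left unchanged rather than guessed.

```python
# Abbreviated, illustrative list; a real table would hold every state.
US_STATES = {"Alabama", "Georgia", "Mississippi"}


def resolve_split(subfields):
    """Resolve an obsolete split heading given as (code, value) pairs."""
    main = subfields[0][1]
    rest = subfields[1:]
    if main == "Negroes":
        places = [value for code, value in rest if code == "z"]
        if places and places[0] in US_STATES:
            new_main = "African Americans"
        elif places and places[0] != "United States":
            new_main = "Blacks"
        else:
            return subfields  # not covered by the stated rules
    elif main == "Crime and criminals":
        has_biography = any(value == "Biography" for _, value in rest)
        new_main = "Criminals" if has_biography else "Crime"
    else:
        return subfields
    return [("a", new_main)] + rest


print(resolve_split([("a", "Negroes"), ("z", "Mississippi")]))
print(resolve_split([("a", "Crime and criminals"), ("v", "Biography")]))
```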

Geographic Subdivisions

LTI maintains a file of 170,000 correct indirect geographic subdivision forms. In addition, software is used to convert direct geographic subject subdivisions into indirect form. Geographic subject subdivisions are identified and fixed even when $z data has been improperly coded as $x or $y or appended to $a. Where necessary, higher level jurisdictions are inserted. For example, the topical subject heading:

650 0$aDrawing, Paris$xCatalogs.

is changed to

650 0$aDrawing$zFrance$zParis$vCatalogs.

and the heading:

650 0$aGermans in Waterloo Co., Ontario.

is changed to

650 0$aGermans$zOntario$zWaterloo (Regional municipality).
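
A simplified sketch of the direct-to-indirect conversion is shown below. It assumes a lookup table mapping a direct place name to its indirect chain, handles places miscoded as $x or $y or appended to $a after ", " or " in ", and ignores terminal punctuation. The table, the separators, and the function name are assumptions; recoding $x to $v (as with Catalogs above) is left to a separate step.

```python
# Hypothetical table: direct form -> indirect chain of $z values.
INDIRECT_FORMS = {
    "Paris": ["France", "Paris"],
    "Waterloo Co., Ontario": ["Ontario", "Waterloo (Regional municipality)"],
}


def to_indirect(subfields, table=INDIRECT_FORMS):
    """Rewrite (code, value) pairs so known places become indirect $z chains."""
    out = []
    for code, value in subfields:
        if code == "a":
            # Look for a known place appended to the main heading.
            for separator in (", ", " in "):
                head, found, tail = value.partition(separator)
                if found and tail in table:
                    out.append(("a", head))
                    out.extend(("z", part) for part in table[tail])
                    break
            else:
                out.append((code, value))
        elif code in ("x", "y", "z") and value in table:
            # Place found in a subdivision, possibly under the wrong code.
            out.extend(("z", part) for part in table[value])
        else:
            out.append((code, value))
    return out


print(to_indirect([("a", "Drawing, Paris"), ("x", "Catalogs")]))
print(to_indirect([("a", "Germans in Waterloo Co., Ontario")]))
```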

Topical ($x) or Form ($v) Subdivisions

Certain subdivision headings may be coded as topical ($x) or form ($v). Authority control processing must attempt to determine the usage in each case. In general, when such a subdivision is the last subdivision in the heading, it will be coded as $v, except when there is a full link to an LC authority record with the subdivision in $x. Conversely, if it is not the last subfield in the heading, it will be tagged as $x, unless it has been identified as a special case which allows $x to follow $v (e.g., $vDictionaries $x[language]) or when it is permissible to follow $v with another $v.
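
The positional rule can be expressed compactly. In the sketch below, the list of ambiguous subdivisions and the list of special cases are small hypothetical samples, and the `fully_linked_with_x` flag stands in for the result of a full link to an LC authority record that carries the subdivision in $x.

```python
# Illustrative samples only; the real tables are much larger.
AMBIGUOUS = {"Bibliography", "Catalogs", "Dictionaries", "Periodicals"}
# (subdivision, following subdivision) pairs where $v may stay non-final.
KEEP_V_WHEN_FOLLOWED_BY = {
    ("Dictionaries", "French"),
    ("Bibliography", "Catalogs"),
}


def code_form_or_topical(subfields, fully_linked_with_x=False):
    """Assign $v or $x to ambiguous subdivisions based on their position."""
    out = []
    last = len(subfields) - 1
    for i, (code, value) in enumerate(subfields):
        if code in ("x", "v") and value in AMBIGUOUS:
            if i == last:
                code = "x" if fully_linked_with_x else "v"
            else:
                following = subfields[i + 1][1]
                special = (value, following) in KEEP_V_WHEN_FOLLOWED_BY
                code = "v" if special else "x"
        out.append((code, value))
    return out


print(code_form_or_topical([("a", "English language"),
                            ("x", "Dictionaries"), ("x", "French")]))
```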

Role of Added Cross-References (Variant Access Points)

An important component of LTI's ability to offer a guaranteed link rate is a supplemental file of over 4.9 million added cross-references. Name headings constitute 88% of these; the remaining 12% are subject cross-references. Each reference links the form of the heading used in the bibliographic record to the correct form used in the LC authority record.

These references are not added to LC authority records, but rather cumulated in separate files that are processed after linking to LC authority records, but before linking to LTI authority records. For example, when an LTI editor links manually the library heading Marquand, John P., $d1893- to the LC authority record heading Marquand, John P. $q(John Phillips),$d1893-1960, that cross-reference is saved and applied subsequently to other library databases. Since not all cross-references can be applied to every database, editor-created variant access points undergo a second review prior to being added to LTI’s cross-reference file.

Because so many machine-readable records are derived from shared cataloging databases, these supplemental cross-references play an important part in LTI's authority control routines, serving two purposes. First, they correct the access point in the library database that prompted their creation. Second, when the same unlinked heading appears in another customer's database, that heading is automatically linked to the proper authority record.
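
The processing order described in this section (LC authority records first, then the added cross-references, then LTI authority records) amounts to a lookup cascade. The sketch below uses plain dictionaries keyed by the heading string as a stand-in for match keys; the function name and the sample data layout are assumptions.

```python
def link_heading(heading, lc_index, xref_index, lti_index):
    """Try each source in order; return (source, authorized form)."""
    sources = (("LC", lc_index), ("cross-reference", xref_index),
               ("LTI", lti_index))
    for source, index in sources:
        if heading in index:
            return source, index[heading]
    return None, heading  # still unlinked; candidate for editor review


authorized = "Marquand, John P. $q(John Phillips),$d1893-1960"
lc_index = {authorized: authorized}
# Saved by an editor after a manual link; reused for later databases.
xref_index = {"Marquand, John P., $d1893-": authorized}
lti_index = {}

print(link_heading("Marquand, John P., $d1893-",
                   lc_index, xref_index, lti_index))
```

Once an editor's link is saved in the cross-reference file, the same variant form in another database resolves at the second step without further manual work.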

LTI Authority Record Linking

LTI maintains a proprietary file of name and subject authority record headings currently numbering 2.4 million. These authority records, established in accord with RDA and LC cataloging practice, are created from validated but unlinked access points that have appeared in library databases. The purpose of LTI authority records is to provide consistency in the absence of a nationally distributed authority record, and reduce the number of unlinked library headings that need to be reviewed by LTI editors. By eliminating headings that have already been reviewed in a prior job, editors are able to focus on the unlinked headings that can most benefit from review. Libraries that plan to examine the unlinked headings report are also helped by exclusion of these valid headings from that report.

To qualify as an LTI authority record heading, the heading must meet four criteria:

  1. it has been searched thoroughly but not found in the LC authority files
  2. it is coded properly (tags, indicators, and subfield codes)
  3. it conforms to national cataloging standards
  4. it is sufficiently unique that it is unlikely to represent more than one entity

If an LC authority record is later distributed for the heading, the LTI authority record is deleted. Library access points are matched against LTI authority record headings only after they have failed to link to an LC authority record. In a typical database, LTI authority records validate 3% to 4% of the headings and most of these represent no change to the source heading—i.e., the access point in the library's bibliographic record is identical to the LTI authority record heading.

Review of Unlinked Headings

After all possible machine matches have been made, LTI editors selectively search unlinked headings in the LC name and/or subject authority file. Typical targeted headings include those which occur with a high frequency and those with illegal data, whether characters or subfields. Editors have access to a fully indexed version of the library's database and can display bibliographic records to resolve ambiguous headings.

Incorrect tags and subfield codes, typographical errors, omitted or incorrect dates, and related problems are corrected manually and resubmitted for relinking. In some cases, the heading alone is ambiguous, matching 4XX fields on several different authority records, as with initialisms and acronyms. If all the bibliographic headings in fact represent a single entity, editors temporarily force the correct link for that specific database.