University of Virginia
University of Virginia Library
Digital Initiatives: Metadata

Metadata Decisions and Best Practices


 

Coding set (i.e. collection) identification

The MSG resolved to avoid semantically loaded terms, and agreed to use "set" rather than "collection" to describe the various types of aggregated materials.

  • Sets can consist of the following:
    • Materials purchased from a vendor as concrete units
    • Materials created by the Library (or faculty projects) as concrete units
    • Materials brought together by the Library as being usefully related but not inherently related to each other
  • Materials inherently related to each other by their bibliographic nature are considered series (i.e. electronic texts bearing a series statement)
  • The set code only serves to link the individual object back to the collection object. There is no hierarchical relationship implied in the <set> elements. All hierarchy will be assumed by the collection objects.
DescMeta

<surrogate>
<set code="UVA-LIB-ArtArchit"/>
</surrogate>

GDMS

<gdmshead>
<filedesc>
<setstmt>
<set code="UVA-LIB-ArtArchit"/>
</setstmt>
</filedesc>
</gdmshead>

TEI

<filedesc>
<seriesStmt>
<title level="s">University of Virginia, Modern English collection</title>
<idno type="uva-set">UVA-LIB-ModEngl</idno>
</seriesStmt>
</filedesc>

Set code conventions

  • Set codes will begin with UVA-LIB for any sets created or collected by the Library. This includes faculty projects that have been selected and collected by the Library.
  • For vendor collections, set codes will begin with standard abbreviations in all caps (e.g. SI-SAAM for the Smithsonian Institution, Smithsonian American Art Museum)
  • For the remainder of the set code, abbreviations should be pulled, where possible, from the standardized List of Title Word Abbreviations (which follows the ISO 4 standard, Rules for the abbreviation of title words and titles of publications). Follow its abbreviation conventions, but not its punctuation or capitalization rules.
  • There should be a central authority for determining the "official" name of a set (regardless of which set category it falls into).
  • Once the official name has been determined, the code should be prefixed as above, followed by a hyphen, followed by the standardized abbreviation with all words strung together as a compound word.
  • The first letter of each abbreviated word should be capitalized (e.g. ArtArchit).
  • All set codes must be unique.
  • Given the above formulation, the phase 2 sets are:
    • The Art and Architecture collection: UVA-LIB-ArtArchit
    • The Barcelona collection: UVA-LIB-Barcelona
    • The Architecture of Jefferson Country: UVA-LIB-ArchitJeffCtry
    • The Catlin collection: SI-SAAM-CatlinIndianPaint (from The Smithsonian American Art Museum Catlin Indian Paintings Collection)
    • The Fowler collection: UCLA-FOWLER-AfrArt
    • The Modern English Text collection: UVA-LIB-ModEngl
    • The Finding Aids collection: UVA-LIB-FindAids
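The conventions above can be sketched as a small helper. This is only an illustration, not the Library's actual tooling: the function names `make_set_code` and `register_set_code` are hypothetical, and the caller is assumed to supply abbreviations already looked up in the List of Title Word Abbreviations (the sample inputs below are simply taken from the phase 2 codes listed above).

```python
import re


def make_set_code(prefix, abbreviated_words):
    """Build a set code from a prefix and a list of abbreviated title words.

    Hypothetical helper: the abbreviations themselves are assumed to come
    from the List of Title Word Abbreviations and are supplied by the caller.
    """
    parts = []
    for word in abbreviated_words:
        # Drop punctuation such as the trailing period on "archit."
        cleaned = re.sub(r"[^A-Za-z0-9]", "", word)
        if cleaned:
            # Capitalize the first letter and leave the rest unchanged
            parts.append(cleaned[0].upper() + cleaned[1:])
    if not parts:
        raise ValueError("at least one abbreviated word is required")
    # Prefix, hyphen, then all words strung together as a compound word
    return prefix + "-" + "".join(parts)


def register_set_code(code, registry):
    """Add a code to a set of known codes, enforcing uniqueness."""
    if code in registry:
        raise ValueError("set code already in use: " + code)
    registry.add(code)
    return code


if __name__ == "__main__":
    known = set()
    print(register_set_code(make_set_code("UVA-LIB", ["art", "archit."]), known))
    print(register_set_code(make_set_code("UVA-LIB", ["mod.", "engl."]), known))
```

Keeping the uniqueness check separate from code construction reflects the idea of a central authority that approves names before codes are assigned.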

 

Descriptive Metadata for Images

Image objects will only contain:

  • Their parent pointer: <idno type="parent">
  • A label:
    • For page images (based on the existence of a pb tag), the label will be:

      book title, page [value of n= (page number)].

    • For figures (based on the existence of a fig tag), the label will be:

      book title, [fig caption]

    Some figures have extensive captions. For phase 2, we will grab the entire caption. If this turns out to be unwieldy, we'll consider limiting captions to a fixed number of characters in phase 3.

  • Their rights information. Page images can inherit their rights from their parent, but the images referred to by GDMS objects must know their individual rights.

This descriptive metadata for image objects will not populate the discovery index and full descriptive metadata will be inherited from the parent on demand.
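The label rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `image_label` and its parameters are hypothetical, and `max_caption_chars` only models the possible phase 3 caption limit, which has not been decided.

```python
def image_label(book_title, page_number=None, fig_caption=None,
                max_caption_chars=None):
    """Build the label for an image object.

    Hypothetical helper: page_number stands for the n= value of a pb tag,
    fig_caption for the caption of a fig tag.
    """
    if page_number is not None:
        # Page image: "book title, page [n]"
        return "{}, page {}".format(book_title, page_number)
    if fig_caption is not None:
        caption = fig_caption
        # Phase 2 keeps the whole caption; truncation is an optional
        # assumption about how a later character limit might work
        if max_caption_chars is not None:
            caption = caption[:max_caption_chars]
        return "{}, {}".format(book_title, caption)
    raise ValueError("either page_number or fig_caption is required")


if __name__ == "__main__":
    # Sample values are invented for illustration
    print(image_label("Sample Book", page_number="12"))
    print(image_label("Sample Book", fig_caption="View of the Rotunda"))
```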

 

Where to store PIDS in the "master" metadata records

All "master" records should contain their Fedora PIDS. When files are spun off from their masters for archiving or other purposes, their PIDS will already be embedded in the metadata.

TEI

<fileDesc>
<publicationStmt>
<idno type="uva-pid">

GDMS

<gdmshead>
<idno type="uva-pid">

EAD

<filedesc>
<publicationstmt>
<num type="uva-pid">

TIFF

In the TIFF dump (for phase 2), not the actual TIFF headers

 

Dealing with TEI headers that represent serials & monographic sets

The challenge: Each issue/volume of a multi-volume publication is a separate file, will be a separate Fedora Object, and needs a separate TEI header.  We can create alternate titles in the TEI to ease searchability, but we don't want to populate the discovery index with each volume's metadata.  To achieve this end:

Individual volume/issue headers

  • Each individual issue/volume has its own TEI header, which describes the issue/volume in hand.
  • Each individual TEI header, by extraction from VIRGO, has an element: <idno type="UVa Title Control Number">
  • Each individual TEI header has an element identifying the "form" of the item. This is coded as follows:

    <profileDesc>
    <keywords scheme="uva-form">
    <term>periodical issue</term>
    </keywords>
    </profileDesc>

    The scheme "uva-form" must also be declared:

    <classDecl>
    <taxonomy id="uva-form">
    <bibl>UVa Library Form Categories</bibl>
    </taxonomy>
    </classDecl>

    The uva-form keyword scheme is a locally developed thesaurus. Valid terms for this scheme currently are:

    article
    broadside
    manuscript
    monograph
    monographic set
    monographic volume
    newspaper
    newspaper issue
    periodical
    periodical issue
    periodical volume
    serial
    serial volume

    Please contact Erin Stalberg, MSG Chair, if you need additional terms to be added to this list.

    An additional advantage of using <keywords scheme="uva-form"> in this way is that the digital library will be able to group hits based on particular uva-form values. The user will be able to scan a hit list and have their hits grouped by uva-form, e.g. first the monograph hits and then the periodical hits.

  • We modified the TEI DTD to be able to use <biblScope> within <fileDesc> and within <sourceDesc><biblFull>

    <fileDesc>
    <titleStmt>
    <title n="245|a" type="main">The Cavalier Daily</title>
    <biblScope type="volume"><num value="79">79th Year</num></biblScope>
    <biblScope type="issue"><num value="2">Number 2</num></biblScope>
    <biblScope type="date">
    <date value="1968-09-13">Friday, September 13, 1968</date>
    </biblScope>
    </titleStmt>
    </fileDesc>

    and

    <sourceDesc>
    <biblFull>
    <titleStmt>
    <title n="245|a" type="main">The Cavalier daily</title>
    <biblScope type="volume"><num value="79">79th Year</num></biblScope>
    <biblScope type="issue"><num value="2">Number 2</num></biblScope>
    <biblScope type="date">
    <date value="1968-09-13">Friday, September 13, 1968</date>
    </biblScope>
    </titleStmt>
    </biblFull>
    </sourceDesc>
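A header's uva-form term could be checked against the list of valid terms given above with a sketch like the following. It assumes headers without XML namespaces, as in the snippets shown here, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Valid terms copied from the uva-form list in this document
UVA_FORM_TERMS = {
    "article", "broadside", "manuscript", "monograph", "monographic set",
    "monographic volume", "newspaper", "newspaper issue", "periodical",
    "periodical issue", "periodical volume", "serial", "serial volume",
}


def uva_form_terms(header_xml):
    """Return every <term> found inside <keywords scheme="uva-form">."""
    root = ET.fromstring(header_xml)
    terms = []
    for keywords in root.iter("keywords"):
        if keywords.get("scheme") == "uva-form":
            for term in keywords.iter("term"):
                terms.append((term.text or "").strip())
    return terms


def invalid_uva_form_terms(header_xml):
    """Return the uva-form terms that are not in the valid list."""
    return [t for t in uva_form_terms(header_xml) if t not in UVA_FORM_TERMS]


if __name__ == "__main__":
    sample = (
        '<profileDesc><keywords scheme="uva-form">'
        "<term>periodical issue</term>"
        "</keywords></profileDesc>"
    )
    print(uva_form_terms(sample))
    print(invalid_uva_form_terms(sample))
```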

Standalone headers

  • Standalone TEI headers are created to represent the serial or monographic set as a whole. The UVa standalone header is based on the concept of TEI Independent Headers (see the TEI website). We have not used the practice as written in TEI, however, because the TEI Independent Header DTD does not allow for accompanying extension files as the normal TEI DTD does. We have adapted the concept with local modification.
  • Standalone TEI headers also include the <keywords scheme="uva-form"> element describing the parent. For example:

    Individual TEI header (child)

    <keywords scheme="uva-form">
    <term>periodical issue</term>
    </keywords>

    Standalone TEI header (parent)

    <keywords scheme="uva-form">
    <term>periodical</term>
    </keywords>

    "uva-form" is also declared in the <classDecl> for the standalone header as described above.

  • Each iteration of a title change will have a separate Standalone Header.
  • The digital library software will link the Standalone (parent) headers to the Individual (children) headers by the value of the UVa Title Control Number idno.
  • Only the Standalone Header will populate its metadata to the digital discovery index. Therefore, when searching the digital discovery index, the user will first locate the parent and then find all its children (or relations, in the case of title changes). When searching the full-text TEI index, the user will be able to discover both the parent and children records separately.
  • A new Fedora content model will be developed.
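The parent/child linking described above might look roughly like this. It is a sketch only, not the digital library software's implementation: headers are modeled as plain dictionaries, the keys `title_control_number`, `standalone`, and `title` are hypothetical, and the control number in the example is invented.

```python
from collections import defaultdict


def link_headers(headers):
    """Group headers by UVa Title Control Number.

    Each header is assumed to be a dict with the hypothetical keys
    "title_control_number", "standalone" (bool), and "title".
    """
    linked = defaultdict(lambda: {"parent": None, "children": []})
    for header in headers:
        group = linked[header["title_control_number"]]
        if header["standalone"]:
            group["parent"] = header
        else:
            group["children"].append(header)
    return dict(linked)


def discovery_index_entries(linked):
    """Only standalone (parent) headers feed the discovery index."""
    return [g["parent"] for g in linked.values() if g["parent"] is not None]


if __name__ == "__main__":
    headers = [
        {"title_control_number": "x0001", "standalone": True,
         "title": "The Cavalier daily"},
        {"title_control_number": "x0001", "standalone": False,
         "title": "The Cavalier daily, 79th Year, Number 2"},
    ]
    linked = link_headers(headers)
    print(len(linked["x0001"]["children"]))
    print([h["title"] for h in discovery_index_entries(linked)])
```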

 

TEI sort titles

The MARC-to-TEI script generates sort titles based on the indicator values in the MARC record's 245 tag. The script also normalizes the data based on the NACO Normalization Rules and converts all characters to lower case. The DL will sort based on the data in the element <title type="sort"> and display on the element <title type="main">.

MARC

245 04 $a The Cavalier daily

TEI

<title n="245|a" type="main">The Cavalier daily</title>
<title type="sort">cavalier daily</title>
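The sort-title step can be sketched as below. This is not the MARC-to-TEI script itself: the function name is hypothetical, and the punctuation and whitespace handling is a deliberately simplified stand-in for the NACO Normalization Rules, which cover many more cases (diacritics, special characters, and so on). The second indicator of the 245 tag gives the number of nonfiling characters to skip.

```python
import re


def sort_title(title, nonfiling_chars):
    """Derive a sort title from a MARC 245 $a value.

    nonfiling_chars is the second indicator of the 245 tag, i.e. the
    number of leading characters (such as "The ") to skip when filing.
    """
    filing = title[nonfiling_chars:]
    lowered = filing.lower()
    # Simplified normalization (assumption): punctuation becomes a space
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    # Collapse runs of whitespace and trim the ends
    return " ".join(no_punct.split())


if __name__ == "__main__":
    # 245 04 $a The Cavalier daily
    print(sort_title("The Cavalier daily", 4))
```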

 

Overview of IRIS import/export processes

PDF documents created by Jack Kelly to document his workflow for importing data into and exporting data from IRIS using Perl and AppleScript.

IRIS-GDMS
Data Import into IRIS

 

Access Rights

The level of access that a member of the UVa community or the general public can have to this resource. Currently there are only four valid values.

Machine processable data    Display values
public                      Publicly accessible
uva                         Accessible to UVa community only
viva                        Accessible to VIVA community only
restricted                  Restricted to Library staff for management only

Note: VIVA is the Virtual Library of Virginia. UVa hosts a number of resources on their behalf.

  • Access Rights must conform to one of the above. If you have additional restrictions not accounted for above, please contact Erin Stalberg, MSG Chair.
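A check that an access rights value is one of the four listed above could be sketched as a simple lookup. The mapping is taken from the table in this section; the function name `access_display_value` is hypothetical.

```python
# Machine-processable values and their display values, from the table above
ACCESS_RIGHTS = {
    "public": "Publicly accessible",
    "uva": "Accessible to UVa community only",
    "viva": "Accessible to VIVA community only",
    "restricted": "Restricted to Library staff for management only",
}


def access_display_value(code):
    """Return the display value for a code, rejecting unknown codes."""
    try:
        return ACCESS_RIGHTS[code]
    except KeyError:
        raise ValueError("unknown access rights value: {!r}".format(code)) from None


if __name__ == "__main__":
    print(access_display_value("viva"))
```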

 

Digital Initiatives
University of Virginia
PO Box 400112
Charlottesville, VA 22904-4112

Maintained by: dl@virginia.edu
Last Modified: Monday, June 02, 2008
© The Rector and Visitors of the University of Virginia