University of Virginia
University of Virginia Library
Digital Initiatives: Reports

Digital Library Metadata Review and Planning Group Report

Ann Whiteside and Janis Kessler (co-chairs), Beth Blanton-Kent, Sherry Lake, Mary Prendergast, Andrew Rouner, Judith Thomas
May 23rd, 2003

The charge of the Metadata Committee was two-fold:

1.  Review the metadata sets that have been developed for the UVa digital library effort.  The metadata is organized in five sections that correspond to the descriptive and administrative sections of the Metadata Encoding and Transmission Standard (METS). They are: descriptive, digital provenance, rights, source, and technical; the last four are grouped as "administrative". The group is to review the details of all five areas, with particular attention to the minimum requirements in each area. The Digital Library Research and Development (DLRD) group will present the current metadata elements and discuss the reasoning behind them when the Digital Library Metadata Review and Planning (DLMRP) group begins meeting.

2.  Develop recommendations for any changes necessary to assure that our approach both meets our needs and conforms to community standards. The group should recommend a plan for creating a set of content-area-specific guidelines for best practice in the application of the five types of metadata, with particular attention to the descriptive and technical metadata needs of each area.  Recommendations should include: the process and format for developing the guidelines; the set of content areas that will be addressed; and designated ownership for developing the guidelines for each area.

The group first met in December 2002 to begin work.  At the first meeting, our charge was changed to include a review of the current DescMeta metadata set and the emerging FRBR schema for which a uvametadata DTD had been devised.  This necessitated an investment of several months in collectively gaining an understanding of FRBR, what its parameters are, how it might work, and whether or how we could envision implementing a FRBR model (uvametadata) for the UVa Digital Library.

METADATA MODEL ANALYSIS

After several meetings with Perry Roland and Thorny Staples, the group worked on its own to sort out the issues that arose as a result of the comparison of the DescMeta and FRBR/uvametadata models.

One of the basic problems with our revised charge has been the difficulty inherent in choosing a DTD (DescMeta or FRBR/uvametadata) based solely on a theoretical understanding of its structure.  The table below shows the pros and cons of DescMeta and FRBR/uvametadata that the group members identified.

| Criteria | DescMeta Pro | DescMeta Con | FRBR/uvametadata Pro | FRBR/uvametadata Con |
|---|---|---|---|---|
| Mappability | A direct map because the elements are a single, repeatable set | | | Not clear to see what level fields should map to. OCLC’s Humphry Clinker report indicates that automatic mapping can be done, but without looking at the item it was not possible to tell when something was an expression or a manifestation. |
| Cataloging and time issues | We know how to catalog an object or collection for DescMeta | | | From our review of the literature, it looks to be cataloging-intensive. It could require the cataloger to create four separate records (for one item) to fully describe the item in FRBR terms. |
| Better/richer/more profitable discovery** | Can get a rich level of discovery if the cataloger creates relational links | | Allows for richer, more complex levels of discovery | |
| Which has a more complete standard?** | | Does not mandate relationships between works | Mandates complete cataloging in terms of explicit relationships | We do not know how the authority work for links functions |
| Relation to other standards | Based on Dublin Core, which is flexible | DC extensions are arbitrary | Has a hierarchical structure, with required and restricted elements | Not yet a national standard, but is being looked at for ISO standardization |
| Use | Already in use at UVa | | Cataloging using the FRBR model is included in Virtua ILS, an integrated library system | Not known to be in use by any organization at this time |
| Scalability | | We assume DescMeta is scalable, but have not tested this | It may be that FRBR is a more scalable model, but we have not seen this tested | |
| Hierarchy | Can create hierarchy | Is dependent upon the cataloger to provide hierarchy | Attributes of FRBR are meant to create possibilities for automatic hierarchy creation | |

**Notes:

1. A cataloger has to create relational links in either DescMeta or FRBR.

2. Choosing a standard that is incomplete makes it difficult to judge.
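To make the "four separate records (for one item)" concern concrete, here is a minimal Python sketch of the four FRBR Group 1 entities (work, expression, manifestation, item) as linked records. The class names, field names, and placeholder values are illustrative assumptions; they are not taken from the uvametadata DTD.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar, e.g. one digitized copy."""
    identifier: str


@dataclass
class Manifestation:
    """A physical or digital embodiment, e.g. a particular edition."""
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of the work, e.g. a particular text or translation."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_records(work: Work) -> int:
    """Count the records needed to describe a work at all four levels."""
    total = 1  # the work record itself
    for expression in work.expressions:
        total += 1
        for manifestation in expression.manifestations:
            total += 1 + len(manifestation.items)
    return total


# Placeholder data: one work, one expression, one manifestation, one item.
work = Work(
    title="Placeholder title",
    expressions=[
        Expression(
            label="placeholder expression",
            manifestations=[
                Manifestation(
                    label="placeholder edition",
                    items=[Item(identifier="placeholder-copy-1")],
                )
            ],
        )
    ],
)
print(count_records(work))  # 4
```

In a flat model such as DescMeta the same object would be described by one record, with any relationships added as explicit links by the cataloger.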

Other issues the DLMRP group has considered are:

Work has already been done using DescMeta, and some evaluation needs to be made about how much work would need to go into changing to another DTD.  The group does not feel it has enough information to address this issue.

OCLC’s experiments with FRBR have led them to the conclusion that “Bibliographic records do not contain sufficient information to reliably identify expressions.”  For this reason, they have suggested replacing expressions with additional manifestation attributes.  (Humphry Clinker project, http://www.oclc.org/research/projects/frbr/clinker)

Another major concern for several members of the group is the extreme focus of FRBR on printed matter.  Texts are not cataloged in the same manner as other media – images, film, and cartographic documents.  To date, experimentation with FRBR has focused only on books and the MARC record.

The main problems that we encountered will not necessarily be solved by choosing a particular DTD. They will be solved by establishing consensus and guidelines on best practices, i.e. by defining what the various metadata categories mean in the context of actual Library projects.

RECOMMENDATION

The DLMRP Group recommends that UVa adopt DescMeta as its metadata standard at this time.

We further recommend that an appointed group look into conducting a pilot project that will (1) test the implementation of FRBR-based metadata on new data and, if successful, (2) look into the question of migrating current data from DescMeta to uvametadata.  Such a pilot would involve testing various types of materials (text, images, film, music scores, cartographic materials, etc.).  A pilot would take some time, but would answer some of the questions raised by this group.

MINIMUM METADATA ELEMENTS COVERING ALL DIGITAL OBJECT TYPES

Descriptive Metadata

Definitions are provided, though the wording of the definitions may vary depending upon the media type.

Element:  Title

            Title proper

The title of the work transcribed exactly as to wording, order and spelling from the resource.

Element:  Agent

            Attribute:  Role

                        Value:  (Select - e.g. author)

First statement of responsibility if different from main entry or if no main entry heading; Person or entity responsible for the work “appearing” prominently on the item.

            Statements should be included that are of “bibliographic significance.”

Element:  Relation

            Element:  Bibref

                        Element:  Version

            Edition statement

                        Edition, issue, version as found on item

Element:  Mediatype

            Element:  Form

            Material (or type of publication) specific details

Used for images, cartographic materials, music, computer files (including numeric and statistical data), serial publications and in some cases microforms.

Element:  Agent

            Attribute:  Role

                        Value:  Publisher

            Name of publisher, distributor, etc.

Element:  Date

            Attribute:  Type

                        Value:  Published

            Date of publication or creation

Element:  Physdesc

            Attribute:  Type

                        Value:  Extent

            The extent of the item giving the number of physical units (e.g. p., leaves, maps, file size)

Element:  Desc

            Attribute:  Type  (select note, abstract, etc.)

            Notes qualify or amplify the formal description.

Element:  Identifier

            Standard number

International Standard Book Number (ISBN) or International Standard Serial Number (ISSN) or any other internationally agreed standard number for the item.

Element:  Covtime

            Time period covered by the content of the resource.  Describes the temporal

            characteristics of the content of the resource.

Element:  Covplace

            Element: boundedby

The bounding box of an item is defined by four boundaries, usually two latitudes and two longitudes.  Used for cartographic material.

Element:  Culture

            The culture of origin or context for a given resource.

Element:  Language

The language(s) of the intellectual content of the digital resource (language(s) in which the text is written or the spoken language(s) of an audio or video resource). Visual images do not usually have a language unless there is significant text in a caption or in the image itself.
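As an illustration of how the descriptive elements above might fit together in one record, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not the DescMeta DTD: the root element name, the lowercase tag spellings, the nesting, and the north/south/east/west attribute names on boundedby are all assumptions made for the example, and every value is a placeholder.

```python
import xml.etree.ElementTree as ET


def bounding_box(north, south, east, west):
    """Build a covplace/boundedby element after basic range checks."""
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie within [-180, 180]")
    covplace = ET.Element("covplace")
    # Attribute names are hypothetical; the report only states that the
    # box is defined by two latitude and two longitude values.
    ET.SubElement(covplace, "boundedby", {
        "north": str(north), "south": str(south),
        "east": str(east), "west": str(west),
    })
    return covplace


def build_record(title, agents, published, language, bbox=None):
    """Assemble a small descriptive record from a few of the elements."""
    record = ET.Element("descmeta")  # hypothetical root element name
    ET.SubElement(record, "title").text = title
    for name, role in agents:
        # Agent is repeated, with the role carried as an attribute.
        ET.SubElement(record, "agent", {"role": role}).text = name
    ET.SubElement(record, "date", {"type": "published"}).text = published
    ET.SubElement(record, "language").text = language
    if bbox is not None:
        record.append(bounding_box(*bbox))
    return record


record = build_record(
    title="Placeholder map title",
    agents=[("Placeholder Cartographer", "author"),
            ("Placeholder Press", "publisher")],
    published="1900",
    language="English",
    bbox=(39.5, 36.5, -75.2, -83.7),  # placeholder coordinates
)
print(ET.tostring(record, encoding="unicode"))
```

Reusing one agent element with a role attribute, rather than separate author and publisher elements, mirrors the way Agent appears twice in the list above with different Role values.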

Administrative Metadata

Element:  Digiprov

Contains elements to record information regarding the ultimate origin of a digital object and the derivation of its current elements.

            Element:  date

                        Date associated with an event in the life cycle of the resource.

            Element:  agent

                        An entity primarily responsible for an event in the life cycle of a digital resource.

            Element:  description

                        A short text that describes such things as the meaning, history or appearance.

Element:  Source (see descriptive metadata)

Element:  adminrights

Contains elements to record information regarding access and use of a digital resource.

            Element:  policy

Element:  Technical

A wrapper element which contains elements used to describe the technical specifications of a digital resource.

            Element:  image

                        Contains technical metadata that describe a digital image.

                        Element:  compression

                                    The type and amount of digital compression e.g. Predictive - 10:1, RLE - 2:1.

                        Element:  format

Information about the segmentation (tile/strip) and orientation of the image.

                        Element:  spatialmetrics

                                    Information about the dimensions of the image.

                        Element:  energetics

                                    Information about the pixel characteristics e.g. chromaticities, color map, response curve, etc.

            Element:  text

                        Contains technical metadata that describe a digital text object.

                        Element:  encoding

                                    Character encoding scheme used in the text object e.g. ASCII

                        Element:  markup

                                    The type of markup language used to encode the text object, e.g. SGML, XML, HTML, etc.

                        Element:  note

                                    Contains a note or annotation.

                        Element:  ocr

                                    The type of OCR software used to produce the text object.

                        Element:  word processor

The type/version of word processor product used, e.g. Word, WordPerfect, etc.

            Element:  audio

                        Contains technical metadata that describe a digital audio object.

                        Element:  bits per sample

                                    The number of bits in a digital audio sample (i.e. quantization), e.g. 16, 24.

                        Element:  channel

                                    Number and information about channels/tracks (e.g., 2-trk, 4-trk, 8-trk, etc.)

                        Element:  audio data rate

                                    Information about the mode and data rate of audio files in Kb/s e.g. 16, 44.1, 96 etc.

                        Element:  audio duration

                                    Duration of audio source material in time i.e. HH:MM:SSSS format.

                        Element:  audio sampling frequency

                                    The rate at which the audio was sampled e.g. 44.1KHz, 96KHz, etc.

                        Element:  audio sound field

                                    Aural space on source recording, e.g., monaural, stereo, surround sound, etc

            Element:  video

                        Contains technical metadata that describe a digital video object.

                        Element:  color

                                    Information describing color characteristics and specifications.

                        Element:  compression

                                    The type and amount of digital compression e.g. Predictive - 10:1, RLE - 2:1.

                        Element:  data rate

                                    The data rate of the video source item in Mb/s e.g. 4.0, 8.25, 100.0, etc.

                        Element:  duration

                                    Duration of video source item in time i.e. HH:MM:SSSS format.

                        Element:  frame

                                    The number of frames and frame rate of video source item.

                        Element:  resolution

                                    The horizontal and vertical dimensions in pixels and aspect ratio of the frame.

                        Element:  sound field

                                    The digital sound format used in the video source item; mono, stereo, DTS, etc.

                        Element:  video format

                                    Information describing the format specifications of the video.

            Element:  geospatial

                        Contains technical metadata that describe cartographic objects.

Need elements for 1) point-and-vector information or raster information 2) Spatial Reference – coordinate system

            Element:  stats

                        Contains technical metadata that describe numeric/data objects.

Need elements for file description and maybe software needed to use the data
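Several of the audio elements listed under Technical are related to one another. As a hedged illustration, the sketch below assumes uncompressed PCM audio, for which the data rate is bits per sample times number of channels times sampling frequency. The class and field names are hypothetical and are not part of the metadata set; compressed formats would not follow this formula.

```python
from dataclasses import dataclass


@dataclass
class AudioTech:
    """Hypothetical container for three of the audio technical elements."""
    bits_per_sample: int          # e.g. 16 or 24
    channels: int                 # e.g. 2 for stereo
    sampling_frequency_hz: int    # e.g. 44100 for 44.1 kHz

    def pcm_data_rate_kbps(self) -> float:
        """Data rate in kb/s, valid only for uncompressed PCM audio."""
        bits_per_second = (
            self.bits_per_sample * self.channels * self.sampling_frequency_hz
        )
        return bits_per_second / 1000


stereo_16_bit = AudioTech(bits_per_sample=16, channels=2,
                          sampling_frequency_hz=44_100)
print(stereo_16_bit.pcm_data_rate_kbps())  # 1411.2
```

A check of this kind could flag records whose stated data rate disagrees with the other recorded values, though whether such a check is wanted is a local policy decision.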

MINIMUM METADATA ELEMENTS REQUIRED FOR ALL DIGITAL OBJECTS

Identifier

            Standard number

International Standard Book Number (ISBN) or International Standard Serial Number (ISSN) or any other internationally agreed standard number for the item.

Title

            Title proper

The title of the work transcribed exactly as to wording, order and spelling from the resource.

Agent

            Attribute:  Role

                        Value:  (Select - e.g. author)

First statement of responsibility if different from main entry or if no main entry heading;

Person or entity responsible for the work “appearing” prominently on the item.

            Statements should be included that are of “bibliographic significance.”

Mediatype

            Element:  Form

            Material (or type of publication) specific details

Used for images, cartographic materials, music, computer files (including numeric and statistical data), serial publications, and in some cases microforms.

Date

            Attribute:  Type

                        Value:  Published

            Date of publication or creation, time period date, etc.

Desc

            Attribute:  Type  (select note, abstract, etc.)

            Notes qualify or amplify the formal description.

digiprov

Contains elements to record information regarding the ultimate origin of a digital object and the derivation of its current elements.

            Element:  date

                        Date associated with an event in the life cycle of the resource.

adminrights

Contains elements to record information regarding access and use of a digital resource.

            Element:  policy
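A list of required elements like the one above lends itself to an automated completeness check. The following Python sketch shows one possible approach, assuming records are XML and that tags are spelled in lowercase exactly as in the tuple below; both are assumptions for illustration, as are the record root name and the placeholder values.

```python
import xml.etree.ElementTree as ET

# Required element names, taken from the list above (lowercased here).
REQUIRED = ("identifier", "title", "agent", "mediatype",
            "date", "desc", "digiprov", "adminrights")


def missing_required(record):
    """Return the required tag names that appear nowhere in the record."""
    present = {element.tag for element in record.iter()}
    return [tag for tag in REQUIRED if tag not in present]


sample = ET.fromstring(
    "<record>"
    "<identifier>placeholder-id</identifier>"
    "<title>Placeholder title</title>"
    "<agent role='author'>Placeholder Name</agent>"
    "<date type='published'>1900</date>"
    "</record>"
)
print(missing_required(sample))
# ['mediatype', 'desc', 'digiprov', 'adminrights']
```

This only tests for presence. It does not check attribute values such as the agent role or the date type, which a DTD or a fuller validator would need to cover.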

PLAN FOR CREATING A SET OF CONTENT-AREA-SPECIFIC GUIDELINES FOR BEST PRACTICE

We recommend the development of best practices, based on the needs and requirements of specific content areas, to address a number of issues.  For e-texts, there do not seem to be consistent standards for indicating series information, collection information, or project information.  For images, there are no guidelines for what constitutes a collection or what the different levels of “collection” might be; guidelines for metadata creation also need to be established.  These issues cut across most types of material that will be entering the Digital Library from many units.

The DLMRP Group suggests that several small task forces or working groups be created to look at best practices and to develop guidelines for each of the following areas: text, images, film and video, music scores and sound recordings, cartographic materials, and statistics.  The groups should include no more than three or four people who will look at current practices in their fields and suggest methods for translating these to the Digital Library efforts.  Each group will issue a report with suggested guidelines.

Those who have participated in the DLMRP Group would be good candidates for the groups in their respective areas of expertise.

Digital Initiatives
University of Virginia
PO Box 400112
Charlottesville, VA 22904-4112

Maintained by: dl@virginia.edu
Last Modified: Monday, August 03, 2009
© The Rector and Visitors of the University of Virginia