Digital Library Metadata Review and Planning Group Report
Ann Whiteside and Janis Kessler (co-chairs), Beth Blanton-Kent, Sherry Lake, Mary Prendergast, Andrew Rouner, Judith Thomas
May 23rd, 2003
The charge of the Metadata Committee was two-fold:
1. Review the metadata sets that have been developed for the UVa digital library effort. The metadata is organized in five sections that correspond to the descriptive and administrative sections of the Metadata Encoding and Transmission Standard (METS). They are: descriptive, digital provenance, rights, source, and technical; the last four are grouped as "administrative". The group is to review the details of all five areas, with particular attention to the minimum requirements in each area. The Digital Library Research and Development (DLRD) group will make a presentation of the current metadata elements and discuss the reasoning behind them when the DLMRP begins meeting.
2. Develop recommendations for any changes necessary to assure that our approach both meets our needs and conforms to community standards. The group should recommend a plan for creating a set of content-area-specific guidelines for best practice in the application of the five types of metadata, with particular attention to the descriptive and technical metadata needs of each area. Recommendations should include: the process and format for developing the guidelines; the set of content areas that will be addressed; and designated ownership for developing the guidelines for each area.
The group first met in December 2002 to begin work. At the first meeting, our charge was changed to include a review of the current DescMeta metadata set and the emerging FRBR schema, for which a uvametadata DTD had been devised. This necessitated an investment of several months in collectively gaining an understanding of FRBR: what its parameters are, how it might work, and whether or how we could envision implementing an FRBR model (uvametadata) for the UVa Digital Library.
METADATA MODEL ANALYSIS
After several meetings with Perry Roland and Thorny Staples, the group worked on its own to sort out the issues that arose as a result of the comparison of the DescMeta and FRBR/uvametadata models. One of the basic problems with our revised charge has been the difficulty inherent in making a choice on a DTD (DescMeta or FRBR/uvametadata) based solely on a theoretical understanding of the structure. Below is a table that shows the pros and cons of DescMeta and FRBR/uvametadata that the group members have devised.
**Notes:
1. A cataloger has to create relational links in either DescMeta or FRBR.
2. Choosing a standard that is incomplete makes it difficult to judge.
Other issues the DLMRP Group has considered are:
Work has already been done using DescMeta, and some evaluation needs to be made of how much work would be needed to change to another DTD. The group does not feel it has enough information to address this issue.
OCLC's experiments with FRBR have led it to the conclusion that bibliographic records do not contain sufficient information to reliably identify expressions. For this reason, OCLC has suggested replacing expressions with additional manifestation attributes (Humphry Clinker project, http://www.oclc.org/research/projects/frbr/clinker).
Another major concern for several members of the group is the extreme focus of FRBR on printed matter. Texts are not cataloged in the same manner as other media (images, film, and cartographic documents). To date, experimentation with FRBR has focused only on books and the MARC record.
The main problems that we encountered will not necessarily be solved by a particular DTD, but rather by establishing consensus and guidelines on best practices, i.e., defining what the various metadata categories mean in the context of actual Library projects.
RECOMMENDATION
The DLMRP Group recommends that UVa adopt DescMeta as its metadata standard at this time. We further recommend that an appointed group look into conducting a pilot project that will (1) test the implementation of FRBR-based metadata on new data and, if successful, (2) look into the question of migrating current data from DescMeta to uvametadata. Such a pilot would involve testing various types of materials (text, images, film, music scores, cartographic materials, etc.). A pilot would take some time, but would answer some of the questions raised by this group.
MINIMUM METADATA ELEMENTS COVERING ALL DIGITAL OBJECT TYPES
Descriptive Metadata
Definitions are provided, though the wording of the definitions may vary depending upon the media type.
Element: Title. Title proper. The title of the work, transcribed exactly as to wording, order, and spelling from the resource.
Element: Agent; Attribute: Role; Value: (select, e.g., author). First statement of responsibility, if different from the main entry or if there is no main entry heading. Person or entity responsible for the work, appearing prominently on the item. Statements of bibliographic significance should be included.
Element: Relation; Element: Bibref; Element: Version. Edition statement. Edition, issue, or version as found on the item.
Element: Mediatype; Element: Form. Material (or type of publication) specific details. Used for images, cartographic materials, music, computer files (including numeric and statistical data), serial publications, and in some cases microforms.
Element: Agent; Attribute: Role; Value: Publisher. Name of publisher, distributor, etc.
Element: Date; Attribute: Type; Value: Published. Date of publication or creation.
Element: Physdesc; Attribute: Type; Value: Extent. The extent of the item, giving the number of physical units (e.g., p., leaves, maps, file size).
Element: Desc; Attribute: Type (select note, abstract, etc.). Notes qualify or amplify the formal description.
Element: Identifier. Standard number. International Standard Book Number (ISBN), International Standard Serial Number (ISSN), or any other internationally agreed standard number for the item.
Element: Covtime. Time period covered by the content of the resource. Describes the temporal characteristics of the content of the resource.
Element: Covplace; Element: boundedby. The bounding box of an item is defined by four boundaries, usually latitude (2 points) and longitude (2 points). Used for cartographic material.
Element: Culture. The culture of origin or context for a given resource.
Element: Language. The language(s) of the intellectual content of the digital resource (the language(s) in which the text is written, or the spoken language(s) of an audio or video resource). Visual images do not usually have a language unless there is significant text in a caption or in the image itself.
Administrative Metadata
Element: Digiprov. Contains elements to record information regarding the ultimate origin of a digital object and the derivation of its current elements.
Element: date. Date associated with an event in the life cycle of the resource.
Element: agent. An entity primarily responsible for an event in the life cycle of a digital resource.
Element: description. A short text that describes such things as the meaning, history, or appearance.
Element: Source. See descriptive metadata.
Element: adminrights. Contains elements to record information regarding access and use of a digital resource.
Element: policy.
Element: Technical. A wrapper element that contains elements used to describe the technical specifications of a digital resource.
Element: image. Contains technical metadata that describe a digital image.
Element: compression. The type and amount of digital compression, e.g., Predictive - 10:1, RLE - 2:1.
Element: format. Information about the segmentation (tile/strip) and orientation of the image.
Element: spatialmetrics. Information about the dimensions of the image.
Element: energetics. Information about the pixel characteristics, e.g., chromaticities, color map, response curve, etc.
Element: text. Contains technical metadata that describe a digital text object.
Element: encoding. Character encoding scheme used in the text object, e.g., ASCII.
Element: markup. The type of markup language used to encode the text object, e.g., SGML, XML, HTML, etc.
Element: note. Contains a note or annotation.
Element: ocr. The type of OCR software used to produce the text object.
Element: word processor. The type/version of word processor product used, e.g., Word, WordPerfect, etc.
Element: audio. Contains technical metadata that describe a digital audio object.
Element: bits per sample. The number of bits in a digital audio sample (i.e., quantization), e.g., 16, 24.
Element: channel. Number of and information about channels/tracks (e.g., 2-trk, 4-trk, 8-trk, etc.).
Element: audio data rate. Information about the mode and data rate of audio files in Kb/s, e.g., 16, 44.1, 96, etc.
Element: audio duration. Duration of audio source material in time, i.e., HH:MM:SSSS format.
Element: audio sampling frequency. The rate at which the audio was sampled, e.g., 44.1 kHz, 96 kHz, etc.
Element: audio sound field. Aural space on source recording, e.g., monaural, stereo, surround sound, etc.
Element: video. Contains technical metadata that describe a digital video object.
Element: color. Information describing color characteristics and specifications.
Element: compression. The type and amount of digital compression, e.g., Predictive - 10:1, RLE - 2:1.
Element: data rate. The data rate of the video source item in Mb/s, e.g., 4.0, 8.25, 100.0, etc.
Element: duration. Duration of video source item in time, i.e., HH:MM:SSSS format.
Element: frame. The number of frames and frame rate of video source item.
Element: resolution. The horizontal and vertical dimensions in pixels and aspect ratio of the frame.
Element: sound field. The digital sound format used in the video source item: mono, stereo, DTS, etc.
Element: video format. Information describing the format specifications of the video.
Element: geospatial. Contains technical metadata that describe cartographic objects. Need elements for (1) point-and-vector information or raster information and (2) spatial reference coordinate system.
Element: stats. Contains technical metadata that describe numeric/data objects. Need elements for file description and maybe software needed to use the data.
MINIMUM METADATA ELEMENTS REQUIRED FOR ALL DIGITAL OBJECTS
Identifier. Standard number. International Standard Book Number (ISBN), International Standard Serial Number (ISSN), or any other internationally agreed standard number for the item.
Title. Title proper. The title of the work, transcribed exactly as to wording, order, and spelling from the resource.
Agent; Attribute: Role; Value: (select, e.g., author). First statement of responsibility, if different from the main entry or if there is no main entry heading. Person or entity responsible for the work, appearing prominently on the item. Statements of bibliographic significance should be included.
Mediatype; Element: Form. Material (or type of publication) specific details. Used for images, cartographic materials, music, computer files (including numeric and statistical data), serial publications, and in some cases microforms.
Date; Attribute: Type; Value: Published. Date of publication or creation, time period date, etc.
Desc; Attribute: Type (select note, abstract, etc.). Notes qualify or amplify the formal description.
digiprov; Element: date. Date associated with an event in the life cycle of the resource. Contains elements to record information regarding the ultimate origin of a digital object and the derivation of its current elements.
adminrights; Element: policy. Contains elements to record information regarding access and use of a digital resource.
PLAN FOR CREATING A SET OF CONTENT-AREA-SPECIFIC GUIDELINES FOR BEST PRACTICE
We recommend the development of best practices, based on the needs and requirements of specific content areas, to address a number of issues. For e-texts, there do not seem to be consistent standards for indicating series information, collection information, or project information.
For images, there are no guidelines for what constitutes a collection or what the different levels of collection might be; guidelines for metadata creation also need to be established. These issues cut across most types of material that will be entering the Digital Library from many units.
The DLMRP Group suggests that several small task forces or working groups be created to look at best practices and to develop guidelines for each of the following areas: text, images, film and video, music scores and sound recordings, cartographic materials, and statistics. Each group should include no more than three or four people, who will look at current practices in their fields and suggest methods for translating these to the Digital Library effort. Each group will issue a report with suggested guidelines. Those who have participated in the DLMRP Group would be good candidates for the groups in their respective areas of expertise.
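As an illustration of how the list under MINIMUM METADATA ELEMENTS REQUIRED FOR ALL DIGITAL OBJECTS might be applied when guidelines are drafted, the following is a minimal sketch in Python of a completeness check. It is not part of DescMeta or the uvametadata DTD. The nested-dictionary representation of a record, the lowercase key names, and the names REQUIRED_PATHS and missing_required are assumptions made for this example only.

```python
# Sketch only: a record is assumed to be a nested dict keyed by element name.
# The paths mirror the required list in this report (element, then the
# attribute or sub-element that must be present). All names are hypothetical.
REQUIRED_PATHS = [
    ("identifier",),
    ("title",),
    ("agent", "role"),
    ("mediatype", "form"),
    ("date", "type"),
    ("desc", "type"),
    ("digiprov", "date"),
    ("adminrights", "policy"),
]


def _lookup(record, path):
    """Follow a path of keys through nested dicts; return None if any key is absent."""
    node = record
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def missing_required(record):
    """Return the required paths that are absent or empty, as 'a/b' strings."""
    missing = []
    for path in REQUIRED_PATHS:
        value = _lookup(record, path)
        if value is None or value == "" or value == {}:
            missing.append("/".join(path))
    return missing


if __name__ == "__main__":
    # Invented sample record, deliberately incomplete.
    sample = {
        "identifier": "example-0001",
        "title": "Sample title",
        "agent": {"role": "author", "name": "Sample Author"},
        "date": {"type": "published", "value": "2003"},
    }
    for path in missing_required(sample):
        print("missing:", path)
```

A check of this kind could be run by each content-area working group against sample records to see which required elements are hardest to supply for a given media type.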