University of Virginia
University of Virginia Library
Digital Initiatives: Research and Development

Fedora Object Architecture

General FEDORA Object Architecture

A digital object consists of several components, including a Uniform Resource Name (URN), one or more Disseminators, and one or more Datastreams. The repository software manages these components, which are listed below.

Uniform Resource Name (URN) - A URN is the unique, persistent identifier for the object, maintained by the repository software. The UVA and Cornell implementations of FEDORA use the Handle System developed at CNRI as the protocol for URNs. Each object must be identified by a URN (or by some other standards-based persistent identifier scheme).

Object Map - An object map is a component, maintained by the repository software, that lists all of the other components of the object and gives each one a name that is unique within the object. The Object Map thus provides a mechanism for identifying and listing each component in the object. Each object must contain exactly one Object Map.
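
As an illustration only, the Object Map can be thought of as a registry that refuses duplicate names. The sketch below is not the FEDORA implementation; the class name `ObjectMap` and its methods are hypothetical, chosen to show the two properties described above (unique naming and listing).

```python
class ObjectMap:
    """Hypothetical sketch: names every component of one object uniquely."""

    def __init__(self):
        self._components = {}  # component name -> component

    def register(self, name, component):
        # Each name may be used only once within the object.
        if name in self._components:
            raise ValueError(f"component name {name!r} is already in use")
        self._components[name] = component

    def names(self):
        # List the components in the order they were registered.
        return list(self._components)

    def get(self, name):
        # Identify a single component by its name.
        return self._components[name]
```

A caller would register each datastream and disseminator once, then use `names()` to list them or `get()` to look one up.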

Datastream - A datastream consists of a typed stream of bytes that adds content to the object (e.g., a digital image, an etext, a program, metadata, a database, a mapping or relational structure, etc.). A datastream must contain the following:

  • Name - identifier for the datastream
  • Description - textual description of the datastream
  • Content Type - MIME type of the datastream
  • Datastream Type - There are two possible types:
    1. Internal - datastream is under the direct control of the repository system and is stored internally within the repository.
    2. Remote Referenced - datastream addresses a file that is outside the direct control of the repository through one of the supported communication protocols.
  • Contents - for an internal datastream, this is the byte stream itself as it is stored within the repository system. For a remote referenced datastream, this is a pointer to a file that contains the byte stream on a remote host.
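
The required fields of a datastream can be sketched as a small record type. This is an illustrative model only, not FEDORA code: the names `Datastream` and `DatastreamType` are hypothetical, and representing a remote pointer as a string (for example a URL) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class DatastreamType(Enum):
    INTERNAL = "internal"                    # bytes stored inside the repository
    REMOTE_REFERENCED = "remote-referenced"  # pointer to a file on a remote host


@dataclass(frozen=True)
class Datastream:
    name: str                     # identifier for the datastream
    description: str              # textual description
    content_type: str             # MIME type, e.g. "image/jpeg"
    datastream_type: DatastreamType
    contents: Union[bytes, str]   # byte stream if internal, pointer if remote

    def __post_init__(self):
        # Contents must match the declared datastream type.
        if self.datastream_type is DatastreamType.INTERNAL:
            if not isinstance(self.contents, bytes):
                raise TypeError("internal datastream contents must be bytes")
        elif not isinstance(self.contents, str):
            raise TypeError("remote referenced contents must be a pointer string")
```

The check in `__post_init__` reflects the rule that the meaning of Contents depends on the Datastream Type.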

Disseminator - A disseminator enables access to the datastreams contained in an object and consists of three subcomponents: a signature, a servlet, and an attachment map. The attachment map lists the datastreams that are "attached" to the disseminator, i.e., those to which it has access. The signature contains the behavior descriptions that define the behaviors the disseminator can provide, and the servlet contains the implementation that executes each of those behaviors. The disseminator also maps each signature to which the object subscribes to its corresponding servlet. An object may contain one or more disseminators.
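
The relationship between signature, servlet, and attachment map can be sketched as follows. This is a simplified illustration under stated assumptions, not the FEDORA interface: a signature is modeled as a list of behavior names, a servlet as a dictionary of Python functions, and the attachment map as a dictionary of datastream name to bytes. All names (`Disseminator`, `disseminate`, `get_thumbnail`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A behavior implementation receives the attached datastreams (name -> bytes).
Behavior = Callable[[Dict[str, bytes]], bytes]


@dataclass
class Disseminator:
    signature: List[str]              # behavior names the disseminator describes
    servlet: Dict[str, Behavior]      # implementation of each behavior
    attachment_map: Dict[str, bytes] = field(default_factory=dict)

    def disseminate(self, behavior: str) -> bytes:
        # Only behaviors described by the signature may be requested.
        if behavior not in self.signature:
            raise KeyError(f"behavior {behavior!r} is not in the signature")
        # The servlet sees only the datastreams attached to this disseminator.
        return self.servlet[behavior](self.attachment_map)


image_disseminator = Disseminator(
    signature=["get_thumbnail"],
    servlet={"get_thumbnail": lambda streams: streams["thumb"]},
    attachment_map={"thumb": b"\xff\xd8"},
)
```

Calling `image_disseminator.disseminate("get_thumbnail")` runs the servlet function against the attached datastreams; a behavior outside the signature is rejected.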

 For a more complete description of the FEDORA object model, please refer to the FEDORA site.

 UVA FEDORA Object Architecture

In the General Object Model, an object comprises three components: an identifier, disseminators, and datastreams. The UVA Object Model differs slightly by defining a special naming scheme for groups of specialized datastreams, as illustrated in figure 1. In the UVA Object Model, an object consists of four components: an identifier, disseminators, metadata, and the basis. The identifier and disseminators are defined in the same manner as in the General Object Model. The Metadata and Basis components are ordinary datastreams as defined in the General Object Model, but we have chosen to assign a name to the group of datastreams that contain the metadata for an object (the Metadata component) and a name to the group of datastreams that contain the actual content of the object (the Basis component).

Metadata - The metadata component consists of datastreams that contain ASCII text marked up with XML tags. Multiple types of metadata datastreams are allowed, but each type must conform to a single XML schema/DTD. In the UVA implementation, at least three types of metadata datastream are defined for each object:

  1. Descriptive Metadata
    • Describes data about the intellectual content of each component of the basis of an object
    • Required in every object
    • Referred to in the Object Map by a specific name (e.g., "desc")
    • To be indexed for user discovery, i.e., catalog
  2. Administrative Metadata
    • Describes data about the history, access rights, versioning, etc. of each component of the basis of an object
    • Required in every object
    • Referred to in the Object Map by a specific name (e.g., "admin")
    • To be indexed for repository management tasks
  3. Technical Metadata
    • Describes data about the format and technical characteristics of each component of the basis of an object
    • Optional
    • Referred to in the Object Map by a specific name (e.g., "tech")
    • To be indexed for both user and repository management tasks
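
The required and optional metadata types listed above suggest a simple completeness check against the names in an object's Object Map. The sketch below is illustrative only; it assumes the example names "desc", "admin", and "tech" are the actual names in use, and the function `missing_metadata` is hypothetical.

```python
REQUIRED_METADATA = {"desc", "admin"}  # descriptive and administrative metadata
OPTIONAL_METADATA = {"tech"}           # technical metadata


def missing_metadata(object_map_names):
    """Return the required metadata names absent from an Object Map listing."""
    return sorted(REQUIRED_METADATA - set(object_map_names))
```

An object whose Object Map lists "desc", "tech", and "thumb" would be reported as missing "admin", while the absence of "tech" is never an error.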

The datastreams that make up the metadata component are no different from the datastreams that make up the basis, so it should be possible to create disseminators that disseminate the content of the metadata datastreams in the same way as for datastreams in the basis. The metadata datastreams could exist as datastreams in the basis of the object, but we have chosen to keep them separate because of the special nature of metadata.

Basis - The basis is the resource represented by the object. The basis consists of one or more datastreams that define the resource. For example, the basis for an electronic text would consist of a single datastream that is the file (or a pointer to a file) containing the marked-up text. If the resource is a digital image, then there might be multiple datastreams, each containing a different resolution of the same image (e.g., thumbnail, medium resolution, high resolution, etc.). The basis in this case refers to the collection of datastreams that make up the different resolutions of the same image. Each object must contain a single basis, although a basis may consist of more than one datastream.

The structure of the basis forms a classification scheme for objects. There are currently three kinds of objects:

  • Simple object - The basis contains a single datastream containing no references to other digital objects.
  • Complex object - The basis contains a single datastream that contains references to other digital objects, e.g., an etext document with embedded links to page images of the actual text.
  • Compound object - The basis contains multiple byte streams, e.g., an image object consisting of multiple resolutions (thumbnail, screen size, archive quality) all derived from a single scanned image. If one or more of these byte streams contains references to other digital objects, the object is also considered a complex object.
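
The classification rules above can be expressed as a short function. This is a sketch under assumptions, not part of the UVA implementation: whether a datastream references other digital objects is modeled as a boolean flag, and the names `BasisDatastream` and `classify` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class BasisDatastream:
    name: str
    references_other_objects: bool = False  # assumed to be known in advance


def classify(basis: List[BasisDatastream]) -> Set[str]:
    """Return the kinds ("simple", "complex", "compound") that apply to a basis."""
    if not basis:
        raise ValueError("a basis must contain at least one datastream")
    kinds = set()
    if len(basis) > 1:
        kinds.add("compound")   # multiple byte streams
    if any(ds.references_other_objects for ds in basis):
        kinds.add("complex")    # at least one stream references other objects
    if not kinds:
        kinds.add("simple")     # single stream, no references
    return kinds
```

Returning a set reflects the rule that a compound object whose byte streams reference other digital objects is also a complex object.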

Digital Initiatives
University of Virginia
PO Box 400112
Charlottesville, VA 22904-4112

Maintained by: dl@virginia.edu
Last Modified: Monday, June 02, 2008
© The Rector and Visitors of the University of Virginia