Repository Image Object Model Committee Report
February 2003
Committee members: Rob Cordaro, Bradley Daigle, Ronda Grizzle, Jack Kelly, Michael Tuite, Ross Wayland

Background

In April 2003, the Library will be ready to begin populating a Fedora Repository with content, but before that can happen some basic decisions need to be made concerning the type of data object models that will be used to construct the Fedora digital objects. The two initial collections slated for inclusion in the repository are the EAD Finding Aid collection and a subset of texts from the Lewis & Clark collection. Both collections involve texts and images. This group was tasked with constructing a set of basic object models for images that could be used for images associated with the EAD and Lewis & Clark collections and that could also serve as a foundation for image objects across the Library. The committee acknowledges that different areas of the Library may have very different requirements for the presentation and delivery of image content, but the focus of this committee was to find a low-level common denominator that would work across most of the Library's image collections. Specialized delivery and functionality can be achieved in the future by adding disseminators for those areas that require more than the basic functionality proposed in this base set of object models. New object models can also be added if it is determined that none of the existing models adequately fits the needs and functionality of the object.

Object Model Definition

The term object model or content model is used to describe the structure of a group of related objects in a Fedora repository.
A Fedora object has four basic components: 1) a persistent identifier, or PID, that uniquely identifies the object in a given Fedora repository; 2) a set of disseminators that define the behaviors the object can perform; 3) a set of descriptive and administrative metadata about the object and its content; and 4) one or more datastreams that define the content of the object. Figure 1 depicts the general Fedora object model. Objects are said to subscribe to the same object model if they share the same basic object structure, that is, the same number and type of datastreams (content streams) and the same set of disseminators or behaviors.

Figure 1. Fedora Object Model.

Role of Object Models

Fedora does not require that objects share the same object model. In fact, one could create a different object model for every object in the repository, but doing so would provide little benefit over many of the web sites the Library is currently managing. By carefully designing object models and Behavior Definitions, one can leverage common functionality and delivery tools across large collections of similar objects. If the Library is to successfully manage its rapidly expanding volume of digital content, we have to consider approaches that simplify both the management and maintenance aspects of digital object creation, storage, and delivery. Defining an object model also does not mean that all objects of the same media type have to fit a single model. The goal is to make each object model as generic and flexible as possible. There will undoubtedly be exceptions, but each exception should be examined carefully to see whether it really warrants a new object model. One of the key features of the Fedora architecture is the ability to enable different objects to share the same Behavior Definition or set of behaviors.
Carefully designing the Behavior Definition for a large class of objects means that a single Behavior Definition can be shared across multiple object models, which benefits both the management and the delivery of digital objects.

Image Object Models

There are many types of images, including 1-bit black & white page image scans, 24-bit color art images, satellite image maps, geographic maps, architectural drawings, musical score images, and scientific data visualization images. Although the focus is currently on page images and illustrations that appear in the EAD and Lewis & Clark collections, we also considered other collections in areas including art and architecture and science and engineering. After reviewing the many different types of images, the group arrived at a set of three basic image object models we think will work for the majority of Library images: the General Image Object Model, the 1-Bit Image Object Model, and the Icon Image Object Model.
The General Image Object Model is the primary object model and should work for a large number of images. Its design includes separate datastreams for preview- and screen-sized images to provide quick performance for these two common sizes. The third datastream contains a MrSID version of the image that can be used to dynamically deliver the image at any size through software. The fourth datastream represents the Delivery Master, which is the highest-resolution version of the image available and the one used to derive the other datastreams in the object. At present, it is unclear whether there is sufficient infrastructure or budget to accommodate on-line or near-line storage of these very large Delivery Master images. This capability is desirable in the future, but for now the best we may be able to offer for this datastream is a pointer to an XML or text file indicating where the Delivery Master can be found off-line.

The number of datastreams in an object model represents only half of the object structure. The remaining half is defined by the number and types of disseminators it contains. The disseminators define the functionality of the object by describing a set of behaviors or Behavior Definitions. The uvaImage disseminator consists of five behaviors.
The getImageViewer behavior delivers the image but also provides additional image manipulation tools through a Java applet interface. These are basic viewing tools deemed useful to a wide range of audiences. This behavior is not meant to supplant more sophisticated applications like Adobe's Photoshop for image analysis. For detailed image analysis or viewing, the end user would most likely download the image and use existing desktop applications like Photoshop or other image client software such as Luna Insight's image client. The intent is to provide a simple, low-end viewing tool that can be made available in a web browser environment for casual viewing, presentations, and teaching. This viewing tool should include the following features:
Each digital object will also contain descriptive and administrative metadata about the object as a whole and about each of its content streams (datastreams). The uvaMetadata disseminator will be available on every object and will provide the capability to retrieve descriptive and administrative metadata about the object and its content.

Figure 2. General Image Object Model

1-Bit Image Object Model

The 1-Bit Image Object Model is a special-case object model for 1-bit black & white Group 4 fax-compressed TIFF images. This image format is common in collections like the Brittle Book collection and other collections where high-resolution 24-bit color scans are deemed unnecessary. The format results in a very small file size, so conversion to a MrSID image format is neither practical nor desirable. Very efficient software also exists to translate from this format into GIF and JPEG formats, so there is no performance need for separate preview and screen-size datastreams; those images can be generated dynamically. The TIFF file serves as its own Delivery Master since it represents the highest resolution available for the image. This object model has only a single datastream consisting of the 1-bit TIFF file.

Even though the 1-Bit Image Object Model has a different number of datastreams than the General Image Object Model, it can still share the same uvaImage disseminator. It will use a different mechanism to implement the behaviors, but the behaviors themselves will be the same for this model as for the general model. It is very desirable for similar objects to share a common disseminator that enables the delivery of the same set of behaviors even if the underlying formats are different.
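As a rough illustration of the point above, the following Python sketch shows two objects that subscribe to different object models yet answer the same behavior through a shared disseminator. It is not Fedora code and makes no claim about how the repository implements disseminators: the DigitalObject class, the datastream labels (PREVIEW, TIFF, and so on), the PIDs, the file names, and the get_preview behavior name are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class DigitalObject:
    """Toy stand-in for a Fedora object (not the real Fedora API)."""
    pid: str                       # persistent identifier (hypothetical values below)
    model: str                     # name of the object model the object subscribes to
    datastreams: Dict[str, str]    # hypothetical label -> content location
    disseminators: FrozenSet[str]  # names of the behavior sets the object offers


def same_model(a: DigitalObject, b: DigitalObject) -> bool:
    """Approximate 'same object model': same datastream labels, same disseminators."""
    return (set(a.datastreams) == set(b.datastreams)
            and a.disseminators == b.disseminators)


def preview_from_stored(obj: DigitalObject) -> str:
    # General model: a preview-sized image is stored as its own datastream.
    return obj.datastreams["PREVIEW"]


def preview_from_tiff(obj: DigitalObject) -> str:
    # 1-bit model: the preview would be generated on the fly from the TIFF.
    # The string below only marks where that conversion would happen.
    return "derived-preview(" + obj.datastreams["TIFF"] + ")"


# One behavior, one mechanism per object model.
MECHANISMS: Dict[str, Callable[[DigitalObject], str]] = {
    "general-image": preview_from_stored,
    "one-bit-image": preview_from_tiff,
}


def get_preview(obj: DigitalObject) -> str:
    """Hypothetical behavior offered by the shared uvaImage disseminator."""
    if "uvaImage" not in obj.disseminators:
        raise ValueError(obj.pid + " does not offer the uvaImage behaviors")
    return MECHANISMS[obj.model](obj)


SHARED = frozenset({"uvaImage", "uvaMetadata"})

photo = DigitalObject(
    pid="demo:1",
    model="general-image",
    datastreams={
        "PREVIEW": "photo-preview.jpg",
        "SCREEN": "photo-screen.jpg",
        "MRSID": "photo.sid",
        "DELIVERY_MASTER": "photo-master-location.xml",
    },
    disseminators=SHARED,
)
page = DigitalObject(
    pid="demo:2",
    model="one-bit-image",
    datastreams={"TIFF": "page-0001.tif"},
    disseminators=SHARED,
)

# Callers use the same behavior without knowing which model is underneath.
for obj in (photo, page):
    print(obj.pid, get_preview(obj))
```

The caller never inspects the datastreams directly; it asks for a behavior, and the mechanism bound to the object's model decides how to satisfy it. That separation is what lets a new image format be added later without changing the delivery tools built on the shared behaviors.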
Figure 3. 1-Bit Image Object Model

Icon Image Object Model

The Icon Image Object Model is a special-case object model for icon images. The group observed that some types of objects, like web pages and other presentation formats, frequently contain embedded icon images such as buttons, logos, and banners. It is desirable to be able to track and monitor these types of images even though their delivery is very primitive. This type of image did not fit either of the other two models, so a third model was created just for this special class of image. The Icon Image Object Model has only a single datastream consisting of the icon image. The uvaIconImage disseminator is also quite simple and has only one behavior: to retrieve the icon image.
The uvaMetadata disseminator will be available on every object and will provide the capability to retrieve descriptive and administrative metadata about the object and its content.

Figure 4. Icon Image Model

Image Sizes

The group also discussed establishing standards for certain sizes of images to enhance uniformity when displaying images from multiple collections together and to streamline the workflow during image digitization. There is considerable debate as to what the optimum size should be for a preview-sized (thumbnail) image, but the group felt it is highly desirable to establish a single standard for image sizes across the repository. The following sizes are proposed as guidelines for creating the common image sizes used in the image object models:
Since the preview image size is frequently used in a list context where many preview images are arranged in rows or columns, it is desirable for all previews to be of uniform size. For screen-size images, there is a less compelling need for uniformity except perhaps in a page-turner context. The committee therefore specified a range of sizes for screen-size images to meet the needs of the particular image type.
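One way a digitization workflow might keep previews uniform is to scale every image to fit inside a single bounding box while preserving its aspect ratio. The Python sketch below shows only that arithmetic. The PREVIEW_BOX value is a placeholder and is not one of the sizes proposed by the committee, and the fit_within function name is likewise hypothetical.

```python
from typing import Tuple

# Placeholder bounding box in pixels; NOT the committee's proposed preview size.
PREVIEW_BOX = (128, 128)


def fit_within(width: int, height: int, box: Tuple[int, int]) -> Tuple[int, int]:
    """Return the largest size that fits inside box and keeps the aspect ratio.

    Images already smaller than the box are returned unchanged rather than
    enlarged, which is an assumption of this sketch.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    max_width, max_height = box
    scale = min(max_width / width, max_height / height, 1.0)
    # Never let a very thin image collapse to zero pixels on one side.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(3000, 2000, PREVIEW_BOX))  # landscape source
print(fit_within(2000, 3000, PREVIEW_BOX))  # portrait source
```

With a square box, landscape and portrait sources end up with the same longest side, so previews line up cleanly in rows or columns. For screen-size images, the same function could be called with whichever box from the permitted range suits the image type.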