This document is also available as a PDF
(220K) for easy printing and reference.
Definitions for this Document
- Access Quality Collection Master: An uncompressed and uncorrected raw digitized file, digitized to a level that supports the minimum standards for generating quality deliverable files, but not to a higher preservation standard.
- Bit-depth: The number of colors or levels of grayscale captured in the digitization.
- Compression: Reduction of a file's size to save storage space. Much compression is "lossy," which results in permanent loss of some of the data captured at the point of digitization. Master files for all formats but video are not compressed.
- Delivery Master: A color-corrected, cropped, edited, or otherwise altered version of the original digital master that will be used to generate all deliverable files.
- Preservation Quality Collection Master: An uncompressed and uncorrected raw digitized file, digitized to a very high quality to support preservation.
- Preview/Thumbnail: Highly reduced in quality and size or duration; functions as an identifier; has little or no output or editing value.
- Resolution: The density of pixels captured in the digitization of an image, or the sample rate at which audio and video are digitized.
- Service/Deliverable: Output quality is reduced to support efficient delivery to users. Multiple files may be created at this level for multiple purposes or user communities. When Service and Deliverable qualities are outlined separately, the distinction resides in the assumption that Service versions are editable by the user, but Deliverable versions are not.
Repository XML Encoding Guidelines
Mappings Between XML Standards
Metadata Decisions and Best Practices
Images Bitmap/Raster
Capture: Access Quality
| Type of Original | Bit depth | Resolution | Compression | File Format |
| Books (text pages) | Bitonal (1-bit black and white) | 400 ppi | CCITT Group 4 Fax Compression | TIFF |
| Books (illustrations or figures) | 8-bit (grayscale) or 24-bit (color) | 400 ppi | Uncompressed | TIFF |
| Slides (35mm) | 24-bit (color) | 300 ppi @ 900% (2700 ppi) | Uncompressed | TIFF |
| Oversized items (large books, maps, etc.) | 24-bit (color) | 400 ppi | Uncompressed | TIFF |
| Project Specific (dictated by desired use) | 1-bit, 8-bit, or 24-bit, as appropriate | Resolution should ensure a minimum capture size of 3000 pixels on the long side | Uncompressed | TIFF |
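The resolution figures in the Access Quality table involve two small calculations: the effective sampling rate for slides (300 ppi at 900% enlargement gives 2700 ppi) and the resolution needed to reach 3000 pixels on the long side. The following is a minimal sketch of that arithmetic; the function names are illustrative and rounding up to a whole ppi is an assumption, not part of the standard.

```python
import math


def effective_ppi(output_ppi: float, enlargement_percent: float) -> float:
    """Sampling rate at the original implied by an output ppi and enlargement."""
    return output_ppi * enlargement_percent / 100.0


def min_ppi_for_long_side(long_side_inches: float, min_pixels: int = 3000) -> int:
    """Smallest whole-number ppi giving at least `min_pixels` on the long side.

    Rounding up is an assumption made here so the minimum is never missed.
    """
    if long_side_inches <= 0:
        raise ValueError("long_side_inches must be positive")
    return math.ceil(min_pixels / long_side_inches)


print(effective_ppi(300, 900))      # 2700.0, the access-quality slide figure
print(min_ppi_for_long_side(10))    # 300
print(min_ppi_for_long_side(7))     # 429
```

For a project-specific original, the table's wording sets only a floor, so a higher resolution may still be chosen for the intended use.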
Capture: Preservation Quality
| Type of Original | Bit depth | Resolution | Compression | File Format |
| Books (text pages) | 24-bit (color) | 600 ppi | Uncompressed | TIFF |
| Books (illustrations or figures) | 8-bit (grayscale) or 24-bit (color) | 600 ppi | Uncompressed | TIFF |
| Slides (35mm) | 24-bit (color) | 600 ppi @ 900% (5400 ppi) | Uncompressed | TIFF |
| Oversized items (large books, maps, etc.) | 24-bit (color) | 600 ppi unless hardware constraints limit capture to 400 ppi | Uncompressed | TIFF |
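Because preservation masters are uncompressed, storage needs follow directly from page size, resolution, and bit depth. The sketch below estimates raw image data only; it ignores the TIFF header, tags, and any row padding, and the 8.5 x 11 inch page is an illustrative assumption rather than a size named in this document.

```python
def uncompressed_bytes(width_in: float, height_in: float, ppi: int, bit_depth: int) -> int:
    """Approximate size of uncompressed pixel data for one scanned original."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    total_bits = width_px * height_px * bit_depth
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical letter-size page, 24-bit color at the 600 ppi preservation level.
print(uncompressed_bytes(8.5, 11, 600, 24))   # 100980000 (about 101 MB)

# The same page as a 1-bit bitonal image at 400 ppi, before any compression.
print(uncompressed_bytes(8.5, 11, 400, 1))    # 1870000 (about 1.9 MB)
```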
Deliverables
| Purpose | Bit depth | Resolution | Compression | File Format |
| Thumbnail | 1-bit (black and white), 8-bit (grayscale), or 24-bit (color) | 120 pixels on the longest side | JPEG is automatically compressed; select High or level 10 compression | JPEG for color or grayscale images; GIF for bitonal images |
| Screen-sized | 1-bit (black and white), 8-bit (grayscale), or 24-bit (color) | 1024 x 768 pixels; or 650-850 pixel width with proportional height (as appropriate) for page images | As above | JPEG for color or grayscale images; 1-bit GIF for bitonal images |
| Maximum | | As appropriate per project | 0-800 pixels: 2 levels; 800-1600 pixels: 3 levels; 1600-3200 pixels: 4 levels; 3200-7000 pixels: 5 levels; 7000-10,000 pixels: 6 levels; 10,000-15,000 pixels: 7 levels; 15,000-20,000 pixels: 8 levels; 20,000-25,000 pixels: 9 levels; above 25,000 pixels: 10 levels | MrSID; JPEG2000 is planned |
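The deliverable rules above can be expressed as two small lookups: the number of levels for a Maximum deliverable and the scaled dimensions of a thumbnail. This is a sketch under stated assumptions: the table's pixel ranges share their endpoints, so the code treats each upper bound as belonging to the lower band, and it assumes the pixel count refers to the long side. Neither choice is stated in the table.

```python
from bisect import bisect_left

# Upper bounds of the bands for 2 through 9 levels; anything larger gets 10.
_UPPER_BOUNDS = [800, 1600, 3200, 7000, 10_000, 15_000, 20_000, 25_000]


def compression_levels(long_side_px: int) -> int:
    """Levels for a Maximum deliverable, with upper bounds inclusive (assumed)."""
    if long_side_px <= 0:
        raise ValueError("long_side_px must be positive")
    return 2 + bisect_left(_UPPER_BOUNDS, long_side_px)


def fit_longest_side(width: int, height: int, target: int = 120) -> tuple[int, int]:
    """Scale dimensions proportionally so the longest side equals `target`."""
    longest = max(width, height)
    return (max(1, round(width * target / longest)),
            max(1, round(height * target / longest)))


print(compression_levels(3000))        # 4
print(compression_levels(25_001))      # 10
print(fit_longest_side(3000, 2000))    # (120, 80)
```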
Illustrations/Graphs/Charts - Vector
Creation
| Purpose | Format | Compression | Bit depth | Resolution | Comments |
| Master copy | EPS, SVG, proprietary formats, e.g. Adobe Illustrator | NA | NA | NA | Include color reference whenever appropriate and feasible |
Deliverables
| Purpose | Format | Compression | Bit depth | Resolution | Comments |
| Deliverable | EPS, SVG, SWF, JPEG | NA | 24-bit color, 8-bit grayscale | Appropriate for display of necessary information; 300 ppi if readable printing must be supported | Vector images may be retained in their original format or converted to bitmap/raster formats for delivery; use the chart above as a reference. |
| Thumbnail | JPEG | JPEG is automatically compressed; select High or level 10 compression | 24-bit color; 8-bit grayscale | 120 pixels on the longest side; 72 ppi | |
Electronic Texts
Capture
| Purpose | Description | Format | Standard |
| Structured Text Transcription | A literal transcription of the text, encoded in XML. | XML | TEI P4, with local modifications; follow the DTD available at: http://www.lib.virginia.edu/digital/reports/teiPractices/dlpsPractices_postkb.html |
| Unstructured Text Transcription | Plain text that may include minimal structural or formatting information. | HTML, ASCII text, e.g. OCR output | |
| Archival Finding Aids | Marked-up collection finding aids. | XML | EAD 2002; follow the guidelines at http://www.lib.virginia.edu/vhp/admin.html and the DTD at http://text.lib.virginia.edu/bin/dtd/eadVIVA/eadVIVA.dtd |
| Page Images | If the TEI or EAD will include references to page images, select the capture specifications from the Image Table above. | As appropriate from above options | |
| Image Metadata | Descriptive and technical metadata for images. Images must meet technical standards described in the Image Table above. | XML | GDMS; follow the guidelines and DTD at http://www.lib.virginia.edu/digital/metadata/gdms.html |
Deliverables
| Purpose | Description | Format | Standard |
| Structured Text Transcription | Marked up to reflect the content and the structure of the original document. | XML | TEI, EAD, or GDMS, as documented above. |
| Unstructured Text Transcription | Plain text that may include minimal structural or formatting information. | HTML, ASCII text, e.g. OCR output | Not delivered through the UVa Library; requires conversion to TEI, EAD, or GDMS. |
| Page Image Deliverable(s) | If the electronic text is a transcription with dependent page image deliverables, select the deliverable specifications from the Image Table above. | As appropriate from above options | |
Audio
Creation
| Purpose | Format | Resolution & Sample rate | Description |
| Master | Broadcast WAV | 44.1 kHz, 16 bits per sample | Maintain channel pattern of original, e.g. stereo, mono, and multi-channel. |
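The master audio specification (44.1 kHz, 16 bits per sample, uncompressed) determines storage needs per minute of recording. The sketch below computes the size of the raw PCM samples only; it assumes nothing about the size of the Broadcast WAV header or metadata chunks, which add a small amount on top.

```python
def pcm_bytes(duration_seconds: int, channels: int,
              sample_rate_hz: int = 44_100, bits_per_sample: int = 16) -> int:
    """Size of uncompressed PCM audio data, excluding file headers."""
    bytes_per_second = sample_rate_hz * channels * bits_per_sample // 8
    return duration_seconds * bytes_per_second


print(pcm_bytes(60, channels=2))   # 10584000 bytes per minute of stereo
print(pcm_bytes(60, channels=1))   # 5292000 bytes per minute of mono
```

At these rates, an hour of stereo master audio needs roughly 635 MB before any file overhead.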
Deliverables
| Purpose | Format | Resolution & Sample rate | Description |
| Service | MPEG 1/2 Layer 3 (.mp3); MPEG 4/AAC | Appropriate to type and quality of original | Maintain channel pattern where practical. |
| Deliverable | MPEG 1/2 Layer 3 (.mp3); MPEG 4/AAC | Appropriate to delivery needs and conditions | |
| Preview | MPEG 1/2 Layer 3 (.mp3) | | Reduce duration to create a representative sample: a "clip." |
Video
Creation
| Purpose | Format | Compression | Description |
| Master | NTSC DV, DV-Cam tape, Beta-SP | DV | Media should be stored in an environmentally stable location. |
Deliverables
| Purpose | Format | Compression | Description |
| Service | Select as appropriate for use | Appropriate to format and use | Service (i.e. editable) versions are produced as required by "dubbing," which implies a change of storage medium and/or format. Very large file sizes; not network distributable. |
| Deliverable | MPEG1, MPEG2, MPEG4 | Appropriate to format and use | Only highly compressed forms; network distributable. |
| Preview | MPEG4 | Appropriate to format and use | Reduce duration to create a representative sample: a "clip." |
| Thumbnail | 120 pixels on the longest side, JPEG | JPEG is automatically compressed; select High or level 10 compression | Representative frame: indication of content. |
Statistical/Numeric Data
| Purpose | Format | Comments |
| Master copy | ASCII columnar format; SPSS, STATA, SAS program code and/or machine-readable text-based documentation to define data for analysis | ASCII delimited preferred. DDI standard metadata is the preferred documentation format. Following the ICPSR standard for data archiving and preservation. |
| Service | Data stored in some statistical package format (SAS, SPSS, STATA) or in a queryable SQL database system | Storage for access, retrieval, or extraction. |
| Deliverable | SAS, STATA, SPSS, Excel or delimited ASCII format with data map or variable list. | Excel not advised for very large files. All users get documentation built from DDI records. |
| Preview | Screen dump of 5% of records, no more than 100 | Practice not currently in place. |
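The preview rule for statistical data (5% of records, no more than 100) reduces to a capped percentage. The sketch below is one possible reading: rounding fractional records up is an assumption, as the rule does not say how to round, and the function name is illustrative.

```python
def preview_record_count(total_records: int, percent: int = 5, cap: int = 100) -> int:
    """Number of records to show in a preview: `percent` of the total, capped."""
    if total_records < 0:
        raise ValueError("total_records must not be negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = -(-total_records * percent // 100)
    return min(cap, wanted)


print(preview_record_count(1000))   # 50
print(preview_record_count(5000))   # 100 (the cap applies)
print(preview_record_count(10))     # 1
```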
Spatial Data - Vector
| Purpose | Format | Comments |
| Master copy | ASCII-based exchange format such as SDTS, Arc Exchange (.e00), ArcGenerate (.gen), or delimited text. | Note that two of these are tied to proprietary software formats and are not available for all data models. SDTS is available but rarely used in data distribution. |
| Service | Industry standard formats such as ESRI shape (.shp) or ArcInfo Coverage model, or CAD format such as Microstation (.dgn) or AutoCAD (.dwg). Possible storage in an SQL-based system through proprietary middleware (ArcSDE, Oracle Spatial). | Note that ESRI's shapefile model consists of several related files. The ArcInfo Coverage model is directory-based. RDBMS models are still relatively new. |
| Deliverable | Industry standard formats such as ESRI shape, Arc Exchange, or CAD formats. | |
| Preview | GIF, JPG or other raster image format. | Preview graphics need to be large enough to convey the general look of the data. |
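Because a shapefile consists of several related files, a deposit can be checked for completeness before it is accepted. The sketch below tests for the three components that the shapefile format requires (.shp, .shx, .dbf); the function name is illustrative, and the check is case-sensitive on file systems that distinguish case.

```python
from pathlib import Path

# Mandatory components of a shapefile; optional sidecars such as .prj are not checked.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path: str) -> list[str]:
    """Return the required extensions that have no file beside `shp_path`."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]
```

An empty list means all three required files are present, for example `missing_shapefile_parts("roads.shp")` for a hypothetical `roads` layer.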
Spatial Data - Raster
| Purpose | Format | Comments |
| Master copy | Photography or remote sensing imagery: uncompressed TIFF + world file or GeoTIFF (preferred), BIL, IMG (Erdas Imagine) | Also applicable for geo-referenced maps. GeoTIFF retains geographic information in the TIFF header; a world file does the same as a separate file. |
| | Non-image raster data: ASCII-based storage and exchange format (Arc Exchange .e00; ArcGenerate .gen; Spatial Data Transfer Standard (SDTS)) | SDTS is a federal standard, but not widely adopted in commercial industry or government; the format is cumbersome for further processing. |
| Service | Photography or remote sensing imagery: GeoTIFF, BIL, IMG, SID + world file | |
| | Non-image raster data: Arc Exchange; GeoTIFF; native data formats (.cdo); native software data models (ArcGRID) | Users will almost always need to process stored data. TIFFs can store pixel value as color value and be converted in GIS software; native data formats are common in federal data. The GRID data model is directory-based, not file-based, but could be stored for access purposes. |
| Deliverable | Photography or remote sensing imagery: GeoTIFF, BIL, IMG, SID + world file, JPG + world file | |
| | Non-image raster data: Arc Exchange, native formats or models, GeoTIFF | |
| Preview | JPG, GIF, or SID | Sizes may need to be slightly larger than those outlined for other types of images. |
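A world file carries the georeferencing for an image as six numbers, one per line: x pixel size, two rotation terms, y pixel size (normally negative), and the map coordinates of the center of the upper-left pixel. The sketch below shows how those values map a pixel position to map coordinates; the sample values are invented for illustration and do not come from any collection described here.

```python
def parse_world_file(text: str) -> tuple[float, ...]:
    """Read the six world-file parameters in file order: A, D, B, E, C, F."""
    values = tuple(float(token) for token in text.split())
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    return values


def pixel_to_map(col: float, row: float, params: tuple[float, ...]) -> tuple[float, float]:
    """Map coordinates of the center of pixel (col, row)."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical 30-unit pixels, no rotation, north-up image.
sample = "30.0\n0.0\n0.0\n-30.0\n500015.0\n4200985.0\n"
params = parse_world_file(sample)
print(pixel_to_map(0, 0, params))     # (500015.0, 4200985.0)
print(pixel_to_map(10, 20, params))   # (500315.0, 4200385.0)
```

A GeoTIFF stores equivalent information inside the file's own tags, which is why no separate world file is needed for that format.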