TEI Markup Practices

Digital Library Production Services (DLPS), University of Virginia Library

Overview

This document describes the local markup practices for TEI-encoded electronic texts followed by Digital Library Production Services (DLPS), University of Virginia Library. It is intended to be helpful in two main ways:

NOTE

When referring to elements <div1> through <div7> as a group, the pseudo-tag <divN> is used — for example, “The type attribute is required on <divN> elements.”


General Requirements

Including all content General Requirements
Description With very few exceptions, all content from the print source must be included in the electronic text. All textual data must be included in the transcription, and all non-textual (graphical) data must be recorded in the markup as <figure> elements.
Enforcement
Machine-enforceable: no
Exceptions The only exceptions to this rule are:
  • Running page headers — Exclude the running headers that often appear at the top of each page in a printed book. (These headers are typically very repetitive and only contain content already available elsewhere in the electronic transcription, such as the title of the book or the title of the current chapter.)
    NOTE

    In rare cases, running page headers will contain unique content (such as a summary of the content of the current page). In such cases DLPS will require that the running headers be included in the electronic text. See Running page headers.

  • Handwriting — When transcribing printed materials, handwriting (such as readers’ notes or markings) should be excluded from the transcription.
  • Gaps — Gaps in the transcription are necessary in some cases, typically either because a passage is missing from the print source (due to a missing or torn page, for example), or because the print source contains non-Western characters. Any and all omissions in the electronic transcription must be indicated by the <gap/> element. See Use of <gap/>.
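As a minimal sketch of the last exception, an omission caused by a torn page might be marked as shown here. The bracketed text is placeholder content, and the reason attribute (a standard TEI attribute on <gap>) and its value are assumptions for illustration; the attributes DLPS actually requires are covered under Use of <gap/>.

```xml
<!-- Illustrative only: the reason value is an assumed example, not a DLPS-mandated value -->
<p> . . . [legible text before the tear] <gap reason="torn page"/> . . . [legible text resumes]</p>
```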
XML declaration General Requirements
Description All XML documents produced for DLPS should begin with an XML declaration, which should include version with value "1.0". The encoding declaration should either be omitted or set to "UTF-8".
Remarks For these files, omitting the encoding declaration is equivalent to specifying "UTF-8": an XML processor assumes UTF-8 when a document has neither an encoding declaration nor a byte order mark.
Example
<?xml version="1.0"?>
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_xml
Message type: error
Comments: Compliance with this practice is automated: The processing script replace_xml_decl programmatically replaces the vendor’s XML declaration with our own.
Reason While not strictly required by XML, including the XML declaration is good practice.
ASCII encoding General Requirements
Description The entire XML file (transcription and markup) must contain neither illegal control characters nor super-ASCII characters. That is, every character in the file must have an ASCII/Unicode code point of 9 (tab), 10 (newline/linefeed), 13 (carriage return), or 32-126 (space through tilde).
Remarks All super-ASCII characters must be represented as character entities.
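For illustration, text containing accented letters and a dash could be keyed as in the following sketch. The sentences are invented, the named entities assume that the standard ISO entity sets are declared for the document, and the hex numeric reference is shown for a character that is assumed to have no named entity available.

```xml
<!-- Hypothetical sentences; every character in this fragment is within ASCII 32-126 -->
<p>The caf&eacute; was quiet&mdash;almost na&iuml;ve in its charm.</p>
<!-- Hex numeric fallback: U+1E25 (h with dot below), assumed to have no declared named entity -->
<p>The manuscript is attributed to Mu&#x1E25;ammad.</p>
```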
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_xml
Message type: error
Comments: Compliance with this practice is automated: The processing script char2ent programmatically replaces super-ASCII characters with character entities (named/mnemonic entities when available, hex numeric entities otherwise).
Reason Portability
See also Special Characters

Major Structure

Essential Structure

Composite texts Major Structure: Essential Structure
Description In rare cases, DLPS will request that a particular text be marked as a composite text, in which the usual <body> element is replaced with the <group> element, which then contains multiple <text> elements, each with its own <front>, <body>, and <back>.
Remarks DLPS will only request a composite text in the case of anthologies or collected works, where each work has its own front and/or back matter.
Example
<TEI.2>
    <teiHeader>
        . . . [metadata section supplied by DLPS to the keyboarding vendor]
    </teiHeader>
    <text>
        <front> . . . [front matter for the collection] </front>
        <group>
            <text>
                <front> . . . [front matter of first text] </front>
                <body> . . . [main body of first text] </body>
                <back> . . . [back matter of first text] </back>
            </text>
            <text>
                <front> . . . [front matter of second text] </front>
                <body> . . . [main body of second text] </body>
                <back> . . . [back matter of second text] </back>
            </text>
        </group>
        <back> . . . [back matter for the collection] </back>
    </text>
</TEI.2>
Enforcement
Machine-enforceable: semi
Method: program
Name: qa_lib_structure
Message type: warning
Comments: qa_lib_structure issues a warning if the <group> element is used, so that the QA tech can check the markup manually. No DTD or program can ensure proper use of <group>.

Structural Divisions

Top-level divisions Major Structure: Structural Divisions
Description Top-level structural divisions are always <div1>; <div> and <div0> are not used.
Enforcement
Machine-enforceable: yes
Method: DTD
All content within a div Major Structure: Structural Divisions
Description The <front>, <body>, and <back> elements must contain only <div1> elements. No content is allowed directly within <front>, <body>, or <back>.
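A minimal skeleton of this rule and the preceding one might look like the following. It is a sketch only: the bracketed placeholders stand for real content, and the particular divisions chosen ("contents", "chapter", "errata") are simply values that appear elsewhere in this document.

```xml
<text>
<front>
<div1 type="contents"> . . . [table of contents] </div1>
</front>
<body>
<div1 type="chapter" n="I"> . . . [first chapter] </div1>
<div1 type="chapter" n="II"> . . . [second chapter] </div1>
</body>
<back>
<div1 type="errata"> . . . [list of errata] </div1>
</back>
</text>
```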
Enforcement
Machine-enforceable: yes
Method: DTD
Typed divs Major Structure: Structural Divisions
Description The type attribute is required on <divN> elements, and it has an enumerated vocabulary of allowed values. If a division has no obvious type, the generic value "section" may be used; if "section" has already been used for a higher-level division, use "subsection".
Remarks Most of the available type values are self-explanatory, and no definition is provided for them here. Those that are not necessarily self-explanatory are:
  • bio — for brief biographic sketches of authors or other contributors
  • castlist — for a list of characters preceding a dramatic work
  • chronology — for biographical or historical timelines
  • colophon — a section at or near the end of a book, containing printing information such as name of printer (as distinct from publisher), typefaces used, etc.
  • contents — for tables of contents and for lists of illustrations, etc.
  • editorial — for opinion pieces in newspapers
  • entry — for journal entries or encyclopedia/dictionary entries
  • errata — for lists of printing errors; also called corrigenda
  • fly-title — like a half-title page, but occurs between the front matter and the body; treat as last page of front matter (not first page of body); see Half-titles, fly-titles, and divisional titles
  • frontispiece — technically, an illustration facing the title page; may also be used for any full-page illustration in the front matter, or for an illustration facing the first page of a major division within the body
  • half-title — a page preceding the title page bearing the title of the work, perhaps with a series title or volume number; see Half-titles, fly-titles, and divisional titles
  • masthead — a block of matter in a newspaper or other periodical indicating title of publication, address, list of editors or other contributors, etc.
  • plates — one or more full-page illustrations, often unnumbered or numbered independently of main pagination
  • speech — for a transcript of an oration, not for a piece of dialog in a dramatic work (for which use <sp>)
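The fallback to "section" and "subsection" described above can be sketched as follows. The nesting levels and bracketed placeholders are illustrative assumptions, not taken from a DLPS text.

```xml
<!-- A division with no obvious type, containing a further untyped division -->
<div1 type="section">
<head> . . . [heading of the division]</head>
<div2 type="subsection">
<head> . . . [heading of the nested division]</head>
<p> . . . </p>
</div2>
</div1>
```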
Enforcement
Machine-enforceable: semi
Method: DTD
Comments: Although the DTD requires type on divs and enumerates the allowable values, obviously the DTD cannot enforce appropriate use of the available type values.
n on divs Major Structure: Structural Divisions
Description If a division is numbered or otherwise labeled in the print source (this should be obvious from the division’s heading), record the number or label in the n attribute (in addition to transcribing it as part of the <head>). If the division does not have a number or other label associated with it, do not include the n attribute.
Remarks The value of n does not have to be strictly numeric; often it will be a roman numeral or letter.
Example
<div1 type="introduction">
<pb/>
<head>INTRODUCTION</head>
<div2 type="chapter" n="I">
<head>I</head>
<div3 type="section" n="1">
<head>1&mdash;IMPORTANCE OF THE PO HU T'UNG.</head>
<p>The <hi rend="italic">Po hu t'ung</hi> pretends to be the official report of the discussions <lb/>
on the Classics which were held under Imperial auspices in 79 A.D., <lb/>
<!-- continues -->
Enforcement
Machine-enforceable: no
Reason Could be useful for delivery. As is our usual practice, n is simply a label for display; n is not being used as a transcriptional space here, since the number/label of the div is transcribed in the division’s <head>.
Half-titles, fly-titles, and divisional titles Major Structure: Structural Divisions
Description Encode half-titles as <div1 type="half-title"> within <front>. Encode fly-titles as <div1 type="fly-title"> within <front>. Encode divisional titles as <head type="divisional"> within the <divN> that they precede.
Remarks A common feature in many books is a heading (on a separate page, or at the top of the first page of the first chapter) containing the title of the work (or the title of a section of the work). There are three main types of such features:
  • A page preceding the title page and bearing the title of the work, perhaps with a series title or volume number, is a half-title page and should be marked as <div1 type="half-title"> within <front>.
  • A page similar to a half-title page but occurring between the front matter and the body is a fly-title and should be marked as <div1 type="fly-title"> as the last division within <front> (not as the first division of the <body>).
  • A page, or just an initial heading preceding other headings, similar to a half-title but occurring within the body of the work, to announce the beginning of a major section, is a divisional title. In contrast to half-title and fly-title pages within the front matter, a divisional title should not be marked with its own <divN>. Instead, the divisional title should be incorporated into the <divN> that it precedes, as a <head> element.
Example
<front>
<!-- ... -->
<div1 type="fly-title">
<pb/>
<head type="main">The Original Journals of Captains Meriwether <lb/>
Lewis and William Clark</head>
<ornament type="line"/>
<head type="sub"><hi rend="italic">THE JOURNALS PROPER</hi></head>
<ab type="empty" rend="none"/>
<pb/>
</div1>
</front>
<body>
<div1 type="chapter" n="I">
<pb n="3"/>
<head type="divisional"><hi rend="italic">The</hi> ORIGINAL JOURNALS OF <lb/>
LEWIS AND CLARK</head>
<head type="main"><hi rend="small-caps">Chapter</hi> I</head>
<ornament type="line"/>
<head type="sub"><hi rend="italic">FROM RIVER DUBOIS TO THE PLATTE</hi></head>
<head type="desc">Clark's Journal and Orders, January 30&ndash;July 22, 1804 <lb/>
Entries and Orders by Lewis, February 20, March 3, May 15, 20, 26, and July 8, 12</head>
<ornament type="line"/>
<div2 type="section">
<head>[PRELIMINARY MEMORANDA]</head>
<div3 type="subsection">
<head rend="left">[Clark]</head>
<p>CAPT<hi rend="super">S</hi>. LEWIS &amp; CLARK wintered at the enterance <lb/>
of a Small river opposit the Mouth of Missouri <lb/>
<!-- page continues -->
Enforcement
Machine-enforceable: no
Informal divisions Major Structure: Structural Divisions
Description It is fairly common (especially in poems, but also in prose works) to see informal divisions, indicated by a string of asterisks or periods, or by a horizontal line. Normally, such informal dividers do not indicate the beginning of a new <divN>. Instead, they should be marked as <ornament> elements.
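As a sketch, a horizontal line separating two passages within the same division could be encoded as shown here. The type value "line" is the one used for horizontal lines in other examples in this document; the values appropriate for strings of asterisks or periods are not given in this section, so they are not illustrated.

```xml
<p> . . . [last paragraph before the divider]</p>
<!-- Informal divider: no new divN is opened -->
<ornament type="line"/>
<p> . . . [first paragraph after the divider]</p>
```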
Enforcement
Machine-enforceable: no
See also Horizontal lines
Quoted material Major Structure: Structural Divisions
Description In cases where a brief section of verse, a letter, or other material is quoted within surrounding prose text, the material should be marked as a block quotation using <q>, not as a distinct <divN>.
Enforcement
Machine-enforceable: no
See also Block quotations
Block quotations with opener/closer

Division Headings

Multiple headings Major Structure: Division Headings
Description If a division (or other feature) has more than one heading, use multiple <head> elements (rather than a single <head> with line breaks), and include the type attribute with one of these values: "main", "sub", "desc" (descriptive), "alt" (alternative), or "divisional" (for divisional titles; see Half-titles, fly-titles, and divisional titles).
Example

In this example, the main heading identifies the division as a chapter and gives its number, the sub-heading indicates the content of the chapter, and the descriptive heading indicates the manuscript materials represented in the chapter.

<div1 type="chapter" n="I">
<pb n="3"/>
<head type="divisional"><hi rend="italic">The</hi> ORIGINAL JOURNALS OF <lb/>
LEWIS AND CLARK</head>
<head type="main"><hi rend="small-caps">Chapter</hi> I</head>
<ornament type="line"/>
<head type="sub"><hi rend="italic">FROM RIVER DUBOIS TO THE PLATTE</hi></head>
<head type="desc">Clark's Journal and Orders, January 30&ndash;July 22, 1804 <lb/>
Entries and Orders by Lewis, February 20, March 3, May 15, 20, 26, and July 8, 12</head>
<ornament type="line"/>
<div2 type="section">
<head>[PRELIMINARY MEMORANDA]</head>
<div3 type="subsection">
<head rend="left">[Clark]</head>
<p>CAPT<hi rend="super">S</hi>. LEWIS &amp; CLARK wintered at the enterance <lb/>
of a Small river opposit the Mouth of Missouri <lb/>
<!-- page continues -->
Enforcement
Machine-enforceable: semi
Method: program
Name: qa_lib_structure
Message type: error
Comments: The DTD enumerates the possible values for type on <head>, but it does not require type, because the attribute is unnecessary when there is only one <head>. The QA program can only verify that, if multiple heads are present, each has the type attribute. It cannot determine whether multiple heads should have been used rather than a single <head> with line breaks (line breaks are often legitimate within a heading), or whether the appropriate type values have been used.
See also Half-titles, fly-titles, and divisional titles

Title Pages

Exactly one <titlePage> Major Structure: Title Pages
Description Normally a text should have exactly one <titlePage> element.
Remarks Although a text with more or fewer than one <titlePage> is theoretically possible and technically allowed by the DTD, such an occurrence is extremely rare and should be regarded as an encoding error unless proven otherwise.
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_structure
Message type: warning
Title types Major Structure: Title Pages
Description When using <titlePart> to mark the parts of the title, include the type attribute, assigning one of these values: "main", "sub", "desc" (descriptive), "alt" (alternative), or "volume" (for volume information).
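A minimal sketch of typed title parts, using bracketed placeholders rather than a real title page, might read:

```xml
<docTitle>
<titlePart type="main"> . . . [main title]</titlePart>
<titlePart type="sub"> . . . [subtitle]</titlePart>
<titlePart type="alt"> . . . [alternative title]</titlePart>
</docTitle>
```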
Enforcement
Machine-enforceable: yes
Method: DTD
Volume information on title page Major Structure: Title Pages
Description Volume information on the title page should be encoded as <titlePart type="volume">.
Remarks This rule holds true even if the volume information is separated from the title by the byline or other elements (<titlePart> is allowed outside <docTitle>).
Example
. . . </docTitle>
<byline>By <docAuthor>BOOKER T. WASHINGTON</docAuthor></byline>
<titlePart type="volume">VOLUME I</titlePart>
Enforcement
Machine-enforceable: no
Verso of title page Major Structure: Title Pages
Description The content on the verso (reverse side) of the title page should be included within the <titlePage> element, typically inside <docImprint>.
Example
<titlePage>
<pb/>
<docTitle>
<titlePart type="main">THE <lb/>
UNDERGROUND RAILROAD <lb/>
FROM <lb/>
SLAVERY TO FREEDOM</titlePart>
</docTitle>
<byline>BY <lb/>
<docAuthor>WILBUR H. SIEBERT</docAuthor>
ASSOCIATE PROFESSOR OF EUROPEAN HISTORY <lb/>
IN OHIO STATE UNIVERSITY</byline>
<titlePart type="desc"><hi rend="italic">WITH AN INTRODUCTION</hi>
BY <lb/>
ALBERT BUSHNELL HART
PROFESSOR OF HISTORY IN HARVARD UNIVERSITY</titlePart>
<docImprint>
<pubPlace><hi rend="gothic">New York</hi></pubPlace>
<publisher>THE MACMILLAN COMPANY <lb/>
LONDON: MACMILLAN &amp; CO., <hi rend="small-caps">Ltd.</hi></publisher>
<docDate>1899</docDate>
<hi rend="italic">All rights reserved</hi>
<pb/>
<hi rend="small-caps">Copyright</hi>, 1898, <lb/>
<hi rend="small-caps">By THE MACMILLAN COMPANY</hi>. <lb/>
<ornament type="line"/>
Set up and electrotyped December, 1898. &emsp; Reprinted September, <lb/>
1899. <lb/>
<lb/>
<lb/>
<lb/>
<hi rend="gothic">Norwood Press</hi> <lb/>
J. S. Cushing &amp; Co. &mdash; Berwick &amp; Smith <lb/>
Norwood Mass. U.S.A.
</docImprint>
</titlePage>
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_structure
Message type: warning
Comments: This practice is machine-enforceable in the sense that the QA program checks for exactly two <pb> elements within <titlePage>.

Genres

Letters

Openers and closers Genres: Letters
Description When encoding letters, prefaces, and other such personal writings, use <opener> and/or <closer> as needed to encode the opening and closing sections of the division.
Remarks <opener> and <closer> typically contain one or more of these elements:
  • <dateline> — groups together the place, date, etc., the letter was written; normally contains at least <name type="place"> and <date value="...">
  • <date> — contains a date in any format; use the value attribute to provide the date in standardized format; see Standardized date formats
  • <name> — contains a proper name; use type to indicate "person" or "place".
  • <salute> — salutation at the beginning (e.g. “Dear Sir”) or end (e.g. “Yours sincerely”) of a letter
  • <signed> — signature at the end of a letter, preface, etc.
Example
<div2 type="letter">
<head>TO MRS. H. LINCOLN.<ref target="n1"><hi rend="super">1</hi></ref>
<!-- <note id="n1" place="foot"> here -->
</head>
<opener>
<dateline>
<name type="place">Weymouth</name>, <date value="1761-10-05">5 October, 1761.</date>
</dateline>
<salute>MY DEAR FRIEND,</salute>
</opener>
<p><hi rend="small-caps">Does</hi> not my friend think me a stupid girl, when <lb/>
she has kindly offered to correspond with me, that <lb/>
I should be so senseless as not to accept the offer? <lb/>

<!-- continues -->

<p>I can say, in the length of this epistle, I've made <lb/>
the golden rule mine. Pray, my friend, do not let it <lb/>
be long before you write to your ever affectionate</p>
<closer>
<signed>A. S.</signed>
<seg type="postscript" rend="block">P.S. My regards to your good man. I've no <lb/>
acquaintance with him, but if you love him, I do, <lb/>
and should be glad to see him.</seg>
</closer>
</div2>
Enforcement
Machine-enforceable: no
See also Block quotations with opener/closer
Default alignment
Postscripts Genres: Letters
Description Postscripts in letters should be encoded using <seg type="postscript" rend="block"> within <closer>.
Example See preceding example
Enforcement
Machine-enforceable: no

Verse

Indented lines Genres: Verse
Description If a line of verse is indented more than the surrounding lines, use <l rend="indent">. In cases of multiple levels of indentation, use "indent" for the first level, "indent2" for the second level, and so on, up to "indent5".
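For illustration, a stanza with two levels of indentation could be sketched as follows. The bracketed text stands in for the actual lines of verse.

```xml
<lg>
<l> . . . [line at the left margin]</l>
<l rend="indent"> . . . [line indented one level]</l>
<l rend="indent2"> . . . [line indented two levels]</l>
<l> . . . [line at the left margin]</l>
</lg>
```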
Enforcement
Machine-enforceable: no
See also Representing alignment and indentation
Line breaks in verse Genres: Verse
Description When encoding verse it is important to distinguish between logical lines of verse and the physical presentation of those lines on the printed page. In cases where a line of verse is too long to fit on the printed page, and for that reason is continued on a second line, use <l> to mark the logical line of verse and <lb/> to mark the physical line break.
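A sketch of the distinction, with placeholder text in brackets: the first <l> below is a single logical line that the printer had to run over onto a second physical line.

```xml
<lg>
<!-- One logical line of verse, broken on the page because of its length -->
<l> . . . [first part of a long line, up to the right margin] <lb/>
. . . [remainder of the same line, printed on the next physical line]</l>
<l> . . . [next line of verse]</l>
</lg>
```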
Enforcement
Machine-enforceable: no
See also Line breaks

Drama

Encoding dramatic works Genres: Drama
Description Use the standard TEI elements for encoding plays and other dramatic texts.
Remarks
  • act and scene divisions should be marked as <divN> elements
  • speeches should be marked as <sp>, with speakers marked as <speaker>
  • stage directions should be marked as <stage>
  • castlists should be marked as <div1 type="castlist"> within <front>
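Taken together, the points above suggest a structure like the following sketch. The type values "act" and "scene" are assumptions (they are not listed in this section and should be checked against the enumerated vocabulary), and the bracketed text is placeholder content.

```xml
<!-- type values "act" and "scene" are assumed for illustration -->
<div1 type="act" n="I">
<head> . . . [act heading]</head>
<div2 type="scene" n="1">
<head> . . . [scene heading]</head>
<stage> . . . [opening stage direction]</stage>
<sp>
<speaker> . . . [speaker name]</speaker>
<p> . . . [speech in prose]</p>
</sp>
<sp>
<speaker> . . . [speaker name]</speaker>
<l> . . . [speech in verse, first line]</l>
<l> . . . [second line]</l>
</sp>
</div2>
</div1>
```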
Enforcement
Machine-enforceable: no

Newspapers

Layout Genres: Newspapers
Description Although the layout of a newspaper page is more complex than a book page, the usual elements are used: <pb> for page breaks, <cb/> for column breaks, and <cols/> for changes in columnar layout.
Enforcement
Machine-enforceable: no
See also Page Breaks
Column Breaks
Associating related sections Genres: Newspapers
Description Because newspaper articles are often broken up and printed in two (or more) non-contiguous sections, it is necessary to link the first section of the article with the subsequent section(s). Use the id, next and prev attributes on <divN> elements to achieve this.
Example
<div1 type="article" id="a1.3" next="a14.2">
<!-- first part of newspaper article -->
</div1>

<!-- multiple pages of content here -->

<div1 type="article" id="a14.2" prev="a1.3">
<!-- continuation of article -->
</div1>
Enforcement
Machine-enforceable: no
Comments: Because the datatype of next and prev is IDREF, the DTD enforces that these attributes point to an ID somewhere in the XML document. To verify that the ID is on a certain kind of div, or to verify that next points to an element whose prev attribute points back to it, a QA program would be necessary.

Encyclopedias

Encyclopedia entries Genres: Encyclopedias
Description Encyclopedia entries typically consist mainly of prose paragraphs and do not normally pose any special markup issues. Each encyclopedia entry is a <divN> element containing one or more headings followed by paragraphs.
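A minimal sketch of one entry follows. The value "entry" is the type listed for encyclopedia entries under Typed divs; the division level (<div2> here) is an assumption and depends on how the surrounding volume is structured.

```xml
<div2 type="entry">
<head> . . . [entry heading]</head>
<p> . . . [first paragraph of the entry]</p>
<p> . . . [second paragraph of the entry]</p>
</div2>
```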
Enforcement
Machine-enforceable: no

Dictionaries

Simple entries Genres: Dictionaries
Description In the simplest case, a dictionary entry has minimal grammatical information and only one definition.
Example After-night, n. The time after it becomes night.
<entry>
<form><orth><hi rend="bold">After-night,</hi></orth></form>
<gramGrp><pos><hi rend="italic">n.</hi></pos></gramGrp>
<def>The time after it becomes night.</def>
</entry>

Enforcement
Machine-enforceable: no
Alternate spellings and usage examples Genres: Dictionaries
Description Alternate spellings should be marked with <orth type="alt">. Usage examples should be marked with <eg>.
Remarks The <eg> element does not allow character data; instead, <eg> must contain <q> (for examples with no attributed source) or <cit> (for examples that include an attribution of the author or source text).
NOTE

Because DLPS normally uses <q> for block quotations, when using <q> in a dictionary entry please indicate <q rend="inline">, as shown in the following example.

Example Again, conj. Agen; agin: By the time that, untill: “I’ll have
     it ready agin you come.”
<entry rend="hang">
<form><orth><hi rend="bold">Again,</hi></orth></form>
<gramGrp><pos><hi rend="italic">conj.</hi></pos></gramGrp>
<form><orth type="alt"><hi rend="italic">Agen; agin:</hi></orth></form>
<def>By the time that, untill:</def>
<eg><q rend="inline">"I'll have <lb/>
it ready <hi rend="italic">agin</hi> you come."</q></eg>
</entry>

Enforcement
Machine-enforceable: no
Multiple homographs and multiple meanings Genres: Dictionaries
Description More complex dictionary entries may include more than one form of the same word — that is, multiple homographs (words identical in spelling but different in meaning or pronunciation), each marked with <hom>. Entries may also include more than one meaning for the same word, in which case the information (definitions, examples, etc.) for each meaning should be grouped as a <sense>. If the senses are labeled with numbers or letters in the print source, include the label in the n attribute.
Example Against, prep. In resistance to; or defense from “They
      marched against the Spaniards.” (2.) Opposite. “Over
      against a point called Sandy Point.” Against, conj. “Keep
      ’em against I come.”
<entry rend="hang">
<hom>
<form><orth><hi rend="bold">Against,</hi></orth></form>
<gramGrp><pos><hi rend="italic">prep.</hi></pos></gramGrp>
<sense>
<def>In resistance to; or defense from</def>
<eg><q rend="inline">"They <lb/>
marched <hi rend="italic">against</hi> the Spaniards."</q></eg>
</sense>
<sense n="2">
(2.) <def>Opposite.</def>
<eg><q rend="inline">"Over <lb/>
<hi rend="italic">against</hi> a point called Sandy Point."</q></eg>
</sense>
</hom>
<hom>
<form><orth>Against,</orth></form>
<gramGrp><pos><hi rend="italic">conj.</hi></pos></gramGrp>
<eg><q rend="inline">"Keep <lb/>
'em <hi rend="italic">against</hi> I come."</q></eg>
</hom>
</entry>

Enforcement
Machine-enforceable: no
Super entries Genres: Dictionaries
Description In cases where words with identical spellings (homographs) receive separate entries in the dictionary (rather than being included within a single entry), each entry should be marked as an <entry> as usual, but then the group of entries should be wrapped in a <superEntry> element.
Example

<superEntry>
<entry rend="hang">
<form><orth><hi rend="bold">Pitch,</hi></orth></form>
<gramGrp><pos><hi rend="italic">n.</hi></pos></gramGrp>
<def>The height of anything.</def>
<eg><q rend="inline">"The roof was ten feet <lb/>
<hi rend="italic">pitch</hi>."</q></eg>
<eg><q rend="inline">"Tester bedstead 7&half; feet <hi rend="italic">pitch</hi>."</q></eg>
</entry>
<entry rend="hang">
<form><orth><hi rend="bold">Pitch,</hi></orth></form>
<gramGrp><pos><hi rend="italic">v.</hi></pos></gramGrp>
<sense>
<def>To pitch in, to begin; set to work with promptness <lb/>
or energy.</def>
</sense>
<sense n="2">
(2.) <def><hi rend="italic">To pitch into</hi>, to attack; assault.</def>
</sense>
</entry>
<entry rend="hang">
<form><orth><hi rend="bold">Pitch,</hi></orth></form>
<gramGrp><pos><hi rend="italic">v.</hi></pos></gramGrp>
<def>To sit down; to light.</def>
<eg><q rend="inline">"I saw wild geese <hi rend="italic">pitch</hi> <lb/>
in the wheatfields."</q></eg>
</entry>
<entry rend="hang">
<form><orth><hi rend="bold">Pitch,</hi></orth></form>
<gramGrp><pos><hi rend="italic">v.</hi></pos></gramGrp>
<def>To plant.</def>
<eg><q rend="inline">"I have already <hi rend="italic">pitched</hi> my crop."</q></eg>
</entry>
</superEntry>
<superEntry>
<entry rend="hang">
<form><orth><hi rend="bold">Pitcher,</hi></orth></form>
<gramGrp><pos><hi rend="italic">n.</hi></pos></gramGrp>
<def>A vessel of various sizes with one handle and a <lb/>
lip-spout for holding water or other liquids. A basin and <lb/>
<hi rend="italic">pitcher</hi>. Never called <hi rend="italic">jug</hi>.</def>
</entry>
<entry rend="hang">
<form><orth><hi rend="bold">Pitcher,</hi></orth></form>
<gramGrp><pos><hi rend="italic">n.</hi></pos></gramGrp>
<def>The man who pitches the sheaves of wheat up on <lb/>
the cart or stack, by means of a pitch-fork.</def>
</entry>
</superEntry>
Enforcement
Machine-enforceable: no

Block-level Features

Block Quotations

Block quotations Block-level Features: Block Quotations
Description Block quotations should be encoded using the <q> element.
Remarks By block quotation we simply mean a quotation set off from the surrounding text by one or more of these typographic changes:
  • set off by line breaks
  • indented
  • in a smaller typeface

<q> is always used for block quotations, irrespective of whether or not the narrator/author attributes the quotation to an external source; that is, the <quote> element should never be used.

Enforcement
Machine-enforceable: no
See also Quoted material
<q> and quotation marks Block-level Features: Block Quotations
Description <q> should never be used to replace quotation marks. If the quotation is both set off typographically from the surrounding text and enclosed in quotation marks, use the <q> element and also include the quotation marks.
Example
forward into "the pen." The phrase forcibly <lb/>
recalled Milton's lines&mdash; <lb/>
<q><lg>
<l>"Blind mouths! that scarce themselves know how to hold</l>
<l>A sheep-hook, or have learned aught else, the least</l>
<l>That to the faithful herdsman's art belongs!</l>
<l>&mdash;But when they list their lean and flashy songs,</l>
<l>Grate on their scrannel pipes of wretched straw;&mdash;</l>
<l rend="indent">The hungry sheep look up, and are not fed!</l>
<l>But swoln with wind, and the rank mist they draw,</l>
<l rend="indent">Rot inwardly&mdash;and foul contagion spread."</l>
</lg></q>
"The pen" was the space immediately below <lb/>
the preachers' stand; we were therefore placed <lb/>
Enforcement
Machine-enforceable: no
Block quotations with opener/closer Block-level Features: Block Quotations
Description For block quotations requiring <opener> and/or <closer>, use <q><text><body><div1>. If the quoted text is a letter (the most common case), use <q><text><body><div1 type="letter">. If the quoted text is not a letter or other type for which an appropriate type value exists, use <q><text><body><div1 type="quotation">.
Example
passage of the Kansas-Nebraska act. The results secured by <lb/>
the two circulars will be seen in the following letter from <lb/>
Francis Jackson, of Boston, to his fellow-townsmen and co- <lb/>
worker, the Rev. Theodore Parker.</p>
<q><text><body><div1 type="letter">
<opener>
<dateline>
<name type="place"><hi rend="small-caps">Boston</hi></name>,
<date value="1854-08-27">Aug. 27,1854</date>.
</dateline>
<salute><hi rend="small-caps">Theodore Parker</hi>:</salute>
</opener>
<p><hi rend="italic">Dear Friend</hi>,&mdash; The contributions of the churches in behalf of <lb/>
the fugitive slaves I think have about all come in. I herewith <lb/>
<!-- letter continues -->
have been. Those societies who have contributed, I judge were <lb/>
least able to do so.</p>
<closer>
<signed><hi rend="small-caps">Francis Jackson</hi>.<ref target="n4.1"><hi rend="super">1</hi></ref>
<note id="n4.1" place="foot"><seg type="note-symbol"><hi rend="super">1</hi></seg><p>Theodore Parker's 
<hi rend="italic">Scrap-book</hi>, Boston Public Library.</p></note>
</signed>
</closer>
</div1></body></text></q>
<p>The political affiliations of underground helpers before <lb/>
1840 were, necessarily, with one or the other of the old <lb/>
Enforcement
Machine-enforceable: no
See also Openers and closers

Figures and Ornaments

Captions and associated text Block-level Features: Figures and Ornaments
Description When using the <figure> element to indicate graphical content (illustrations, photographs, maps, etc.), use the <head> element to record the caption of the figure (if any). Use the <p> element to record text (if any) that is associated with the figure but is not part of the caption.
Remarks If the graphic has a credit or byline, use the <byline> element within <figure>.
Example
<pb/>
<figure>
<head>ONE OF THE PIONEERS IN THE UNDERGROUND MOVEMENT IN <lb/> 
PHILADELPHIA AND NEW YORK.</head>
<p>Mr. Hopper is supposed to have resorted to underground methods as early as 1787.</p>
</figure>
Enforcement
Machine-enforceable: no
rend attribute on <figure> Block-level Features: Figures and Ornaments
Description The rend attribute is required on <figure>.
Remarks rend indicates how the figure appears on the page in the physical source (not how it is intended to appear on screen in an electronic display).

The allowed values are:

  • inline — figure occurs on same line as surrounding text content
  • block — figure is set off by line breaks, with no text to the left or right of the figure
  • page — figure occupies the entire page
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_figures
Message type: error
Reason Recording the relation of the figure to the surrounding text is inherently good practice. It also facilitates delivery.
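The three allowed values of rend might be applied as in the following sketch. The captions and surrounding text are invented for illustration; only the rend values themselves come from the list above.

```xml
<!-- inline: the figure occurs on the same line as surrounding text -->
<p>The mark of the printer <figure rend="inline"/> appears on each sheet.</p>

<!-- block: the figure is set off by line breaks, with no text beside it -->
<figure rend="block">
<head>MAP OF THE ROUTE.</head>
</figure>

<!-- page: the figure occupies the entire page -->
<pb/>
<figure rend="page">
<head>PORTRAIT OF THE AUTHOR.</head>
</figure>
```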
A figure at the start of a division: <frontispiece> Block-level Features: Figures and Ornaments
Description A <frontispiece> is a figure that occurs at or near the beginning of a structural division.
Remarks The <frontispiece> element is identical to the <figure> element except that, unlike a <figure>, a <frontispiece> can occur at the top of a <divN> element.

There are cases where an illustration is the first component in a chapter or other structural division. This occurs most often when the figure is a full-page illustration on the page immediately preceding the first page of content for that division, and the illustration is clearly related to the content that follows it, not the content that precedes it. In such cases, the illustration should be marked as a <frontispiece> at the beginning of the new <divN>, not as a <figure> at the end of the preceding <divN>.

Example page images
<div1 type="section">
<pb entity="b000234935_0130"/>
<frontispiece entity="b000234935_0130_0" rend="page">
<head><hi rend="small-caps">Patrick Henry.</hi></head>
</frontispiece>
<pb entity="b000234935_0131"><fw type="sig" place="bottom-left">8</fw></pb>
<head type="main">THE FAMOUS <lb/>
Revolution Speech of Patrick Henry,</head>
<head type="sub">DELIVERED BEFORE THE VIRGINIA CONVENTION IN ST. JOHN'S <lb/>
CHURCH, 1775.</head>
<ornament type="line"/>
<p>"Mr. President," said he, "it is natural to man to indulge <lb/>
in the illusions of hope. We are apt to shut our eyes against <lb/>
a painful truth and listen to the song of that siren, till she <lb/>
Enforcement
Machine-enforceable: no
Printer’s ornaments Block-level Features: Figures and Ornaments
Description Printer’s ornaments do not qualify as figures. Instead, encode printer’s ornaments with <ornament type="ornament"/>.
Enforcement
Machine-enforceable: no
Horizontal lines Block-level Features: Figures and Ornaments
Description Horizontal lines should be encoded using the <ornament> element. For true horizontal lines, set type to "line". For a string of asterisks, periods, etc. that functions as a horizontal line, set type to "characters" and include the characters as the content of the <ornament> element.
Example
<ornament type="line"/>

<ornament type="characters">*&emsp;*&emsp;*&emsp;*&emsp;*&emsp;*&emsp;*&emsp;*</ornament>
Enforcement
Machine-enforceable: semi
Method: program
Name: qa_lib_misc
Message type: error
Comments: The qa_lib_misc QA program verifies that an <ornament> element with type="characters" contains text (is not empty), and that an <ornament> element with type="line" or type="ornament" is empty. But there is no machine-enforceable way to ensure that horizontal lines are marked properly, or marked at all.

Tables

Label cells Block-level Features: Tables
Description For cells that contain a label or heading, rather than data, use <cell role="label">.
Remarks For cells containing data, there is no need to include the role attribute; "data" is the default value.
Example page image
<div1 type="contents">
<pb/>
<head>CONTENTS <lb/>
<hi rend="small-caps">of</hi> <lb/>
VOLUME I.</head>
<table>
<row>
<cell role="label" cols="2" rend="center">CHAPTER I.</cell>
</row>
<row>
<cell role="label" cols="2" rend="right"><hi rend="small-caps">page</hi></cell>
</row>
<row>
<cell>Entrance of the Mississippi&mdash;Balize</cell>
<cell rend="right">1</cell>
</row>
<row>
<cell role="label" cols="2" rend="center">CHAPTER II.</cell>
</row>
<row>
<cell rend="hang">New Orleans&mdash;Society&mdash;Creoles and Quadroons&mdash;Voy- <lb/>
age up the Mississippi</cell>
<cell rend="right">7</cell>
</row>
<!-- continues -->
Enforcement
Machine-enforceable: no
Reason Facilitates delivery: equates to <th> in HTML.
Spanning rows or columns Block-level Features: Tables
Description If a cell occupies more than one row or column, use the rows or cols attribute, respectively, on the <cell> start-tag.
Remarks This usage is equivalent to the use of the rowspan and colspan attributes on <td> and <th> in HTML.
Example See preceding example
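The preceding example uses only cols. A sketch that also uses rows might look like the following; the table content is invented for illustration.

```xml
<table>
<row>
<!-- this label cell occupies two rows -->
<cell role="label" rows="2">Town</cell>
<!-- this label cell occupies two columns -->
<cell role="label" cols="2" rend="center">Printing houses</cell>
</row>
<row>
<cell role="label">1760</cell>
<cell role="label">1775</cell>
</row>
<row>
<cell>New London</cell>
<cell rend="right">1</cell>
<cell rend="right">2</cell>
</row>
</table>
```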
Enforcement
Machine-enforceable: no
Reason Facilitates delivery: equates to rowspan and colspan in HTML.
Tables vs. lists Block-level Features: Tables
Description In some cases, the choice between <table> and <list> may not be obvious, but typically any items of text that are intended to line up vertically should be encoded as a <table>. A table of contents, list of illustrations, etc. should almost always be marked up as a table.
Enforcement
Machine-enforceable: no

Lists

Nested lists Block-level Features: Lists
Description Lists can be nested (a list <item> can contain a <list>). A common use of nested lists is for indexes where each entry contains indented sub-entries.
Example page image and example markup
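A minimal sketch of a nested list for an index follows. The index entries and page numbers are invented for illustration; any attributes that local practice requires on <list> or <item> are omitted here.

```xml
<list>
<item>Canada
<!-- sub-entries are encoded as a second <list> inside the <item> -->
<list>
<item>refugees in, 190</item>
<item>routes to, 145</item>
</list>
</item>
<item>Ohio
<list>
<item>anti-slavery societies in, 38</item>
</list>
</item>
</list>
```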
Enforcement
Machine-enforceable: no

Notes

Anchored note references: <ref> Block-level Features: Notes
Description In cases where the note reference is indicated by a number or symbol, as is almost always true of footnotes and endnotes, use <ref> to encode the note reference.
Remarks By note reference we mean the anchor point for the annotation within the flow of the main text, typically indicated with a superscript number or symbol. By note body we mean the content of the annotation.
Example page image and example markup
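A minimal sketch of an anchored footnote follows, modeled on the closer example earlier in this document. The text and the ID are invented for illustration. The n attribute is omitted because, as described under "n attribute for notes", it is added programmatically.

```xml
<p>The statute was first enforced in that year.<ref target="n12"><hi rend="super">12</hi></ref>
<!-- the note body immediately follows its reference -->
<note id="n12" place="foot"><seg type="note-symbol"><hi rend="super">12</hi></seg>
<p>See the appendix for the full text.</p></note> It remained in <lb/>
force for many years.</p>
```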
Enforcement
Machine-enforceable: semi
Method: DTD
Comments: Enforced by the DTD insofar as the DTD disallows the use of target on <note> (VENDOR mode only), thereby disallowing the use of <note target="..."> to encode the note reference (at the point of initial/vendor encoding).
Unanchored note references: <ptr/> Block-level Features: Notes
Description In cases where no number or other referencing symbol is present, as is common for marginal notes, use <ptr/> to supply an anchor point for the annotation.
Example page image and example markup
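A minimal sketch of a marginal note with no printed reference symbol follows. The text and the ID are invented for illustration; the anchored attribute is included as required under "Unanchored notes".

```xml
<!-- <ptr/> supplies the anchor point; there is no printed symbol to transcribe -->
<p><ptr target="n2.7"/><note id="n2.7" place="left" anchored="no"><p>Arrival at <lb/>
Cincinnati.</p></note>We reached the city on the evening of the third day, <lb/>
and found lodging near the landing.</p>
```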
Enforcement
Machine-enforceable: no
See also Unanchored notes
target attribute required for note reference Block-level Features: Notes
Description Whether using <ref> or <ptr/> to encode the note reference, the target attribute is required, and its value must match the value of the id attribute of the corresponding <note>.
Enforcement
Machine-enforceable: semi
Method: program
Name: qa_lib_notes
Message type: error
Comments: This rule is machine-enforceable insofar as the DTD requires target on <ref> and <ptr/>, where target (datatype IDREFS) must refer to an ID within the XML document. The QA program further requires (unless place="inline") that each <note> be pointed to by a <ref> or <ptr/> element (not by some other element). But no DTD or QA program can ensure that a <ref> or <ptr/> points to the correct corresponding <note>, and in this regard this rule is unenforceable (except by painstaking human effort).
Placement of <note> elements within the XML document Block-level Features: Notes
Description With the exception of endnotes, which are already located in a separate section in the print source and should not be moved, the <note> element should be placed at the point of the note’s attachment in the main text — that is, immediately after the corresponding <ref> or <ptr/> element.
Example page image and example markup
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_notes
Message type: warning
Note symbols Block-level Features: Notes
Description When the note body includes the referencing symbol (a number, *, †, etc.), record this symbol using <seg type="note-symbol"> as the first element within <note>.
Example page image and example markup
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_notes
Message type: warning
Comments: Because a note symbol is almost always printed for anchored notes (footnotes and endnotes), the QA program issues a warning if the first child element of <note> is not <seg type="note-symbol">, unless anchored="no".
Reason Isolation of the note symbol is expected to facilitate delivery, especially if the delivery system chooses to suppress the printed note symbol and instead use the n value for display.
See also Unanchored notes
Note symbol on <note> vs. note symbol on <ref> Block-level Features: Notes
Description The content of <note><seg type="note-symbol"> should match the content of that note’s corresponding <ref>.
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_notes
Message type: warning
Comments: The reason this rule applies to “POSTKB” only is that the vendor is not authorized to make corrections. If the note symbols on <ref> and <note> do not match due to a printing error, DLPS can correct it using <corr>; the vendor cannot.
n attribute for notes Block-level Features: Notes
Description The n attribute is required on both the note reference (<ref> or <ptr/>) and the note body (<note>). Its value should be a label for display, which may or may not be equivalent to the note symbol transcribed from the print source (for note reference, content of <ref>; for note body, content of <note><seg type="note-symbol">).
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_notes
Message type: error
Comments: The QA program requires n on <note> (except in <teiHeader>), and also on <ref> and <ptr/>.

Compliance with this practice is automated: The processing script notes_n programmatically adds the n attribute on <note>, <ref>, and <ptr/> elements.

Reason Facilitates delivery. If n is always present and always carries a label for display, delivery of notes is greatly facilitated, without interfering at all with the transcriptional content.
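A sketch of how a note reference and note body might look once n has been added follows. The n values shown are assumptions made for illustration; this document does not specify which labels the notes_n script assigns.

```xml
<!-- the printed symbol (*) is transcribed as content; n carries a display label -->
<ref target="n6.35" n="35">*</ref>
<note id="n6.35" place="foot" n="35"><seg type="note-symbol">*</seg>
<p>...</p></note>
```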
n on note reference vs. n on note body Block-level Features: Notes
Description The value of n on <note> must match the value of n on that note’s corresponding <ref> or <ptr/>.
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_notes
Message type: error
id and place on <note> Block-level Features: Notes
Description The id and place attributes are required on <note>, except in <teiHeader>.
Remarks When creating IDs for notes, use a simple, compact, human-readable scheme. For notes that are already numbered in the print source, include the number in the ID. Base the ID scheme on the referencing scheme used in the print source:
  • numbered sequentially throughout — If the notes are numbered sequentially throughout the entire work, use the printed note number, prefixed with “n” (short for “note”). For example, "n1", "n2", "n3", etc.
  • numbered within each chapter — If the note numbering starts over at 1 in each chapter, create a unique ID by including the chapter number as well as the note number. For example, the note IDs for the third chapter would be "n3.1", "n3.2", "n3.3", etc.
  • identified on each page — If the note numbers or symbols start over on each page, create a unique ID by numbering the notes sequentially within each chapter. For example, if the 35th footnote in chapter VI is indicated by an asterisk (*), the id would be "n6.35" and the note body would be marked: <note id="n6.35" place="foot"><seg type="note-symbol">*</seg>...</note>

The possible values for place are:

  • end — note appears in a separate division containing all the notes for a given section (endnotes)
  • foot — note appears in bottom margin (footnotes)
  • left — note appears in left margin (marginal notes)
  • right — note appears in right margin (marginal notes)
  • head — note appears in top margin; rarely used
  • inline — note appears within the main body of the text
  • above — note appears between the lines of the main text, above the line to which it refers; rarely used
  • below — note appears between the lines of the main text, below the line to which it refers; rarely used
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_notes
Message type: error
Comments: Because <note> is needed in <teiHeader>, the requirement of id and place cannot be DTD-enforced at the post-keyboarding stage (as it is for vendor markup). In the context of <teiHeader>, id is not necessary (no <ref> refers to the <note>) and place is not applicable (there is no physical placement, no print source for the note).
Unanchored notes Block-level Features: Notes
Description If a <note> is unanchored (that is, the print source has no printed note reference, so the note is pointed to by <ptr/> rather than <ref>; this is typical of marginal notes), include the anchored attribute with value "no".
Example page image and example markup
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_notes
Message type: error
Reason Explicitness; could facilitate delivery
See also Unanchored note references: <ptr/>
Multiple references to a single note Block-level Features: Notes
Description It is possible for multiple <ref> or <ptr/> elements to point to a single <note> element. In such cases, the <note> element should not be repeated. Instead, the <note> should be transcribed once and placed after the first <ref> or <ptr/> that points to that note.
Example page image and example markup
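A minimal sketch of two references to one footnote follows. The text and the ID are invented for illustration.

```xml
<p>The first statement<ref target="n1.3"><hi rend="super">3</hi></ref>
<!-- the note is transcribed once, after the first reference -->
<note id="n1.3" place="foot"><seg type="note-symbol"><hi rend="super">3</hi></seg>
<p>...</p></note> and a later statement<ref target="n1.3"><hi rend="super">3</hi></ref>
<!-- the second reference points to the same id; the note is not repeated -->
rest on the same authority.</p>
```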
Enforcement
Machine-enforceable: no

Other Block-level Features

Other features Block-level Features: Other Block-level Features
Description Arguments, bibliographic citations, epigraphs, and trailers should be marked using the appropriate TEI elements.
Example An epigraph containing a quotation, along with attribution of its source:
<epigraph>
<cit>
<q>"I have sworn upon the altar of God <lb/>
eternal hostility against every form of tyranny <lb/>
over the mind of man."</q>
<bibl><author> &mdash; <hi rend="italic">Thomas Jefferson.</hi></author></bibl>
</cit>
</epigraph>

A trailer:

<trailer>FINIS.</trailer>
Enforcement
Machine-enforceable: no
Arguments Block-level Features: Other Block-level Features
Description An argument is a block of content near the beginning of a <divN> containing a summary or overview of that section. It is often (though certainly not always) presented as a series of topics separated by long dashes. An argument should be marked as an <argument>, not as a second <head> or as the first <p>.
Example page image
<div1 type="chapter" n="VIII">
<pb n="101"/>
<head>CHAPTER VIII.</head>
<argument>
<p><hi rend="italic">Absence of public and private Amusement&mdash;Churches <lb/>
and Chapels&mdash;Influence of the Clergy&mdash;A Revival.</hi></p>
</argument>
<p><hi rend="small-caps">I never</hi> saw any people who appeared to live so <lb/>
much without amusement as the Cincinnatians. <lb/>
Enforcement
Machine-enforceable: no

Phrase-level Features

Changes in Typeface

Changes in typeface Phrase-level Features: Changes in Typeface
Description With the exception of foreign phrases, the vendor has been instructed to mark changes in typeface as physical changes, not with a logical element such as <emph>, <title>, <term>, <mentioned>, etc. At the post-keyboarding stage, this practice continues. For the sake of consistency, and as a matter of practical necessity, DLPS does not normally undertake the enhanced, logical markup of changes in typeface.
Remarks The most common values for the rend attribute on <hi> are:
  • italic
  • bold
  • underline
  • super — superscript
  • sub — subscript
  • small-caps
  • gothic

Less common but valid values are:

  • line-through
  • open
  • overline
  • red-letter
  • roman — assumed and not normally necessary
  • script
  • slash-through
  • other — indicate rendering using the other attribute, as in: <hi rend="other" other="..."> (Of course, <hi> should be used in this way only as a last resort.)

The value "gothic" should be used for the gothic or black-letter style of typeface. In modern printed books, gothic type is typically used to highlight a name or brief passage.

page image

<docImprint>
<pubPlace><hi rend="gothic">New York</hi></pubPlace>
<publisher>THE MACMILLAN COMPANY <lb/>
LONDON: MACMILLAN &amp; CO., <hi rend="small-caps">Ltd.</hi></publisher>
<docDate>1899</docDate>
<hi rend="italic">All rights reserved</hi>
</docImprint>
Enforcement
Machine-enforceable: no
Comments: Enforced by the DTD in that rend is required on <hi>, and rend has an enumerated vocabulary. But as always, the appropriate use of <hi> and its available rend values is not machine-enforceable.
Reason While ideally changes in typeface would always be encoded with the appropriate logical element, in practice this is not feasible for vendor-produced markup. The encoders may not be native speakers of English and should not be expected to make the appropriate semantic distinctions or determine the author’s rhetorical intention. Instead, such distinctions should be reserved for second-pass markup by a native speaker of English, perhaps even a subject-matter specialist. Note, however, that such second-pass markup is not a part of the DLPS workflow. Normally a text will receive such enhanced, logical markup only if someone outside DLPS (faculty member, Info Comm, Etext Center, etc.) happens to take an interest in the text and brings the resources (funding, staff, etc.) to undertake the additional markup.
Small caps Phrase-level Features: Changes in Typeface
Description Text that is printed in small caps should be transcribed using both upper-case and lower-case letters, not all upper-case letters.
Example page image
<div2 type="chapter">
<pb n="184"/>
<head>CONNECTICUT.</head>
<p>There was no press in this colony until 1709; and, I <lb/>
believe, not more than four printing houses in it before <lb/>
1775.</p>
<div3 type="section">
<head><hi rend="small-caps">New London</hi>.</head>
<p>The first printing done in Connecticut was in that town; <lb/>
<!-- continues -->
Enforcement
Machine-enforceable: no
Reason This is the simplest way to distinguish the fully capitalized letters from the small-caps letters. The alternative method of marking small caps would be to mark only the small-caps letters, leaving the fully capitalized letters unmarked, for example:
H<hi rend="small-caps">ERE</hi>, A<hi rend="small-caps">BANDON</hi>
A<hi rend="small-caps">LL</hi> H<hi rend="small-caps">OPE</hi>

rather than simply

<hi rend="small-caps">Here, Abandon All Hope</hi>

The former is ugly, tag-heavy, and unnecessary.

Alignment and Indentation

Representing alignment and indentation Phrase-level Features: Alignment and Indentation
Description When indicating alignment or indentation, use the rend attribute, either on structural elements (<p>, <l>, <cell>, <item>, etc.) or on <hi>, as appropriate to the situation.
Remarks For indicating alignment, the available rend values are:
  • center
  • left — assumed and not normally necessary
  • right

For indicating indentation, the available rend values are:

  • indent
  • indent2, indent3, indent4, indent5 — for cases where more than one level of indentation needs to be recorded (Use these values sparingly, and only when "indent" has already been used. Normally these values are only needed when encoding lines of verse.)
  • hang — for hanging indentation — that is, when the first line of content is further left than subsequent lines; common in lists, such as indexes
Enforcement
Machine-enforceable: no
Comments: Enforced by the DTD in the sense that rend has an enumerated vocabulary, but as always, the appropriate use of the available rend values is not machine-enforceable.
Reason Originally the DTD Practices Group recommendations limited global rend to "block", "inline", or "none", with the intention of encouraging the nesting of typographic markup within structural markup (for example, <p><hi rend="bold">...</hi></p> rather than <p rend="bold">...</p>), partly for inherent logical/semantic reasons and partly to facilitate writing stylesheets for delivery. In practice this came to seem counterintuitive for marking alignment and indentation, especially for marking lines of verse, where indentation markup is common. We decided to consider alignment and indentation as properties of the elements themselves, not as a form of highlighting (for example, <l rend="indent">...</l> is preferable to <l><hi rend="indent">...</hi></l>, since the line itself is indented; the text within the line is not being highlighted). That is, we decided to distinguish between display ("block", "inline", or "none"), alignment ("center", "left", or "right"), and indentation ("indent", "indent2" etc., or "hang") on the one hand, and typographic changes on the other.
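The rend values listed in the Remarks might be applied as in the following sketch. The text is invented for illustration, and any attributes that local practice requires on <lg> are omitted.

```xml
<!-- indentation is recorded on the line itself, not with <hi> -->
<lg>
<l>The first line is set flush left,</l>
<l rend="indent">the second is indented once,</l>
<l rend="indent2">the third is indented twice.</l>
</lg>

<!-- alignment on a structural element -->
<p rend="center">Printed for the Author.</p>

<!-- hanging indentation on a list item -->
<item rend="hang">Underground Railroad, origin of the name, routes <lb/>
through Ohio, 37</item>
```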
Default alignment Phrase-level Features: Alignment and Indentation
Description Some elements have a presumed or default alignment and do not normally require explicit alignment markup.
Remarks
  • <head> — center
  • <table> — center
  • <figure> — center
  • elements within <titlePage> — center
  • <trailer> — center
  • <ornament> — center
  • <dateline> — right
  • <salute> within <opener> — left
  • <salute> within <closer> — right, with some indentation toward the left
  • <signed> — right
  • all other elements — left
These elements should contain alignment markup only when the layout of the element on the printed page differs from the defaults listed above.
Enforcement
Machine-enforceable: no
Comments: This practice is unenforceable. Its usefulness (apart from saving keystrokes) depends entirely on the delivery system being aware of these default alignments.
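A sketch of alignment markup used only where the printed layout departs from the defaults follows. The text is invented for illustration.

```xml
<!-- centered heading: matches the default, so no rend is needed -->
<head>CHAPTER IV.</head>

<!-- heading printed flush left: differs from the default, so rend is recorded -->
<head rend="left">A Note on the Text.</head>

<!-- signature printed in the center rather than at the right -->
<signed rend="center">The Editor.</signed>
```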

Foreign Phrases

Use of <foreign> Phrase-level Features: Foreign Phrases
Description Words or phrases that are both (a) typographically distinct (usually in italics), and (b) not in the main language of the text, should be marked with the <foreign> element.
Enforcement
Machine-enforceable: no
Reason The reasons for requiring that foreign phrases must be typographically distinct to warrant <foreign> are that (a) the keyboarding vendor can know unambiguously when to use <foreign>, and (b) if the phrase is not italicized or otherwise distinct, it is presumably so common as to have lost its foreign-ness, at least in the author’s (or copy editor’s, or typesetter’s) estimation.
lang attribute on <foreign> Phrase-level Features: Foreign Phrases
Description Whenever possible, include the lang attribute on <foreign>, using one of the standard ISO 639-2 three-character language codes. In cases where the language is not obvious, mark the phrase with <foreign> but omit lang.
Remarks Commonly used ISO 639-2 codes include:
  • fre — French
  • ger — German
  • grc — Greek, ancient (to 1453)
  • gre — Greek, modern (1453- )
  • heb — Hebrew
  • ita — Italian
  • lat — Latin
  • rus — Russian
  • spa — Spanish
Enforcement
Machine-enforceable: no
Comments: While a QA program could issue a warning if a <foreign> element has no lang attribute, we have not deemed it a good use of the QA tech’s time to manually double-check/second-guess the vendor’s omission of the lang attribute.
Declaring languages Phrase-level Features: Foreign Phrases
Description Each language identified by a lang attribute (on <foreign>, or on any other element) must be declared in a <language> element within the <teiHeader>.
Example In the <teiHeader>:
<profileDesc>
<langUsage>
<language id="eng" usage="main">English</language>
<language id="fre">French</language>
</langUsage>
</profileDesc>

In the body of the text:

This the reader is willing to <lb/>
accept as a possible occurrence; but when she <lb/>
goes on to say that having completed his dem- <lb/>
onstration, this person triumphantly replaced <lb/>
his pencil-case, and with his feet upon the chimney- <lb/>
piece whistled Yankee Doodle, <foreign lang="fre"><hi rend="italic">c'est un peu <lb/>
trop fort</hi></foreign>, and we are probably justified in set- <lb/>
ting it down as a bit of literary colour.</p>
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_foreign
Message type: error
Comments: Although the lang attribute is declared as type IDREF, a validating parser can only verify that the value of lang corresponds to an ID somewhere in the XML document. The QA program checks whether lang actually corresponds to a <language id="..."> within teiHeader/profileDesc/langUsage.
Retaining typographic distinction Phrase-level Features: Foreign Phrases
Description Using the <foreign> element does not eliminate the need to encode the change in typeface using <hi>.
Remarks Since foreign phrases are usually italicized, typical markup for a foreign phrase will be: <foreign lang="..."><hi rend="italic">...</hi></foreign>
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_foreign
Message type: warning
Reason Explicitness. This approach is preferable to assuming that foreign content is italicized, on the same principle as avoiding the use of default values for attributes: if there’s a default value, encoders tend to ignore the attribute altogether and thus forget to use it when it’s actually applicable, resulting in erroneous (or missing) markup.
Not roman but not Asian Phrase-level Features: Foreign Phrases
Description Languages such as Greek, Hebrew, and Russian fall into a special category. They require non-roman characters, but they are alphabetic, not ideographic. If the language is within the vendor’s capabilities, the foreign content should be included in the electronic transcription. If the language is not within the vendor’s capabilities, omit the characters from the transcription and use the <gap/> element to mark the location of the omitted characters.
Remarks When transcribing these kinds of languages, use the appropriate character entities where available (that is, for Greek), or XML character references with Unicode hexadecimal values:
  • Greek — Use the iso-grk1.ent character entities, supplemented as needed by the accented characters in iso-grk2.ent
  • Hebrew — Use the Hebrew block of Unicode (0590 - 05FF)
  • Russian — Use the Cyrillic block of Unicode (0400 - 04FF)
Enforcement
Machine-enforceable: no
See also Use of <gap/>
Special Characters
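The transcription methods listed above might be applied as in the following sketch. The sample words are chosen only for illustration, and <foreign> and typeface markup are omitted for brevity.

```xml
<!-- Greek "logos": iso-grk1.ent entities, with the accented omicron from iso-grk2.ent -->
&lgr;&oacgr;&ggr;&ogr;&sfgr;

<!-- Hebrew "shalom": character references from the Hebrew block (0590 - 05FF) -->
&#x05E9;&#x05DC;&#x05D5;&#x05DD;

<!-- Russian "mir": character references from the Cyrillic block (0400 - 04FF) -->
&#x043C;&#x0438;&#x0440;
```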

Punctuation

Standard keyboard punctuation Phrase-level Features: Punctuation
Description Most common punctuation characters can and should be represented using their normal keyboard characters.
Remarks
  • exclamation point — !
  • dollar sign — $
  • percent sign — %
  • asterisk — *
  • opening and closing parentheses — ( )
  • hyphen — -
  • opening and closing square brackets — [ ]
  • opening and closing braces — { }
  • colon — :
  • semicolon — ;
  • double quotation mark — "
  • single quotation mark and apostrophe — '
  • comma — ,
  • period — .
  • solidus (forward slash) — /
  • question mark — ?
Enforcement
Machine-enforceable: no
Use of character entities Phrase-level Features: Punctuation
Description Marks of punctuation other than those listed above must be represented using their standard character entities.
Remarks
  • ampersand — &amp; (as required by XML)
  • vertical bar — &verbar; (since the vendor is required to use the vertical bar character ( | ) to represent line breaks; see Line Breaks)
  • em dash — &mdash; (long dash)
  • en dash — &ndash; (medium-length dash, often used to indicate a range, for example: 1783&ndash;1804)
Other marks of punctuation are available in the iso-num.ent, iso-pub.ent, and iso-tech.ent character entity sets. See Special Characters.
Enforcement
Machine-enforceable: no
Ellipses Phrase-level Features: Punctuation
Description An ellipsis — a series of dots or asterisks indicating deliberately omitted text — should be indicated by a series of keyboard-character periods or asterisks. Simply use the same number of periods or asterisks used in the print source.
Enforcement
Machine-enforceable: no
Long spaces Phrase-level Features: Punctuation
Description If the print source contains an exceptionally long space that needs to be preserved (for example, to indicate a word deliberately omitted by the author), use a series of &emsp; (em space) entities.
Enforcement
Machine-enforceable: no
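A minimal sketch of a preserved long space follows. The sentence is invented, and the number of &emsp; entities needed to represent a given printed space is a judgment call not specified here.

```xml
<!-- the author deliberately omitted the name; the blank is kept as em spaces -->
<p>I dined that evening with Mr. &emsp;&emsp;&emsp;&emsp; of Boston.</p>
```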
Spacing between sentences Phrase-level Features: Punctuation
Description Use one space character between sentences, not two, regardless of the apparent spacing in the print source.
Enforcement
Machine-enforceable: no
Reason Saves keystrokes. Multiple spaces will be collapsed to a single space by web browsers at the point of delivery anyway.

Dates

Date values Phrase-level Features: Dates
Description On <date>, the value attribute is required. On <dateRange>, the to and from attributes are required.
Enforcement
Machine-enforceable: yes
Method: DTD
Reason Provides a consistent, machine-readable way of accessing date values. Facilitates delivery, especially as regards the ability to perform full-text searches for specific dates or date ranges.
Standardized date formats Phrase-level Features: Dates
Description The values of attributes where the datatype is %ISO-date; — namely value on <date>, and to and from on <dateRange> — must be in one of the machine-readable formats listed below.
Remarks Formats:
  • YYYY-MM-DD
  • YYYY-MM
  • YYYY
  • -MM-DD
  • --DD

BC/BCE dates are represented in the same way, but prefixed with a "B".

<date value="B233" type="death">d. 233 B.C.</date>
Example
<date value="1901-08-04">4 Aug., 1901</date>
<date value="1870-03">March 1870</date>
<date value="2003">2003</date>
<date value="-01-20">Jan. 20th</date>
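The examples above do not show <dateRange> or the day-only format. A sketch of both follows; the transcribed text is invented for illustration.

```xml
<!-- a range of years: from and to are both required -->
<dateRange from="1783" to="1804">1783&ndash;1804</dateRange>

<!-- a range given to the month -->
<dateRange from="1861-04" to="1865-04">April 1861 to April 1865</dateRange>

<!-- a day of the month with no month or year stated (--DD) -->
<date value="--04">the 4th</date>
```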
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_dates
Message type: error
Reason Provides a consistent, machine-readable way of accessing date values. Facilitates delivery, especially as regards the ability to perform full-text searches for specific dates or date ranges.

Other Phrase-level Features

Mathematical formulas Phrase-level Features: Other Phrase-level Features
Description Short mathematical or scientific formulas should be represented textually using the <formula> element. For long or complex formulas, omit the textual representation and instead represent the formula as a graphic using the <figure> element.
Remarks Within <formula>, mark superscript or subscript text in the usual way. Include the required notation attribute with the value "informal".

When using <figure> to represent a long formula, if the formula is printed within the flow of the surrounding text, use rend="inline". If the formula is set off by line breaks, use rend="block".

Example page image
knew not enough mathematics even to figure a formula beyond <lb/>
the schoolboy <formula notation="informal">s=gt<hi rend="super">2</hi> / 2</formula>. If Kepler and Newton could take liberties <lb/>
with the sun and moon, an obscure person in a remote wilderness <lb/>
like La Fayette Square could take liberties with Congress, and <lb/>
venture to multiply half its attraction into the square of its time. <lb/>

Although the formula shown here is short and simple enough to be represented using <formula>, if it were a long or complex formula requiring <figure>, the markup would be as follows:

knew not enough mathematics even to figure a formula beyond <lb/>
the schoolboy <figure rend="inline"/>. If Kepler and Newton could take liberties <lb/>
with the sun and moon, an obscure person in a remote wilderness <lb/>
like La Fayette Square could take liberties with Congress, and <lb/>
venture to multiply half its attraction into the square of its time. <lb/>
Enforcement
Machine-enforceable: no
Reason The goal here is simply to provide a mechanism for informal textual representation of the occasional mathematical formula in prose humanities texts. The textual representation is human-readable, the flow of the narrative is not interrupted, and there is no need to refer the reader to the page image (which would otherwise be necessary, because DLPS does not provide cropped, bitonal figure images for inline display).

Reference Systems

Page Breaks

Marks beginning of page Reference Systems: Page Breaks
Description The <pb/> element marks the point at which a page begins, and so the <pb/> element should always be placed at the top or beginning of the markup for that page, regardless of the position of the printed page number in the print source.
Enforcement
Machine-enforceable: no
Page numbers Reference Systems: Page Breaks
Description If the page contains a printed page number, record it in the n attribute; if not, do not include the n attribute.
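As a minimal sketch of the two cases (the page number 17 is arbitrary and chosen only for illustration):

```xml
<!-- page with a printed page number: record it in n -->
<pb n="17"/>
<!-- ... -->
<!-- page with no printed page number: omit the n attribute entirely -->
<pb/>
```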
Enforcement
Machine-enforceable: no
Page number corrections Reference Systems: Page Breaks
Description In cases where it is necessary to correct a page number (or add any other markup to a page number), insert an <fw> (form work) element within the <pb> element, and use <corr> as usual to make the correction. Put the correct number in the n attribute.
Example
<pb n="242"/>
<!-- ... -->
<pb n="243"><fw type="pageno"><corr resp="gpm2a" sic="242">243</corr></fw></pb>
<!-- ... -->
<pb n="244"/>
Enforcement
Machine-enforceable: no
Reason We do not consider the n attribute on <pb> to be a transcriptional space; n is actually just a label for display. The n attribute is the most convenient and traditional place to record the page number, but in cases where tagging is needed around a page number, it is not possible in XML to add that tagging within an attribute value. For these reasons, we have opted to allow <fw> within <pb>. This approach is comparable to the distinction between printed note reference symbols and the n attribute on <note>. See Note symbols.
See also Corrections
Running page headers
Always within a div Reference Systems: Page Breaks
Description Page breaks must be placed within a <divN> element, never between divisions. Therefore, when a division starts on a new page, the <pb> is the first element in the division, immediately following the opening <divN> tag (preceding even the division <head>, if there is one).
Example
</div2>
<div2 type="chapter" n="II">
<pb/>
<head>II&mdash;APPELLATIONS.</head>
Enforcement
Machine-enforceable: no
Comments: The DTD disallows <pb> outside a div, but there is no way for the DTD to control exact placement of those <pb> elements within the div.
Exceptions The exception to this rule is newspapers, where the use of <pb/> is reversed: <pb/> is required to occur outside of any <divN> (<pb/> is allowed only within <body>). This practice simply fits better with the nature of printed newspapers, where each page contains large amounts of text divided into numerous different divs, and where a div almost never continues to the next page uninterrupted.
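A rough sketch of the newspaper arrangement follows. The division level, the type value "article", and the headings are assumptions made for illustration; they are not taken from the DLPS vocabulary for newspapers.

```xml
<body>
<pb n="1"/>
<!-- type="article" is a hypothetical value, used here only for illustration -->
<div1 type="article">
<head>FIRST ARTICLE ON PAGE 1</head>
<p><!-- ... --></p>
</div1>
<div1 type="article">
<head>SECOND ARTICLE ON PAGE 1</head>
<p><!-- ... --></p>
</div1>
<pb n="2"/>
<div1 type="article">
<head>FIRST ARTICLE ON PAGE 2</head>
<p><!-- ... --></p>
</div1>
</body>
```

Here each <pb/> is a direct child of <body> and falls between divisions, which is the reverse of the placement required for other materials.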
Blank pages Reference Systems: Page Breaks
Description There must be one <pb> element for every page in the set of page images for the work. This is true even for pages that have no textual or graphical content. If a blank page occurs between divs, place the blank page’s <pb> element at the end of the preceding div (as its last element), not at the start of the new div.
Example
<!-- end of last chapter --></p>
<pb/> <!-- blank page between last chapter and bibliography -->
</div1>
</body>
<back>
<div1 type="bibliography">
<pb/>
<head>BIBLIOGRAPHY</head>
Enforcement
Machine-enforceable: no
Running page headers Reference Systems: Page Breaks
Description Normally running page headers should be excluded from the electronic text, but in some cases (namely, when the running headers contain unique content not already available elsewhere in the transcription, such as a summary of that page’s content) DLPS will specifically require that the running headers be preserved. To encode the running headers, use <fw type="header"> within <pb>.
Example
<pb n="99"><fw type="header">APPEAL TO CHURCHES OF MASSACHUSETTS</fw></pb>
Enforcement
Machine-enforceable: no

Column Breaks

Use of <cb/> Reference Systems: Column Breaks
Description If the print source has a single-column layout, it is not necessary to mark the column at all. For materials with multiple columns, use <cb/> to mark the beginning of each column on each page.
Enforcement
Machine-enforceable: no
Marks beginning of column Reference Systems: Column Breaks
Description Like <pb/> (page break), which always marks the beginning of the page, <cb/> should always mark the top or beginning of the column of text.
Enforcement
Machine-enforceable: no
See also Marks beginning of page
n attribute Reference Systems: Column Breaks
Description Use the n attribute on <cb/> to record the number of the column on the page. For example, if each page contains two columns, the first (left) column on each page is <cb n="1"/>, and the second (right) column is <cb n="2"/>.
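A minimal sketch of a two-column layout across two pages (the page numbers are arbitrary, and the column text is replaced by comments):

```xml
<pb n="12"/>
<cb n="1"/>
<!-- text of the first (left) column on page 12 -->
<cb n="2"/>
<!-- text of the second (right) column on page 12 -->
<pb n="13"/>
<cb n="1"/>
<!-- column numbering starts again at 1 on each page -->
```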
Enforcement
Machine-enforceable: no
Mixed column layouts Reference Systems: Column Breaks
Description In cases where the number of columns changes mid-page, use the <cols/> element to indicate the point at which the number of columns changes. Use the n attribute to indicate the number of columns in the section that follows the <cols/> tag.
Remarks For example, if the page layout shifts from single-column to double-column in the midst of the page, use <cols n="2"/> to indicate the point at which double-column layout begins (and then use <cb n="1"/> and <cb n="2"/> to mark the columns, as usual). At the point where the layout shifts back to single-column text, use <cols n="1"/> (after which no <cb/> elements are necessary, since the layout is single-column).
NOTE

A division <head> followed by a multi-column layout does not indicate a mixed-column layout and does not require <cols n="..."/>.
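The sequence described in the remarks above might be sketched as follows. This is an illustration only: the page number is arbitrary, and the text of each section is replaced by a comment.

```xml
<pb n="30"/>
<!-- single-column text at the top of the page: no cb elements needed -->
<cols n="2"/>
<cb n="1"/>
<!-- text of the left column -->
<cb n="2"/>
<!-- text of the right column -->
<cols n="1"/>
<!-- single-column text resumes: no cb elements needed -->
```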

Example page image and example markup
Enforcement
Machine-enforceable: no

Line Breaks

Line breaks Reference Systems: Line Breaks
Description Line breaks in running prose should be preserved in the electronic transcription by marking the end of each printed line with <lb/>.
Enforcement
Machine-enforceable: no
See also Line breaks in verse

Special Considerations

Gaps and Uncertainties

Use of <gap/> Special Considerations: Gaps and Uncertainties
Description Use <gap/> to mark any section (character, word, passage, page, etc.) that is being omitted from the transcription.
Remarks There are two reasons for such omissions: the section is missing (as happens with torn or missing pages), or it has been excluded deliberately for editorial reasons. In particular, a block of non-Western characters in a language outside the vendor’s capabilities should be marked as a <gap/>. Use one <gap/> element for each unbroken section of content that is being excluded from the electronic transcription.
Example Example of use of <gap/> for non-Western characters
Enforcement
Machine-enforceable: no
See also Including all content
desc and reason attributes on <gap/> Special Considerations: Gaps and Uncertainties
Description The desc (description) and reason attributes are required on <gap/>.
Remarks The value of desc should be a brief description of the content being omitted from the transcription. The reason attribute accepts these values:
  • editorial — the section is omitted deliberately for editorial reasons (for example, non-Western characters)
  • damage — the section is omitted because of damage to the physical page (for example, a torn page)
  • missing — the section is missing entirely (for example, a missing page)
  • other — indicate the reason in the “other” attribute, as in: <gap desc="..." reason="other" other="..."/>
Example
<gap desc="Chinese characters" reason="editorial"/>
<gap desc="page 43, line 17 to end of page" reason="damage"/>
Enforcement
Machine-enforceable: yes
Method: DTD
Reason Explicitness
Use of <unclear> Special Considerations: Gaps and Uncertainties
Description Use <unclear> to mark passages that cannot be transcribed with certainty, as happens when a letter/word/phrase is physically present on the page but is unreadable (due to a printing error, physical damage to the page such as readers’ marks, or a bad scan).
Remarks When working with words or phrases marked by the vendor as <unclear>, follow these guidelines:
  • If a word marked as <unclear> by the vendor is actually legible (as can happen when a bad page image is rescanned and replaced only after the page images have shipped to the vendor), simply supply the characters necessary to complete the word and remove the <unclear> start-tag and </unclear> end-tag.
  • If the word is unclear but a reasonable supposition can be made as to the intended word, supply the characters necessary to complete the word, but leave the <unclear> and </unclear> markup in place.
  • If the word is so illegible that no reasonable supposition can be made, remove the entire illegible word/phrase and replace it with an empty <unclear/> element.
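The three outcomes can be sketched as follows. The sentence and the form of the vendor's markup are hypothetical, invented only to show the before and after states.

```xml
<!-- as delivered by the vendor (hypothetical): part of the word could not be read -->
the old <unclear>hou</unclear> stood empty <lb/>

<!-- case 1: the word is legible on the page image; complete it and remove the tags -->
the old house stood empty <lb/>

<!-- case 2: a reasonable supposition can be made; complete the word, keep the tags -->
the old <unclear>house</unclear> stood empty <lb/>

<!-- case 3: no reasonable supposition is possible; replace with an empty element -->
the old <unclear/> stood empty <lb/>
```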
Enforcement
Machine-enforceable: no

Arbitrary Sections

Arbitrary sections Special Considerations: Arbitrary Sections
Description When none of the standard TEI elements is appropriate for a particular textual feature, use <ab> if the feature is a block element or <seg> if the feature is within a containing block element.
Remarks In texts with complex structure or layout, the encoder is likely to encounter block-level sections or phrase-level passages that are difficult to fit into any of the standard TEI elements. In such cases, it may be best to take advantage of TEI’s elements for arbitrary sections:
  • <ab> — (anonymous block) occurs at the block level (at same level as <p>, <table>, <list>, etc.)
  • <seg> — (segment) occurs at the phrase level (within <p>, <cell>, <item>, etc.)

Both of these elements accept the type attribute with any value (no predefined vocabulary).

Although these elements should be used sparingly, they are very useful when genuinely needed.

IMPORTANT

It is better to use <ab> or <seg>, when appropriate, than to inject inappropriate markup — such as <divN> elements that do not truly reflect the major structural divisions of the work, or <p> elements that are not really paragraphs — for the sake of “making it parse.”

If a work contains a particularly problematic feature for which the preferred encoding is not clear, ask DLPS for further guidance.
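As a rough illustration of the two levels, the fragment below uses invented text and invented type values ("price" and "slogan"). Since type has no predefined vocabulary, these values are assumptions, not DLPS recommendations.

```xml
<!-- block level: sits alongside p, table, list -->
<ab type="price">PRICE TWO CENTS.</ab>
<!-- phrase level: sits inside a containing block such as p -->
<p>The paper printed <seg type="slogan">ALL THE NEWS</seg> beneath its title. <lb/></p>
```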

Enforcement
Machine-enforceable: no

Special Characters

Character-entity sets Special Considerations: Special Characters
Description Character-entity sets are not declared in the external DTD. Instead, you will need to declare and invoke any entity sets required by a given document in the document’s internal subset.
Remarks Although the character-entity files can be invoked using disk paths, we prefer the use of URLs, so that the document points to the latest version of the DTD and character-entity files.
<!DOCTYPE TEI.2 SYSTEM "http://text.lib.virginia.edu/dtd/tei/tei-p4/tei2.dtd" [
<!ENTITY % TEI.extensions.ent SYSTEM "http://text.lib.virginia.edu/dtd/tei/uva-dl-tei/uva-dl-tei.ent">
<!ENTITY % TEI.extensions.dtd SYSTEM "http://text.lib.virginia.edu/dtd/tei/uva-dl-tei/uva-dl-tei.dtd">

<!ENTITY % ISOlat1 SYSTEM "http://text.lib.virginia.edu/charent/iso-lat1.ent"> %ISOlat1;
<!ENTITY % ISOlat2 SYSTEM "http://text.lib.virginia.edu/charent/iso-lat2.ent"> %ISOlat2;
<!ENTITY % ISOnum  SYSTEM "http://text.lib.virginia.edu/charent/iso-num.ent">  %ISOnum;
<!ENTITY % ISOpub  SYSTEM "http://text.lib.virginia.edu/charent/iso-pub.ent">  %ISOpub;
<!ENTITY % ISOtech SYSTEM "http://text.lib.virginia.edu/charent/iso-tech.ent"> %ISOtech;
]>

The usual ISO 8879 (SGML) entity sets are included with the DLPS DTD files. Please do not use your own local versions of the ISO 8879 entity sets, as our versions include corrections, as well as a supplementary set containing characters not available in the standard sets (uva-supp.ent).

Enforcement
Machine-enforceable: no
Characters not in a standard entity set Special Considerations: Special Characters
Description Use the named (mnemonic) entities from the DLPS-supplied entity sets whenever possible, but if a particular character is not in one of the DLPS-supplied entity sets, declare a new character entity that resolves either to the appropriate Unicode character, or to a combination of a standard character and a Unicode combining diacritic.
Remarks If a character is not available in any of the DLPS-supplied entity sets (don’t forget to check uva-supp.ent as well as the ISO 8879 sets), it may nevertheless be available as a Unicode character. In such cases, identify the correct Unicode character and declare it as an entity with an appropriate human-readable name.

In some cases, a particular character is not available in Unicode. It is usually possible, however, to create the needed character by combining a standard character with one of the Unicode combining diacritics (block 0300–036F of Unicode).
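A minimal sketch of both kinds of declaration in the internal subset follows. The entity names schwa and mmacr are hypothetical, and the example assumes, purely for illustration, that neither character is available in a DLPS-supplied entity set.

```xml
<!DOCTYPE TEI.2 SYSTEM "http://text.lib.virginia.edu/dtd/tei/tei-p4/tei2.dtd" [
<!-- ... DTD extension and entity-set declarations as usual ... -->

<!-- a character available directly in Unicode:
     U+0259, Latin small letter schwa -->
<!ENTITY schwa "&#x0259;">

<!-- a character built from a standard letter plus a combining diacritic:
     m followed by U+0304, combining macron -->
<!ENTITY mmacr "m&#x0304;">
]>
```

The transcription can then refer to these characters as &schwa; and &mmacr; in the same way as the named entities from the supplied sets.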

Enforcement
Machine-enforceable: no

Corrections

Adding to the file’s revision history Corrections
Description Before undertaking any significant markup correction or enhancement, add a <change> entry to the file’s <revisionDesc> (the last major element within <teiHeader>).
Remarks In the <name> element, include your first and last name (not just initials). Include the id attribute, using your UVa computing ID (which can then be used for the resp attribute on <corr> when making corrections; see resp attribute on <corr>).
Example
</profileDesc>
<revisionDesc>
<change>
<date value="2005-03-08">March 8, 2005</date>
<respStmt>
<name id="gpm2a">Greg Murray (Digital Library Production Services,
University of Virginia Library)</name>
<resp>corrector</resp>
</respStmt>
<item><p>Converted markup to validate against the DLPS DTD (uva-dl-tei)
in its post-keyboarding mode. Overhauled div structures. Corrected
note-related problems when encountered. Corrected minor printing
errors.</p></item>
</change>
</revisionDesc>
</teiHeader>
Enforcement
Machine-enforceable: no
Use of <corr> Corrections
Description If the print source contains a blatant error, the error can and should be corrected in the electronic text. When making corrections, always use the <corr> element to mark the content that has been changed.
Enforcement
Machine-enforceable: no
sic attribute on <corr> Corrections
Description Whenever possible, include the sic attribute on <corr>; its value should be the original, uncorrected word/phrase from the print source.
Remarks Normally the only time sic should be missing is when correcting an error of omission, in which case there is nothing to record in the sic attribute.
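A sketch of an error of omission, where there is no original reading to record in sic. The sentence is invented for illustration, and the resp value simply reuses the computing ID that appears in this document's other examples.

```xml
<!-- the print source reads "went to market"; the word "the" was left out in printing -->
went to <corr resp="gpm2a">the</corr> market <lb/>
```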
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_corr
Message type: warning
resp attribute on <corr> Corrections
Description When making corrections, include the resp (responsibility) attribute on <corr>. Its value should correspond to the id attribute of a <name> element within a <respStmt> in the <teiHeader>.
Remarks The typical scenario is to add an entry to the revision history for the file in teiHeader/revisionDesc (see Adding to the file’s revision history), then use resp to record responsibility for corrections.
Example
We have no <corr sic="trustworty" resp="gpm2a">trustworthy</corr> statistics <lb/>
Enforcement
Machine-enforceable: yes
Method: program
Name: qa_lib_corr
Message type: warning
Comments: The DTD does not require resp on <corr>, because in cases of conversion/migration of existing markup where the corrector is unknown, it is not always possible to supply a useful value for resp. Instead, the QA program issues a warning if resp is missing. The program also verifies that the resp value corresponds to the id attribute of an element within <teiHeader>. (This test goes beyond the DTD, which specifies only that resp is an IDREF and so must point to an ID somewhere within the XML document. The test is in fact performed for every element that has a resp attribute, not just <corr>.)

Enforcement Programs

DTD

DTD name: The DLPS DTD (uva-dl-tei)
DTD files:
Filename: tei2.dtd
Description: TEI P4 main (driver) file; all other files in the TEI DTD distribution are also required
Disk path: /cenrepo/dtd/tei/tei-p4/tei2.dtd
URL: http://text.lib.virginia.edu/dtd/tei/tei-p4/tei2.dtd

Filename: uva-dl-tei.ent
Description: DLPS TEI modification file #1 of 2: entity declarations
Disk path: /cenrepo/dtd/tei/uva-dl-tei/uva-dl-tei.ent
URL: http://text.lib.virginia.edu/dtd/tei/uva-dl-tei/uva-dl-tei.ent

Filename: uva-dl-tei.dtd
Description: DLPS TEI modification file #2 of 2: element declarations
Disk path: /cenrepo/dtd/tei/uva-dl-tei/uva-dl-tei.dtd
URL: http://text.lib.virginia.edu/dtd/tei/uva-dl-tei/uva-dl-tei.dtd

Programs

char2ent:
Filename: char2ent
Type: processing
Language: Perl
Description: Converts non-ASCII characters to standard mnemonic character entities, when available
Disk path: /dlps_work/bin/char2ent
notes_n:
Filename: notes_n
Type: processing
Language: Perl
Description: Adds or updates the n attribute on <note>, <ref>, and <ptr/> elements.
Disk path: /dlps_work/bin/notes_n
replace_xml_decl:
Filename: replace_xml_decl
Type: processing
Language: Perl
Description: Replaces XML declaration
Disk path: /dlps_work/bin/replace_xml_decl
qa_dates:
Filename: qa_dates
Type: QA
Language: Perl
Description: QA program for standardized date values
Disk path: /dlps_work/bin/qa_dates
qa_figures:
Filename: qa_figures
Type: QA
Language: Perl
Description: QA program for <figure> elements
Disk path: /dlps_work/bin/qa_figures
qa_lib_corr:
Filename: qa_lib_corr.xsl
Type: QA
Language: XSLT
Description: QA program for corrections, additions, and deletions
Disk path: /cenrepo/bin/cgi-dl/dlps/xsl/qa_lib_corr.xsl
URL: http://text.lib.virginia.edu/bin/cgi-dl/dlps/markupQA/
qa_lib_foreign:
Filename: qa_lib_foreign.xsl
Type: QA
Language: XSLT
Description: QA program for foreign phrases and the global lang attribute
Disk path: /cenrepo/bin/cgi-dl/dlps/xsl/qa_lib_foreign.xsl
URL: http://text.lib.virginia.edu/bin/cgi-dl/dlps/markupQA/
qa_lib_misc:
Filename: qa_lib_misc.xsl
Type: QA
Language: XSLT
Description: QA program for miscellaneous requirements not handled by the other qa_lib_* stylesheets
Disk path: /cenrepo/bin/cgi-dl/dlps/xsl/qa_lib_misc.xsl
URL: http://text.lib.virginia.edu/bin/cgi-dl/dlps/markupQA/
qa_lib_notes:
Filename: qa_lib_notes.xsl
Type: QA
Language: XSLT
Description: QA program for TEI notes and note references
Disk path: /cenrepo/bin/cgi-dl/dlps/xsl/qa_lib_notes.xsl
URL: http://text.lib.virginia.edu/bin/cgi-dl/dlps/markupQA/
qa_lib_structure:
Filename: qa_lib_structure.xsl
Type: QA
Language: XSLT
Description: QA program for TEI document structure
Disk path: /cenrepo/bin/cgi-dl/dlps/xsl/qa_lib_structure.xsl
URL: http://text.lib.virginia.edu/bin/cgi-dl/dlps/markupQA/
qa_xml:
Filename: qa_xml
Type: QA
Language: Perl
Description: QA program for XML features
Disk path: /dlps_work/bin/qa_xml

About This Documentation

Maintained by: Greg Murray (gpm2a at virginia dot edu), DLPS
Overview: The XML source for this documentation describes the local markup practices for TEI-encoded electronic texts followed by Digital Library Production Services (DLPS), University of Virginia Library. It contains three types of markup practices:
  • those applicable only to keyboarding/encoding vendors (“vendor” practices)
  • those applicable only after a text has been received from a vendor (“postkb” practices)
  • those applicable to both (“global” practices)

The documentation is intended to be helpful in three main ways:

  • as a set of encoding guidelines for keyboarding/encoding vendors who produce TEI texts for DLPS
  • as a reference for DLPS staff when working on QA/correction, markup enhancement, migration, etc. of TEI texts
  • as a reference for authors of delivery systems for the digital library, when creating delivery mechanisms for TEI texts produced by DLPS
Last modified: Monday, 03-Aug-2009 15:34:13 EDT
Revision history:
Date: August 2005
Role: author
Name: Greg Murray, DLPS
Change: Produced first published version. Documented about 100 markup practices in 8 main categories and about 30 subcategories.

Date: February 24, 2006
Role: corrector
Name: Greg Murray, DLPS
Change: Minor corrections and enhancements to existing practices.

Date: July 1, 2008
Role: corrector
Name: Greg Murray
Change: Minor changes and additions.