Typography, pagination, lists, tables, notes, annotations

Guidelines for SGML Text Mark-up at the Electronic Text Center
David Seaman, Electronic Text Center, University of Virginia

A far more extensive list of tags can be found in the Text Encoding Initiative's Guidelines for Electronic Text Encoding and Interchange.

Typographic Tags

The TEI way of marking typography is as follows:

  • <hi rend="italics">italics </hi>
  • <hi rend="bold">boldface </hi>
  • <hi rend="underscore">underline</hi>
  • <hi rend="smallcaps">SMALL CAPS </hi>
  • <hi rend="superscript">superscript</hi>
  • <hi rend="subscript">subscript</hi>
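
As a minimal sketch of how the typographic tags above might sit inside running text, the paragraph below wraps individual words in <hi>. The sentence itself is invented for illustration and is not taken from any Electronic Text Center text.

```xml
<!-- Hypothetical sentence: only the tagged words change typeface -->
<p>The word <hi rend="italics">Preface</hi> is printed in italics,
and H<hi rend="subscript">2</hi>O carries a subscript.</p>
```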

Linguistic Emphasis

TEI allows one to mark linguistic emphasis as distinct from a simple change in typeface, using the <emph> tag.

  • <emph rend="italics"> italics </emph>
  • <emph rend="bold"> boldface </emph>


<q><emph rend="italics">thousands</emph> of electronic texts.</q>

NOTE: Under the Electronic Text Center's standards, <emph> is usually not used, even for linguistic emphasis. Typically, <hi> is used in place of <emph>.

Paragraphs, quotations, page-breaks, epigraphs, and line groups

  • <p> </p> : paragraph. This tag pair sets off text that appears in paragraph form: prose description, dialogue, etc.

  • <q> </q> : quotation. This tag pair sets off a quotation.
    • A quotation is different from dialogue. The <q> </q> tags are used only for direct citations from other textual sources.
    • Example:
      <p>...though he was not a great adept in Latin, he remembered, and well understood, the advice contained in these words:
      <q>Leve fit quod bene fertur onus:</q>
      In English,
      <q>A burden becomes lightest when it is well borne;</q></p>

  • <pb n="1" /> : page-break. This tag marks the point in the document where a page-break should appear.
    • A page-break usually takes an n attribute whose value is the number of the corresponding page in the source text.
    • If pages in the text are not numbered, as is usually the case with much of the front-matter, simply divide pages with <pb />.
    • A page-break does not take a closing tag. See the note on the <pb /> tag in the Empty Tags section below.

  • <epigraph> : tags epigraphs that occur in the body of a text.
    • <cit> </cit> : The citation. This opening <cit> must immediately follow the opening <epigraph> tag and the <cit> must be closed immediately before the <epigraph> is closed.
    • <bibl> </bibl> : The bibliographic reference. This pair of tags must open and close around the title and/or the author of the text from which the epigraph comes. If there is no bibliographic reference, the pair must still be used; it will simply be empty.
    • For a good example of how to encode epigraphs, see James Branch Cabell's The Certain Hour.

  • <lg> </lg> : line groups. Identifies a group of lines, such as a stanza or a sonnet.
    • Each line within a line group must be tagged <l>, which is often given a line-number attribute: <l n="14">.
    • For a good example of how to tag a few stanzas in poetry, see Poe's Annabel Lee.
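
The epigraph and line-group rules above can be sketched as follows. This is an illustration under stated assumptions, not the Center's own encoding of any text: the epigraph wording is placeholder text, and placing the quoted words in <q> inside <cit> is an assumption based on the description of <q> above. The verse lines are the opening of Poe's Annabel Lee, numbered here only to show the n attribute.

```xml
<!-- Epigraph: <cit> opens right after <epigraph> and closes right before </epigraph> -->
<epigraph>
<cit>
<!-- Assumption: the quoted text sits in <q>; the wording is a placeholder -->
<q>Text of the epigraph goes here.</q>
<!-- <bibl> is required even when empty -->
<bibl>Author or title of the source goes here</bibl>
</cit>
</epigraph>

<!-- Line group: one <l> per line of verse, each with a line number -->
<lg>
<l n="1">It was many and many a year ago,</l>
<l n="2">In a kingdom by the sea,</l>
<l n="3">That a maiden there lived whom you may know</l>
<l n="4">By the name of Annabel Lee;</l>
</lg>
```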

Tables and Lists

  • Tables
    <table> </table> identifies a sequence of data that needs to be organized into specific rows and columns.
    • A table must be wrapped in paragraph (<p> </p>) tags.
    • <row> </row> identifies a row in a table.
    • <cell> </cell> identifies a single cell within that row.
    • For a good example of how to tag a table, see The 1893 World's Fair.
  • Lists
    <list> </list> identifies a sequence of items organized as a list.
    • Each item within the list is encoded with an <item> tag which often has the attribute n="".
    • For a good example of how to tag a list, see Sara Cone Bryant's How to Tell Stories to Children.
    • For a good example of how to tag an advertisement as a list, see Eleanor H. Porter's Miss Billy--Married.
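
A minimal sketch of the table and list tags described above. The cell and item contents are invented placeholders; only the element names, the <p> wrapper around the table, and the n attribute on <item> come from the rules in this section.

```xml
<!-- A table must be wrapped in <p> </p> -->
<p>
<table>
<row><cell>Item</cell><cell>Quantity</cell></row>
<row><cell>Apples</cell><cell>12</cell></row>
</table>
</p>

<!-- A list: each entry is an <item>, here numbered with n -->
<list>
<item n="1">First item</item>
<item n="2">Second item</item>
</list>
```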

Notes & Annotations

Many scholarly works contain annotations that require marking. Wherever possible, the body of each note should be placed as follows:

  • For prose, verse, and drama, list the notes at the end of the structural division in which their <note target=" "> counterparts occur. Do not place <note id=" "> elements at the end of each page; group them ALL together just before the close of their common division.

<note> : contains a note or annotation. Attributes include:

  • "target": used in the text body to mark the place that refers to the footnote or annotation. The "target" value must be identical to the "id" value of its corresponding note so that the two point to one another. Both must begin with an alphabetic character and may contain numbers, a dash, or a period.
    <p>"Well, that's Barnum's.<note target="n55">[55]</note>
  • "id": marks off the actual "note" information to which the <note target=" "> refers.
    <note id="n55">Since destroyed by fire, and rebuilt farther up Broadway, and again burned down in February.</note>
  • "n": the symbol or number used to mark the note's point of attachment to the main text.

Sample values indicating who wrote the note include:

  • au: note by the author of the text.
  • ed: note added by the editor of the text.

See Horatio Barber's The Aeroplane Speaks and Enrico Ferri's Criminal Sociology to see two different ways of tagging notes.
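
The placement rule for notes can be sketched as below: each reference in the text carries a target, and the note bodies are gathered just before the close of the division. The <div1> wrapper, the identifiers n1 and n2, and all of the wording are assumptions made for illustration; this section does not itself name the division element.

```xml
<!-- Assumption: <div1> stands in for whatever structural division holds the text -->
<div1>
<p>A sentence that needs a note.<note target="n1">[1]</note></p>
<p>A later sentence with a second note.<note target="n2">[2]</note></p>

<!-- Note bodies listed together before the division closes;
     each id matches the target used in the text above -->
<note id="n1">Body of the first note.</note>
<note id="n2">Body of the second note.</note>
</div1>
```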

Empty Tags

Tags typically come in pairs, but some -- called empty tags -- are single markers. The most common "empty tags" you will see are:

  • The prose line-break: <lb />
  • The page-break: <pb />

Other less-common empty tags are:

  • The milestone: <milestone />
  • The external pointer: <xptr />
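
As a short sketch of the two most common empty tags in use, the fragment below marks a page-break before a paragraph and a line-break inside it. The page number and the wording are hypothetical.

```xml
<!-- Empty tags stand alone: there is no </pb> or </lb> -->
<pb n="12" />
<p>The first line of a prose passage as printed<lb />
and the second line as printed.</p>
```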

