About Standard Generalized Markup Language (SGML)

Web pages are typically encoded with a set of tags called Hypertext Markup Language, or HTML; SGML is the parent language -- the tag-set building rules -- for HTML and for most other descriptive tag-sets.

SGML texts consist of plain ASCII text combined with items in angle brackets, e.g. <title>A Christmas Carol</title>. For most digital library or digital publishing SGML content, the texts are encoded not in HTML but in more powerful and descriptive tag-sets, such as those defined by the Text Encoding Initiative Guidelines. HTML versions are then created, often "on-the-fly", for web delivery.
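
As a rough illustration of "on-the-fly" HTML creation, the Python sketch below rewrites descriptive tags as HTML ones. The mapping table, the tag names, and the to_html function are assumptions made for this example only; a real delivery system would work from a DTD and a proper SGML parser rather than a regular expression.

```python
import re

# Hypothetical mapping from descriptive tags to HTML presentation tags.
TAG_MAP = {
    "title": "h1",
    "author": "h2",
    "stanza": "p",
}


def to_html(encoded: str) -> str:
    """Rewrite each known descriptive tag as its HTML counterpart."""

    def swap(match: re.Match) -> str:
        slash, name = match.group(1), match.group(2).lower()
        html_name = TAG_MAP.get(name)
        # Unknown tags are dropped; the text they enclose is kept.
        return f"<{slash}{html_name}>" if html_name else ""

    # Matches simple start and end tags without attributes.
    return re.sub(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)>", swap, encoded)


sample = "<title>A Christmas Carol</title><author>Charles Dickens</author>"
print(to_html(sample))
```

The descriptive source stays untouched; the HTML is a derived view that can be regenerated whenever the mapping changes.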

Tags allow one to create richly structured documents by designating -- encoding -- such information as structural divisions (titlepage, main body of text, scene, stanza, section, date, author, etc.) or by conveying information about renditional and typographical elements (changes in typeface, line breaks, etc.). Crucially, the tags are composed of plain-text ASCII characters, so no special software or proprietary binary code is necessary to create an SGML file. This fact both ensures long-term viability and makes the files easy to deliver across a network. Unlike, say, the WordPerfect code for italics, which is specific to that word processor and is typically lost when the text is transferred out of WordPerfect and into another format, SGML tags are simply other letters and characters typed in as part of the text, and they travel with the text if it moves from computer system to computer system.
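
Because the markup is ordinary text, even general-purpose tools can inspect it. The Python sketch below is one minimal way to show this, assuming a small invented fragment; the tag names (poem, author, stanza, line, hi) are illustrative and not taken from any particular DTD.

```python
import re

# A hypothetical fragment; the tag names are illustrative only.
encoded = (
    "<poem>\n"
    "<author>Anonymous</author>\n"
    "<stanza>\n"
    "<line>First line of <hi>verse</hi></line>\n"
    "</stanza>\n"
    "</poem>\n"
)

# The whole document, markup included, survives a round trip through ASCII.
as_bytes = encoded.encode("ascii")
assert as_bytes.decode("ascii") == encoded

# List the opening tags in document order (end tags start with "/").
open_tags = re.findall(r"<([A-Za-z][A-Za-z0-9.-]*)>", encoded)
print(open_tags)

# Strip the markup to recover the bare text.
plain_text = re.sub(r"<[^>]+>", "", encoded)
print(plain_text)
```

Nothing here depends on the software that created the file, which is the point of the contrast with word-processor formatting codes.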

Arriving quickly is XML (Extensible Markup Language), a parent language like SGML that is both derived from and compatible with it. XML is designed to be much easier to deliver on the Internet than SGML has proved to be, and much easier for software developers to implement.
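
One way to see the ease-of-implementation point is that many programming languages now ship an XML parser as standard. The sketch below uses Python's xml.etree.ElementTree module on an invented, well-formed fragment; the element names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed fragment; element names are illustrative only.
fragment = (
    "<text>"
    "<title>A Christmas Carol</title>"
    "<author>Charles Dickens</author>"
    "</text>"
)

# Parse the string into a tree of elements.
root = ET.fromstring(fragment)

# Look up one child's text and list all child element names in order.
title = root.findtext("title")
children = [child.tag for child in root]
print(title, children)
```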

David Seaman,
University of Virginia Electronic Text Center

General SGML Resources

  • The SGML/XML Web Page: Robin Cover's excellent guide to SGML and XML resources.

  • SGML Open Home Page. SGML Open is a non-profit, international consortium of providers of products and services, dedicated to accelerating the further adoption, application, and implementation of Standard Generalized Markup Language.

  • The OCLC Fred Home Page: A free automatic Document Type Definition (DTD) creation service.