This memo presents a technique for using XML to produce XHTML documents with automatically numbered sections and references, and an automatically generated table of contents.
Copyright © 2002 The MITRE Corporation. All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to The MITRE Corporation.
This document and the information contained herein is provided on an "AS IS" basis and THE MITRE CORPORATION DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
This memo explains how to create XHTML 1.1 [XHTML11] documents using the Extensible Markup Language [XML]. The input is a structured document that contains sections, references, appendixes, and figures. Each of these elements contains XHTML 1.1 fragments and cross references. An XHTML document is generated from the input using XSL Transformations [XSLT]. In the generated document, the sections, references, appendixes, and figures are numbered, a table of contents is generated automatically, and cross references are rendered as links.
The syntax of a SIMP Doc document is specified by the SIMP Doc DTD [DTD]. This DTD makes use of the modular structure of the XHTML 1.1 DTD. The SIMP Doc DTD imports definitions from the XHTML 1.1 DTD, so that fragments within sections, references, appendixes, and figures can be validated. When given a valid SIMP Doc document, the XSL transformer will produce a valid XHTML 1.1 document.
The SIMP Doc DTD and XSLT stylesheet were developed to ease the task of documenting the Simple Instant Messaging and Presence [SIMP] Service. They are being made available under the SIMP License, an Open Source license, in the hope that others will find them useful.
There are two ways to learn how to create a SIMP Doc source file. You can browse this memo as an XHTML document, or view the XML source that was used to produce it. I recommend using both methods side by side.
A SIMP Doc source file starts with the following prolog.
<?xml version="1.0"?>
<!DOCTYPE sd:simpdoc
PUBLIC "-//MITRE//DTD SIMP Doc 1.1//EN"
"http://simp.mitre.org/simpdoc/simpdoc11.dtd">
A SIMP Doc source file may contain an XML stylesheet processing instruction.
<?xml-stylesheet
href="http://simp.mitre.org/simpdoc/simpdoc11.xsl"
type="application/xslt+xml"?>
A table of contents will not be generated for documents that contain the following processing instruction.
<?simpdoc toc="no"?>
In general, a table of contents is generated for a document
that contains no SIMP Doc processing instruction, or whose SIMP Doc
processing instruction contains the string yes or true in the
text that follows toc=.
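As a rough illustration of that rule, the following Python sketch decides whether a table of contents would be generated for a given source string. It is not taken from the SIMP Doc stylesheet. The function name wants_toc, the regular expression, and the treatment of a SIMP Doc processing instruction that lacks toc= (no table of contents) are assumptions made for this example.

```python
import re

# Matches a SIMP Doc processing instruction such as <?simpdoc toc="no"?>.
# The pattern is an assumption of this sketch, not part of the SIMP Doc DTD.
SIMPDOC_PI = re.compile(r"<\?simpdoc\b(.*?)\?>", re.DOTALL)


def wants_toc(source):
    """Return True if a table of contents would be generated for source."""
    pi = SIMPDOC_PI.search(source)
    if pi is None:
        # No SIMP Doc processing instruction: the table of contents is generated.
        return True
    # Look at the text that follows "toc=" inside the processing instruction.
    _, found, rest = pi.group(1).partition("toc=")
    # Assumption: an instruction without toc= suppresses the table of contents.
    return bool(found) and ("yes" in rest or "true" in rest)


print(wants_toc('<?xml version="1.0"?><doc/>'))
print(wants_toc('<?simpdoc toc="no"?><doc/>'))
print(wants_toc('<?simpdoc toc="yes"?><doc/>'))
```

The three calls cover a source with no SIMP Doc processing instruction, one that switches the table of contents off, and one that switches it on explicitly.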
The section structure of this part of the document follows the
element content relations defined in the SIMP Doc DTD. That is, an
element is defined in a subsection of another element's definition
only if the former element can occur in the content of the latter
element. The elements in alphabetical order are:
sd:appendix,
sd:figure,
sd:front,
sd:head,
sd:reference,
sd:references,
sd:section,
sd:simpdoc,
sd:title, and
sd:xref.
The SIMP Doc root element is sd:simpdoc. All
elements introduced by the SIMP Doc DTD use the sd
namespace prefix.
<sd:simpdoc xmlns='http://www.w3.org/1999/xhtml' xmlns:sd='http://simp.mitre.org/simpdoc'>
  ... content ...
</sd:simpdoc>
The sd:simpdoc element contains, in order, at most one
sd:head element, an sd:front element, one or
more sd:section elements, any number of
sd:references elements, and finally any number of
sd:appendix elements.
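The ordering rule for the children of sd:simpdoc can be sketched as a small Python check. This is only an illustration; real validation is done against the SIMP Doc DTD. The function name children_in_order and the trick of matching a regular expression over the child element names are assumptions of this sketch.

```python
import re
import xml.etree.ElementTree as ET

SD = "{http://simp.mitre.org/simpdoc}"  # namespace used in the sd:simpdoc example

# head? front section+ references* appendix*, over space-terminated names.
ORDER = re.compile(r"(head )?front (section )+(references )*(appendix )*")


def children_in_order(root):
    """Return True if root is sd:simpdoc and its children follow the stated order."""
    # Children outside the SIMP Doc namespace are written as "?" and never match.
    names = "".join(
        child.tag[len(SD):] + " " if child.tag.startswith(SD) else "? "
        for child in root
    )
    return root.tag == SD + "simpdoc" and ORDER.fullmatch(names) is not None


GOOD = ET.fromstring(
    '<sd:simpdoc xmlns:sd="http://simp.mitre.org/simpdoc">'
    "<sd:front><sd:title>T</sd:title></sd:front>"
    '<sd:section title="One"/><sd:references/><sd:appendix title="A"/>'
    "</sd:simpdoc>"
)
BAD = ET.fromstring(
    '<sd:simpdoc xmlns:sd="http://simp.mitre.org/simpdoc">'
    '<sd:front/><sd:appendix title="A"/><sd:section title="One"/>'
    "</sd:simpdoc>"
)
print(children_in_order(GOOD), children_in_order(BAD))
```

The second document is rejected because its appendix appears before its only section.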
The sd:head element is used to insert
meta, link, and object elements
into the head element of the generated document.
<sd:head>
  <meta http-equiv="Content-type" content="text/html; charset=iso-8859-1" />
  <link rel="stylesheet" title="simpdoc" type="text/css" href="simpdoc.css"/>
</sd:head>
The sd:front element is used to insert front matter
into the generated document before the table of contents.
The sd:front element contains, in sequence, block
content as described in [Section 3.1.3], a required
sd:title element, and then more block content.
The sd:title element is used to provide the title of
the document.
The sd:section element is used to divide the body of a
document into sections. The element has a required title
attribute, and an optional id attribute. The
title attribute names the section, and when present, the
id attribute can be used as the target of a cross
reference via either a SIMP Doc sd:xref element [Section 3.1.3.2], or an XHTML a element.
The content of an sd:section element is block content
followed by any number of sd:section elements. The
sd:section elements within a section define the section's
subsections; that is, the section-subsection relationships are derived
from the parent-child relationships of the sd:section
elements.
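To make the parent-child rule concrete, here is a minimal Python sketch that derives section numbers from the nesting of sd:section elements. It is not the SIMP Doc stylesheet, which is written in XSLT. The function name number_sections, the sample titles, and the dotted number format (modeled on cross references such as [Section 3.1.3.2]) are assumptions of this example.

```python
import xml.etree.ElementTree as ET

SD = "{http://simp.mitre.org/simpdoc}"  # namespace used in the sd:simpdoc example


def number_sections(parent, prefix=""):
    """Return (number, title) pairs for the sd:section elements below parent."""
    numbered = []
    # findall with a plain tag name visits direct children only, in document order.
    for index, section in enumerate(parent.findall(SD + "section"), start=1):
        number = f"{prefix}{index}"
        numbered.append((number, section.get("title")))
        # Child sections become subsections of the current section.
        numbered.extend(number_sections(section, number + "."))
    return numbered


SOURCE = """<sd:simpdoc xmlns="http://www.w3.org/1999/xhtml"
    xmlns:sd="http://simp.mitre.org/simpdoc">
  <sd:front><sd:title>Demo</sd:title></sd:front>
  <sd:section title="Introduction"/>
  <sd:section title="Source Format">
    <sd:section title="Prolog"/>
    <sd:section title="Elements">
      <sd:section title="Root"/>
    </sd:section>
  </sd:section>
</sd:simpdoc>"""

toc = number_sections(ET.fromstring(SOURCE))
for number, title in toc:
    print(number, title)
```

The same list of pairs could serve as the basis for a table of contents, since each entry carries both the generated number and the title attribute.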
Block content consists of a sequence of XHTML block elements and
the sd:figure element in any order. The XHTML block
elements include heading elements (h1, h2,
..., h6), list elements (ul,
ol, and dl), the form element,
block structure elements (p and div), and
block phrase elements (pre, blockquote, and
address).
The sd:figure element is used to name, number, and
display contents. The element has an optional title
attribute, and an optional id attribute. When present,
the title attribute names the figure. The
id attribute can be used as the target of a cross
reference via either a SIMP Doc sd:xref element [Section 3.1.3.2], or an XHTML a element.
The content of an sd:figure element is XHTML block
content as described in [Section 3.1.3]. An
sd:figure element cannot contain another
sd:figure element.
Examples of the use of the sd:figure element
follow. The first example has no attributes.
<sd:figure><pre> ... </pre></sd:figure>
An example with just the title attribute.
<sd:figure title="A Titled Figure"><pre> ... </pre></sd:figure>
An example with only the id attribute.
<sd:figure id="an.anchored.untitled.figure"><pre> ... </pre></sd:figure>
An example with both the title and id
attributes.
<sd:figure title="An Anchored Titled Figure" id="an.anchored.titled.figure"><pre> ... </pre></sd:figure>
Figures that have an id attribute can be the target of
a cross reference via either a SIMP Doc sd:xref element
[Section 3.1.3.2], or an XHTML a element.
The sd:xref element is used to cross reference an
element within a document. The element has a required
target attribute. The value of the attribute must match
the id attribute of exactly one element in the document,
the target element. The element type of the target can be any of
the following:
sd:section,
sd:figure,
sd:references,
sd:reference, and
sd:appendix.
For example, the anchored titled figure [Figure 3] can be referenced as follows:
<sd:xref target="an.anchored.titled.figure"/>
Note that untitled figures are numbered too, so an sd:xref that targets
the untitled figure [Figure 2] still produces a figure number.
Additional information can be added to a cross reference by including text in the element. The reference [Section 3.1.3.1, Paragraph 2] was generated as follows.
<sd:xref target="sd.figure">Paragraph 2</sd:xref>.
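The requirement that each target matches the id of exactly one element can be checked with a short Python sketch like the one below. It is an illustration only and not part of the SIMP Doc tools. The function name bad_xref_targets, the sample document, and the id no.such.id are hypothetical, and the sketch counts ids only on the five element types listed above.

```python
from collections import Counter
import xml.etree.ElementTree as ET

SD = "{http://simp.mitre.org/simpdoc}"  # namespace used in the sd:simpdoc example

# Element types that may be the target of an sd:xref.
TARGET_TYPES = {
    SD + name
    for name in ("section", "figure", "references", "reference", "appendix")
}


def bad_xref_targets(root):
    """Return the sd:xref targets that do not match exactly one id."""
    ids = Counter(
        element.get("id")
        for element in root.iter()
        if element.tag in TARGET_TYPES and element.get("id") is not None
    )
    # A Counter returns 0 for ids that never occur, so missing targets are caught.
    return sorted(
        {
            xref.get("target", "")
            for xref in root.iter(SD + "xref")
            if ids[xref.get("target", "")] != 1
        }
    )


DOC = ET.fromstring(
    '<sd:simpdoc xmlns="http://www.w3.org/1999/xhtml"'
    ' xmlns:sd="http://simp.mitre.org/simpdoc">'
    '<sd:section title="Figures" id="sd.figure">'
    '<sd:figure id="an.anchored.titled.figure" title="An Anchored Titled Figure">'
    "<pre>...</pre></sd:figure>"
    '<p>See <sd:xref target="an.anchored.titled.figure"/>,'
    ' <sd:xref target="sd.figure">Paragraph 2</sd:xref>, and'
    ' <sd:xref target="no.such.id"/>.</p>'
    "</sd:section></sd:simpdoc>"
)
print(bad_xref_targets(DOC))
```

Only the third cross reference is reported, because no element in the sample carries the id it names.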
The sd:references element groups together a set of
references. The element has an optional title attribute,
and an optional id attribute. When present, the
title attribute names the group of references, otherwise
the title "References" is used. The id attribute can be
used as the target of a cross reference via either a SIMP Doc
sd:xref element [Section 3.1.3.2], or an XHTML
a element.
The content of an sd:references element is a sequence
of sd:reference elements.
The sd:reference element describes a reference. The
element has an optional target attribute, and an optional
id attribute. When present, the target
attribute contains the referenced URI. The id attribute
can be used as the target of a cross reference via either a SIMP Doc
sd:xref element [Section 3.1.3.2], or an XHTML
a element.
The content of an sd:reference element is the same
content that is allowed in XHTML li elements. See
[References 7] for some usage examples.
The sd:appendix element has the same content and
attributes as does the sd:section element [Section 3.1.3].
The SIMP Doc source for this document was processed by an XSLT stylesheet written by the author. It was executed using a simple driver written in Java. The driver uses the XML Transform package that ships with Java 2 Standard Edition 1.4.0.
The author knows of no XSLT stylesheet that generates text from SIMP Doc source; however, the generated XHTML 1.1 document can be converted to text using Amaya, W3C's Editor/Browser. Amaya's output will require modification.
The design of the SIMP Doc DTD was influenced by Marshall Rose's DTD for writing RFCs [RFC2629]. The SIMP Doc XSLT stylesheet used ideas from the RFC2629 XSLT stylesheet developed by Julian F. Reschke.
The Appendixes contain gibberish and simply show some features of the transformer.
goo.
gum.
Some text.
More text.
More text again.
Yet more text.
A paragraph associated with a silly title.
Still more text.