SIMP Doc Version 1.1

John D. Ramsdell
The MITRE Corporation
ramsdell@mitre.org
2004 May 24

Abstract

This memo presents a technique for using XML to produce XHTML documents with automatically numbered sections and references, and an automatically generated table of contents.

Copyright Notice

Copyright © 2002 The MITRE Corporation. All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to The MITRE Corporation.

This document and the information contained herein is provided on an "AS IS" basis and THE MITRE CORPORATION DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Table of Contents

1 Introduction
2 SIMP Doc Prolog
2.1 DOCTYPE
2.2 The XML Stylesheet Processing Instruction
2.3 The SIMP Doc Processing Instruction
3 SIMP Doc Elements
3.1 The sd:simpdoc Element
3.1.1 The sd:head Element
3.1.2 The sd:front Element
3.1.2.1 The sd:title Element
3.1.3 The sd:section Element
3.1.3.1 The sd:figure Element
3.1.3.2 The sd:xref Element
3.1.4 The sd:references Element
3.1.4.1 The sd:reference Element
3.1.5 The sd:appendix Element
4 Processing SIMP Doc Source
5 Acknowledgments
6 References
7 A Fake List of References
A First Fake Appendix
A.1 Section in an Appendix
B Second Fake Appendix
B.1 Top Level Section
B.1.1 Next Level One
B.1.2 Next Level Two
B.1.2.1 Down a Level
B.1.2.2 A Section With a Title That Is Very Long and Tests the Appearance of Long Lines in Section Headings and in the Table of Contents
B.2 Top Level Two

1.  Introduction

This memo explains how to create XHTML 1.1 [XHTML11] documents using the Extensible Markup Language [XML]. The input is a structured document that contains sections, references, appendixes, and figures. Each of these elements contains XHTML 1.1 fragments and cross references. An XHTML document is generated from the input using XSL Transformations [XSLT]. The sections, references, appendixes, and figures in the source document are numbered in the generated document, which also receives an automatically generated table of contents and a link for each cross reference.

The syntax of a SIMP Doc document is specified by the SIMP Doc DTD [DTD]. This DTD makes use of the modular structure of the XHTML 1.1 DTD. The SIMP Doc DTD imports definitions from the XHTML 1.1 DTD, so that fragments within sections, references, appendixes, and figures can be validated. When given a valid SIMP Doc document, the XSL transformer will produce a valid XHTML 1.1 document.

The SIMP Doc DTD and XSLT stylesheet were developed to ease the task of documenting the Simple Instant Messaging and Presence [SIMP] Service. They are being made available under the SIMP License, an Open Source license, in the hope that others will find them useful.

There are two ways to learn how to create a SIMP Doc source file. You can browse this memo as an XHTML document, or view the XML that was used to produce the XHTML document. I recommend employing both methods simultaneously.

2.  SIMP Doc Prolog

A SIMP Doc source file starts with the following prolog.

2.1.  DOCTYPE

<?xml version="1.0"?>
<!DOCTYPE sd:simpdoc
    PUBLIC "-//MITRE//DTD SIMP Doc 1.1//EN"
           "http://simp.mitre.org/simpdoc/simpdoc11.dtd">

2.2.  The XML Stylesheet Processing Instruction

A SIMP Doc source file may contain an XML stylesheet processing instruction.

<?xml-stylesheet
    href="http://simp.mitre.org/simpdoc/simpdoc11.xsl"
    type="application/xslt+xml"?>

2.3.  The SIMP Doc Processing Instruction

A table of contents will not be generated for documents that contain the following processing instruction.

<?simpdoc toc="no"?>

In general, a table of contents will be generated for documents that contain no SIMP Doc processing instruction, or whose SIMP Doc processing instruction contains yes or true in the text that follows toc=.
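The rule above can be restated as a small Java method. This is only a sketch of the decision as the text describes it, not the logic of the SIMP Doc stylesheet itself; the class and method names are hypothetical, and the handling of an instruction that lacks toc= is an assumption, since the memo does not cover that case.

```java
/** Sketch of the table-of-contents rule; all names here are hypothetical. */
final class TocOption {
    /**
     * Decides whether a table of contents should be generated.
     *
     * @param piData the text of the simpdoc processing instruction, such as
     *               toc="no", or null when the document has no such instruction
     */
    static boolean tocEnabled(String piData) {
        if (piData == null) {
            return true; // no SIMP Doc processing instruction: generate the table
        }
        int start = piData.indexOf("toc=");
        if (start < 0) {
            // Assumption: an instruction without toc= suppresses the table.
            return false;
        }
        // Look for yes or true in the text that follows toc=.
        String value = piData.substring(start + "toc=".length());
        return value.contains("yes") || value.contains("true");
    }
}
```

For example, TocOption.tocEnabled("toc=\"no\"") is false, while a null argument (no processing instruction) yields true.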

3.  SIMP Doc Elements

The section structure of this part of the document follows the element content relations defined in the SIMP Doc DTD. That is, an element is defined in a subsection of another element's definition only if the former element can occur in the content of the latter element. The elements in alphabetical order are: sd:appendix, sd:figure, sd:front, sd:head, sd:reference, sd:references, sd:section, sd:simpdoc, sd:title, and sd:xref.

3.1.  The sd:simpdoc Element

The SIMP Doc root element is sd:simpdoc. All elements introduced by the SIMP Doc DTD use the sd namespace identifier.

<sd:simpdoc
    xmlns='http://www.w3.org/1999/xhtml'
    xmlns:sd='http://simp.mitre.org/simpdoc'>

... content ...

</sd:simpdoc>

The sd:simpdoc element contains, in order, at most one sd:head element, exactly one sd:front element, one or more sd:section elements, any number of sd:references elements, and finally, any number of sd:appendix elements.
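The ordering constraint on the children of sd:simpdoc can be written as a regular expression over the child element names. The Java sketch below illustrates the content model as described in the preceding paragraph; it is not a substitute for validating against the SIMP Doc DTD, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Sketch of the sd:simpdoc child ordering; all names here are hypothetical. */
final class SimpdocContentModel {
    // Each child element name is followed by one space before matching.
    private static final Pattern ORDER = Pattern.compile(
        "(sd:head )?sd:front (sd:section )+(sd:references )*(sd:appendix )*");

    /** Returns true if the child element names appear in an allowed order. */
    static boolean isValidOrder(List<String> childNames) {
        StringBuilder joined = new StringBuilder();
        for (String name : childNames) {
            joined.append(name).append(' ');
        }
        return ORDER.matcher(joined).matches();
    }
}
```

A list holding only sd:front is rejected because at least one sd:section is required, and a list that places sd:appendix before sd:references is rejected because of the ordering.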

3.1.1.  The sd:head Element

The sd:head element is used to insert meta, link, and object elements into the head element of the generated document.

<sd:head>
  <meta http-equiv="Content-type"
        content="text/html; charset=iso-8859-1" />
  <link rel="stylesheet" title="simpdoc" type="text/css"
        href="simpdoc.css"/>
</sd:head>

3.1.2.  The sd:front Element

The sd:front element is used to insert front matter into the generated document before the table of contents.

The sd:front element contains, in sequence, block content as described in [Section 3.1.3], a required sd:title element, and more block content.

3.1.2.1.  The sd:title Element

The sd:title element is used to provide the title of the document.

3.1.3.  The sd:section Element

The sd:section element is used to divide the body of a document into sections. The element has a required title attribute, and an optional id attribute. The title attribute names the section, and when present, the id attribute can be used as the target of a cross reference via either a SIMP Doc sd:xref element [Section 3.1.3.2], or an XHTML a element.

The content of an sd:section element is block content followed by any number of sd:section elements. The sd:section elements within a section define the section's subsections, that is, the section-subsection relationships are derived from the parent-child relationships of the sd:section elements.
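To make the derivation of section numbers from parent-child relationships concrete, the following Java sketch walks a parsed source document and numbers each sd:section by its position among its sibling sections. The SIMP Doc stylesheet does this work in XSLT; the class and method names below are hypothetical, and the sketch assumes the source has no DOCTYPE that would need to be fetched.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Sketch of section numbering; all names here are hypothetical. */
final class SectionNumbering {
    static final String SD_NS = "http://simp.mitre.org/simpdoc";

    /** Returns one "number title" entry per sd:section, in document order. */
    static List<String> number(String xml) {
        Element root;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse source", e);
        }
        List<String> entries = new ArrayList<>();
        visit(root, "", entries);
        return entries;
    }

    // Numbers the sd:section children of parent, then recurses into each one.
    private static void visit(Element parent, String prefix, List<String> entries) {
        int count = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && SD_NS.equals(n.getNamespaceURI())
                    && "section".equals(n.getLocalName())) {
                Element section = (Element) n;
                String number = prefix + (++count);
                entries.add(number + " " + section.getAttribute("title"));
                visit(section, number + ".", entries);
            }
        }
    }
}
```

The entries take the same form as the lines of the table of contents: a dotted number followed by the section title.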

Block content consists of a sequence of XHTML block elements and the sd:figure element in any order. The XHTML block elements include header elements (h1, h2, ..., h6), list elements (ul, ol, and dl), the form element, block structure elements (p and div), and block phrase elements (pre, blockquote, and address).

3.1.3.1.  The sd:figure Element

The sd:figure element is used to name, number, and display contents. The element has an optional title attribute, and an optional id attribute. When present, the title attribute names the figure. The id attribute can be used as the target of a cross reference via either a SIMP Doc sd:xref element [Section 3.1.3.2], or an XHTML a element.

The content of an sd:figure element is XHTML block content as described in [Section 3.1.3]. An sd:figure element cannot contain an sd:figure element.

Examples of the use of the sd:figure element follow. The first example has no attributes.

<sd:figure><pre>
...
</pre></sd:figure>

An example with just the title attribute.

<sd:figure title="A Titled Figure"><pre>
...
</pre></sd:figure>

Figure 1.  A Titled Figure

An example with only the id attribute.

<sd:figure id="an.anchored.untitled.figure"><pre>
...
</pre></sd:figure>

An example with both the title and id attributes.

<sd:figure title="An Anchored Titled Figure" id="an.anchored.titled.figure"><pre>
...
</pre></sd:figure>

Figure 3.  An Anchored Titled Figure

Figures that have an id attribute can be the target of a cross reference via either a SIMP Doc sd:xref element [Section 3.1.3.2], or an XHTML a element.

3.1.3.2.  The sd:xref Element

The sd:xref element is used to cross reference an element within a document. The element has a required target attribute. The value of the attribute must match the id attribute of exactly one element in the document, the target element. The element type of the target can be any of the following: sd:section, sd:figure, sd:references, sd:reference, and sd:appendix.

For example, the anchored titled figure [Figure 3] can be referenced as follows:

<sd:xref target="an.anchored.titled.figure"/>

Note that a number is also generated for the untitled figure [Figure 2], so that an sd:xref element can refer to it.

Additional information about a cross reference can be generated by including text in the element. The reference [Section 3.1.3.1, Paragraph 2] was generated as follows.

<sd:xref target="sd.figure">Paragraph 2</sd:xref>
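The requirement stated at the start of this section, that each target value match the id attribute of exactly one element, can be checked with a short Java sketch. It is an illustration only: the class and method names are hypothetical, the source is assumed to have no DOCTYPE that would need to be fetched, and the memo does not say whether the SIMP Doc DTD or stylesheet performs an equivalent check.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Sketch of cross reference checking; all names here are hypothetical. */
final class XrefCheck {
    static final String SD_NS = "http://simp.mitre.org/simpdoc";
    // Element types that may be the target of an sd:xref element.
    static final List<String> TARGET_TYPES =
        Arrays.asList("section", "figure", "references", "reference", "appendix");

    /** Returns the target values that do not match exactly one id. */
    static List<String> badTargets(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse source", e);
        }
        // Count how often each id occurs on a permitted target element.
        Map<String, Integer> idCounts = new HashMap<>();
        for (String type : TARGET_TYPES) {
            NodeList nodes = doc.getElementsByTagNameNS(SD_NS, type);
            for (int i = 0; i < nodes.getLength(); i++) {
                String id = ((Element) nodes.item(i)).getAttribute("id");
                if (!id.isEmpty()) {
                    idCounts.merge(id, 1, Integer::sum);
                }
            }
        }
        List<String> bad = new ArrayList<>();
        NodeList xrefs = doc.getElementsByTagNameNS(SD_NS, "xref");
        for (int i = 0; i < xrefs.getLength(); i++) {
            String target = ((Element) xrefs.item(i)).getAttribute("target");
            if (idCounts.getOrDefault(target, 0) != 1) {
                bad.add(target);
            }
        }
        return bad;
    }
}
```

An empty result means every sd:xref element in the source resolves to exactly one target element.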

3.1.4.  The sd:references Element

The sd:references element groups together a set of references. The element has an optional title attribute, and an optional id attribute. When present, the title attribute names the group of references, otherwise the title "References" is used. The id attribute can be used as the target of a cross reference via either a SIMP Doc sd:xref element [Section 3.1.3.2], or an XHTML a element.

The content of an sd:references element is a sequence of sd:reference elements.

3.1.4.1.  The sd:reference Element

The sd:reference element describes a reference. The element has an optional target attribute, and an optional id attribute. When present, the target attribute contains the referenced URI. The id attribute can be used as the target of a cross reference via either a SIMP Doc sd:xref element [Section 3.1.3.2], or an XHTML a element.

The content of an sd:reference element is the same content that is allowed in XHTML li elements. See [References 7] for some usage examples.

3.1.5.  The sd:appendix Element

The sd:appendix element has the same content and attributes as does the sd:section element [Section 3.1.3].

4.  Processing SIMP Doc Source

The SIMP Doc source for this document was processed by an XSLT stylesheet written by the author. It was executed using a simple driver written in Java. The driver uses the XML Transform package that ships with Java 2 Standard Edition 1.4.0.
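The author's driver is not reproduced in this memo. As a rough idea of what such a driver can look like, the sketch below applies a stylesheet to a source document with the standard javax.xml.transform package. The class name, method name, and command-line argument order are assumptions made for illustration.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch of an XSLT driver; all names here are hypothetical. */
final class SimpDocDriver {
    /** Applies the stylesheet text to the source text and returns the result. */
    static String transform(String stylesheet, String source) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(source)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    /** Assumed usage: java SimpDocDriver stylesheet.xsl source.xml result.html */
    public static void main(String[] args) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new File(args[0])));
        transformer.transform(
            new StreamSource(new File(args[1])), new StreamResult(new File(args[2])));
    }
}
```

Note that a source file with the DOCTYPE shown in [Section 2.1] may cause the XML parser to retrieve the DTD from its system identifier during the transformation.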

The author knows of no XSLT stylesheet that generates text from SIMP Doc source; however, the generated XHTML 1.1 document can be converted to text using Amaya, W3C's Editor/Browser. Amaya's output will require modification.

5.  Acknowledgments

The design of the SIMP Doc DTD was influenced by Marshall Rose's DTD for writing RFCs [RFC2629]. The SIMP Doc XSLT stylesheet used ideas from the RFC2629 XSLT stylesheet developed by Julian F. Reschke.

6.  References

[DTD]
The SIMP Doc 1.1 DTD.
[RFC2629]
Marshall T. Rose, "Writing I-Ds and RFCs using XML", June 1999.
[SIMP]
Simple Instant Messaging and Presence Service.
[XHTML11]
World Wide Web Consortium, "XHTML 1.1 - Module-based XHTML", May 2001.
[XML]
World Wide Web Consortium, "Extensible Markup Language (XML) 1.0 (Second Edition)", October 2000.
[XSLT]
World Wide Web Consortium, "XSL Transformations (XSLT) Version 1.0", November 1999.

7.  A Fake List of References

[TARGETLESS]
This is a fake reference with no target.
[7.2]
This is a fake reference with no target and no id.

A.  First Fake Appendix

The Appendixes contain gibberish and simply show some features of the transformer.

A.1.  Section in an Appendix

goo.

B.  Second Fake Appendix

gum.

B.1.  Top Level Section

Some text.

B.1.1.  Next Level One

More text.

B.1.2.  Next Level Two

More text again.

B.1.2.1.  Down a Level

Yet more text.

B.1.2.2.  A Section With a Title That Is Very Long and Tests the Appearance of Long Lines in Section Headings and in the Table of Contents

A paragraph associated with a silly title.

B.2.  Top Level Two

Still more text.