An Alphabet Soup of Standards

Scott Isaacs
On 6/12/1999

Abstract
We demonstrate an XML version of the first two sections of our article, Building Documents with XML, XSL, and CSS.

Table of Contents

  1. Building Documents with XML, XSL, and CSS
  2. XML Tips for HTML Authors

1. Building Documents with XML, XSL, and CSS

The current set of web recommendations and proposed recommendations is starting to look a lot like alphabet soup. Web designers currently have to deal with XML, XSL, CSS, HTML, and the DOM. In this article, we demonstrate how these technologies complement one another and how they will change the way web sites are built.

Many months ago we ran a series of articles explaining our back-end template system. The user interface for every page on SiteExperts.com is created by a set of include files wrapped around the actual content. For changes to that shared interface, this approach makes updates quick: we can add a new choice to our navigation menu, move the banner ad around, or change the color scheme of the page by changing one file. However, if we want to manipulate and highlight information within an article (e.g., create pull-quotes), we need to go and update the content directly.

With the introduction of XML and XSL we are starting to explore a richer and more powerful solution for managing the site and our content. To prove this, we decided to author this article completely in XML (a demonstration for IE5 users comes later), use XSL to transform the document to HTML, and use CSS to add formatting to create the final page.

This approach provides us with a number of benefits:

  1. As we bring authors on, we can give them a detailed set of XML elements for writing their articles. This enforces a more rigid authoring scheme, making it easier for us to process the articles.
  2. We are no longer tied to our templating system, which required each article to be adorned with extra template and semantic information. With XML we can separate the article from the template description.
  3. By transforming the XML to HTML using XSL, we can quickly change any aspect of the page. We can create multiple views of the same document quickly and efficiently, all without modifying the original document.
  4. Using CSS, we can apply simple styling to the generated HTML. This is important because simple rendering changes can be made without modifying the XSL template or regenerating the page.
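To make the separation of content, transformation, and styling concrete, here is a minimal sketch in Python. It is not our XSL template: a small Python function stands in for the XSL step, and the element names (article, title, para), the class name, and the stylesheet file name are all assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical article markup; these element names are assumptions for illustration.
ARTICLE_XML = """
<article>
  <title>An Alphabet Soup of Standards</title>
  <para>XML holds the content.</para>
  <para>CSS styles the generated HTML.</para>
</article>
"""

def to_html(article_xml: str, stylesheet: str = "site.css") -> str:
    """Stand-in for the XSL step: map content elements to HTML elements."""
    article = ET.fromstring(article_xml)
    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    # The look of the page lives in the CSS file, not in the article.
    ET.SubElement(head, "link", rel="stylesheet", href=stylesheet)
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = article.findtext("title")
    for para in article.findall("para"):
        ET.SubElement(body, "p", {"class": "article"}).text = para.text
    return ET.tostring(html, encoding="unicode", method="html")

print(to_html(ARTICLE_XML))
```

Calling to_html with a different stylesheet name, or swapping in a different transform function, produces another view of the page while the article itself stays untouched.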

We are now going to take you on a tour through our use of XML, XSL, HTML, and CSS. While we will introduce many XML and XSL topics, this article is not intended as a complete tutorial on those technologies. Rather, we hope to demonstrate how these technologies complement each other and how easy they are to use, and to leave you with some ideas on how to apply them. We will write more detailed tutorials in the future.

2. XML Tips for HTML Authors

Before we explore XML and XSL, we are going to provide a few authoring tips for HTML authors learning XML. Compared to HTML, where basically anything goes, XML has strict rules. With XML, your document must be properly structured or you will get an error when it is read. We are going to quickly explore a few of the more common mistakes made when authoring XML.

XML requires that your document be well-formed. Among other things, this means that you cannot create tags that overlap one another. For example, the following is well-formed. Notice how the tags are properly nested:

<b>This is
	<i>bold and italic</i>
</b>
<i>
	and this is just italic
</i>

Valid HTML has the same requirement, but HTML browsers tolerate overlapping tags. Rewriting the above example with overlapping tags will usually render as expected in most HTML browsers, but the result is not well-formed XML:

<b>This is
	<i>bold and italic</b>
	and this is just italic
</i>
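If you want to check this for yourself, one option is to hand both fragments to an XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the wrapping root element are our own assumptions, added because an XML document must have exactly one root element.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a single root."""
    try:
        # <root> is a hypothetical wrapper: a document needs exactly one root element.
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False

nested = "<b>This is <i>bold and italic</i></b><i> and this is just italic</i>"
overlapping = "<b>This is <i>bold and italic</b> and this is just italic</i>"

print(is_well_formed(nested))       # True
print(is_well_formed(overlapping))  # False
```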

XML is case-sensitive. If you open a tag in lowercase, you must close it in lowercase. For example, while the following is acceptable HTML, it is not well-formed XML: <i>...</I>. We choose to define our XML tags entirely in lowercase.
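The case rule is easy to verify with a parser. This sketch, again using Python's standard xml.etree.ElementTree, returns the parser's complaint for a document, or None when there is nothing to complain about. The helper name parse_error is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

def parse_error(document: str):
    """Return the parser's error message, or None if the document is well-formed."""
    try:
        ET.fromstring(document)
        return None
    except ET.ParseError as exc:
        return str(exc)

print(parse_error("<i>same case</i>"))   # None
print(parse_error("<i>mixed case</I>"))  # an error message from the parser

# Case is preserved, so <I> and <i> are simply two different element names.
print(ET.fromstring("<I>upper</I>").tag)  # I
```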

Empty tags must be marked as empty. An empty tag is a tag that has no matching close tag (such as </i>). For example, IMG, INPUT, and BR are empty tags. HTML knows they are empty because that knowledge is built into the HTML engine. Since XML may contain arbitrary tags that may not be checked against a DTD (the description of the document), the tags must self-document whether they are containers or not. If you create an empty element in XML, you must provide a "/" at the end of the tag. For example, if you defined an empty IMG tag in XML, you would write it as follows:

<img src="foo.gif"/>
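As a quick check of the empty-tag rule, the sketch below parses the self-closed form, the equivalent open-and-close form, and the HTML-style form with no "/". It uses Python's standard xml.etree.ElementTree; the helper name parses and the surrounding p element are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def parses(document: str) -> bool:
    """Return True if the document is well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

# Self-closed empty element: well-formed.
img = ET.fromstring('<img src="foo.gif"/>')
print(img.tag, img.attrib)           # img {'src': 'foo.gif'}

# An explicit close tag is equivalent: the element still has no content.
explicit = ET.fromstring('<img src="foo.gif"></img>')
print(len(explicit), explicit.text)  # 0 None

# HTML-style empty tag with no "/": the img is never closed, so parsing fails.
print(parses('<p><img src="foo.gif"></p>'))  # False
```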

In the same regard, all container tags must be closed. For example, in HTML you can create paragraphs simply by specifying <P> tags sequentially without closing each paragraph with an end </P> tag. In XML this is not allowed because the document would not be well-formed: XML knows nothing about what a P means, so without the end tags each P would be assumed to be contained within the previous one, and none of them would ever be closed.
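The difference shows up as soon as a parser reads both styles. In this sketch (Python's standard xml.etree.ElementTree; the body wrapper and the helper name count_paragraphs are assumptions for illustration), the closed paragraphs are found as two siblings, while the HTML-style version is rejected outright.

```python
import xml.etree.ElementTree as ET

html_style = "<body><p>First paragraph<p>Second paragraph</body>"
xml_style = "<body><p>First paragraph</p><p>Second paragraph</p></body>"

def count_paragraphs(document: str):
    """Return the number of <p> children of the root, or None if parsing fails."""
    try:
        return len(ET.fromstring(document).findall("p"))
    except ET.ParseError:
        return None

print(count_paragraphs(xml_style))   # 2
print(count_paragraphs(html_style))  # None
```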

All attribute values in XML must be enclosed in quotes. HTML is fairly flexible in allowing you to omit quotes when specifying attribute values (e.g., <TABLE BORDER=0>). In XML, always remember your quotes: <table border="0">.
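The quoting rule can be checked the same way. This sketch uses Python's standard xml.etree.ElementTree to read the border attribute; the helper name border_of and its "not well-formed" return value are assumptions for illustration. Note that XML accepts either double or single quotes.

```python
import xml.etree.ElementTree as ET

def border_of(document: str) -> str:
    """Return the border attribute, or 'not well-formed' if parsing fails."""
    try:
        return ET.fromstring(document).get("border", "missing")
    except ET.ParseError:
        return "not well-formed"

print(border_of('<table border="0"></table>'))  # 0
print(border_of("<table border='0'></table>"))  # 0
print(border_of("<table border=0></table>"))    # not well-formed
```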

This briefly introduces a few of the differences between HTML and XML and should give you enough background to proceed through the rest of this article. Next we look at the XML elements we defined to author our article.