Inside Technique : Building Documents with XML, XSL, and CSS : XML Tips for HTML Authors

Before we explore XML and XSL, here are a few authoring tips for HTML authors learning XML. Compared to HTML, where almost anything goes, XML has strict rules: your document must be properly structured, or the parser will report an error when it reads the document. The list below covers a few of the more common mistakes made when authoring XML.

  1. XML requires that your document be well-formed.
    This means that tags cannot overlap one another. For example, the following is well-formed. Notice how every tag is closed before the tag that contains it is closed:
    <b>This is 
    	<i>bold and italic</i>
    </b>
    <i>
    	and this is just italic
    </i>
    

    While valid HTML has the same requirement, HTML browsers do accept overlapping tags. The example above, rewritten with overlapping tags, usually renders as expected in most HTML browsers, but it is not well-formed XML:

    <b>This is 
    	<i>bold and italic</b>
    	and this is just italic
    </i>
    
  2. XML is case-sensitive.
    If you open a tag in lowercase, you must close it in lowercase. For example, while the following is valid HTML, it is not well-formed XML: <i>...</I>. We choose to define our XML tags entirely in lowercase.
  3. Empty tags must be specified as empty.
    An empty tag is a tag that has no matching close tag (a close tag looks like </i>). For example, IMG, INPUT, and BR are empty tags. An HTML browser knows they are empty because that knowledge is built into the HTML engine. Since XML may contain arbitrary tags that may not be checked against a DTD (the description of the document), each tag must itself indicate whether it is a container. If you create an empty element in XML, you must put a "/" immediately before the ">" that ends the tag. For example, an empty IMG tag in XML is written as follows:
    <img src="foo.gif"/>
    

    In the same regard, all container tags must be closed. For example, in HTML you can create paragraphs simply by writing <P> tags one after another without closing each paragraph with an end </P> tag. In XML this is an error. An XML parser knows nothing about what P means, so without the end </P> tags each P is assumed to be contained within the previous one, and the document is malformed because those P tags are never closed.

  4. All attribute values in XML must be specified in quotes.
    HTML is fairly flexible in allowing you to omit quotes when specifying attribute values (e.g., <TABLE BORDER=0>). In XML, always quote the value: <table border="0">.
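
The four rules above can be checked mechanically by handing a fragment to any XML parser. The following is a minimal sketch in Python, not part of the original article: it uses the standard library's xml.etree.ElementTree parser, and the is_well_formed helper and the <doc> wrapper element are hypothetical names introduced only for illustration. The wrapper is needed because an XML document must have exactly one root element, while the fragments in this article have none.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a single root element."""
    try:
        # <doc> is a hypothetical wrapper: XML allows exactly one root element.
        ET.fromstring(f"<doc>{fragment}</doc>")
        return True
    except ET.ParseError:
        return False


# Fragments taken from the rules above, labeled by the rule they illustrate.
samples = [
    ("1. nested tags", "<b>This is <i>bold and italic</i></b><i>and this is just italic</i>"),
    ("1. overlapping tags", "<b>This is <i>bold and italic</b> and this is just italic</i>"),
    ("2. mismatched case", "<i>...</I>"),
    ("3. empty tag with /", '<img src="foo.gif"/>'),
    ("3. empty tag without /", '<img src="foo.gif">'),
    ("3. unclosed paragraphs", "<p>one<p>two"),
    ("4. quoted attribute", '<table border="0"></table>'),
    ("4. unquoted attribute", "<table border=0></table>"),
]

for label, fragment in samples:
    print(f"{label}: {is_well_formed(fragment)}")
```

Only the nested tags, the empty tag with "/", and the quoted attribute should be reported as well-formed; every other fragment breaks one of the rules and makes the parser raise an error.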

This brief introduction covers a few of the differences between HTML and XML and should give you enough background to proceed through the rest of this article. Next, we look at the XML elements we defined to author our article.