We'll go step by step through the rules for writing XHTML documents, showing which HTML syntax violates the XHTML standards. We'll also give an idea of where XHTML comes from and why a change was needed. Finally, you'll learn to write XHTML code and to check that your code is valid (i.e., that it obeys the standard).
HTML grew out of a hypertext project that Tim Berners-Lee started in 1980 to help researchers share information in the form of documents. In 1989 he proposed an Internet-based version of the idea at CERN (European Organization for Nuclear Research), then the largest Internet node in Europe. From there, HTML began an evolution that is still not finished, going through versions 2.0, 3.2, 4.0 and 4.01, all of them based on SGML (a meta language used to define other markup languages).
On the other hand, XML is also a meta language (used to create other languages) and is a subset of SGML, designed to be simpler to parse and process. These days, XML is widely used to build documents and organize information (e.g., RSS, Atom, etc.), as it provides a standard way to do so that is easier to process than SGML.
In 2000, the World Wide Web Consortium (W3C) recommended XHTML as the new standard version of HTML, based on XML instead of SGML. We can therefore consider XHTML the result of mixing HTML and XML. As a result, HTML inherits the benefits of XML: it is easier to parse and process, and can therefore be made available on more platforms with reduced processing capacity (e.g., PDAs and cell phones).
Another motive for updating HTML and for creating the W3C was to restore HTML's original purpose as a semantic language. After it was first implemented, many browser vendors began to extend the standard in order to add more functionality to it. This slowly turned it into a more visual than semantic language, which inspired the W3C to write new standards intended to reverse this effect and take it back to its semantic origin. XHTML 1.1 is the most recent of these updates, but there are more to come.
The rules for creating an XHTML document are simple. As XHTML is an adaptation of HTML 4.01 (based on SGML) to the XML format (also based on SGML), most things didn't change. Only a few new rules were introduced to make the document XML compatible, along with some other changes intended to make the language more semantic.
The XML declaration is a single line that defines the XML version and the character encoding your document uses. It must appear before anything else in your document, even before the document type declaration (HTML !DOCTYPE tag).
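As a minimal sketch, an XML declaration could look like the following. The version "1.0" and the encoding "UTF-8" are assumed values here; use the ones that match your document.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed values: XML 1.0 and UTF-8. The declaration above must be the very first line of the file. -->
```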
As this declaration may represent a compatibility problem with old browsers, you can replace it with a meta declaration (HTML meta tag) in the head of your document.
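A sketch of what such a meta declaration might look like inside the head of the document. The encoding and the title text are placeholders, not values taken from any particular document.

```html
<head>
  <!-- Declares the character encoding without an XML declaration; UTF-8 is an assumed value -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <title>Document title</title>
</head>
```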
Note that the character encoding used in this example is "UTF-8" but you should specify the character encoding used by your document.
The Document Type Declaration is not exclusive to XHTML documents; other SGML- and XML-based documents use it as well. In every case you should reference the correct DTD via the HTML !DOCTYPE tag declaration. There are four DTDs for XHTML documents, depending on the version you use:
XHTML 1.0: Strict, Transitional and Frameset
XHTML 1.1
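For reference, the four declarations are sketched below with the public identifiers and DTD addresses published by the W3C. A document uses exactly one of them, placed before the html tag; the comments are only labels for this listing.

```html
<!-- XHTML 1.0 Strict -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- XHTML 1.0 Transitional -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!-- XHTML 1.0 Frameset -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

<!-- XHTML 1.1 -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
```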
The XML Namespace declaration is a simple URL (http://www.w3.org/1999/xhtml) and is defined as the value of the "xmlns" attribute of the html tag.
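Putting the three declarations together, a minimal document skeleton might look like this. It is only a sketch: the choice of XHTML 1.0 Strict, the UTF-8 encoding, the "en" language code, and the title and paragraph text are all assumptions made for illustration.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- The xmlns value below is the XHTML namespace; title and text are placeholders -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Document title</title>
  </head>
  <body>
    <p>Text</p>
  </body>
</html>
```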
This list of rules should be read as a list of differences between HTML and XHTML. If you have never written HTML documents before, keep these recommendations in mind while you read the HTML tutorials and the HTML reference. You will find that these recommendations are followed wherever possible on this site.
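Non-empty elements must be closed.
Correct: <p>Paragraph</p>
Incorrect: <p>Paragraph
Empty elements must be closed too, either with an end tag or with the " />" shorthand.
Correct: <img src="button.jpg"></img> or <img src="button.jpg" />
Incorrect: <img src="button.jpg">
Element and attribute names must be written in lowercase.
Correct: <a href="http://www.htmlquick.com/tutorials.html">Anchor</a>
Incorrect: <A Href="http://www.htmlquick.com/tutorials.html">Anchor</A>
Predefined attribute values must be written in lowercase.
Correct: <input type="submit" />
Incorrect: <input type="SUBMIT" />
Attribute values must be quoted, with either double or single quotes.
Correct: <span id="id1" class='important'>Text</span>
Incorrect: <span class=important>Text</span>
Attributes cannot be minimized: every attribute needs a value.
Correct: <button id="button1" disabled="disabled">Execute</button>
Incorrect: <button id="button1" disabled>Execute</button>
Elements must be properly nested, closing in the reverse order in which they were opened.
Correct: <span class="double"><b>Execute</b></span>
Incorrect: <span class="double"><b>Execute</span></b>
Block-level elements cannot be placed inside inline elements.
Correct: <div class="double"><b>Execute</b></div>
Incorrect: <b><div class="double">Execute</div></b>
Ampersands must be written as &amp;, both in text and in attribute values.
Correct: <a href="buysell.php?id=1&amp;sub=2">Buy &amp; sell</a>
Incorrect: <a href="buysell.php?id=1&sub=2">Buy & sell</a>
Entity names are case-sensitive, and hexadecimal character references must use a lowercase "x".
Correct: &#xe1; - &aacute; (for á)
Incorrect: &#XE1; - &aAcuTe; (for á)
Images must include the "alt" attribute.
Correct: <img src="bird.jpg" alt="A bird flying"></img>
Incorrect: <img src="bird.jpg"></img>
Embedded style sheets and scripts should be wrapped in a CDATA section rather than hidden inside an HTML comment.
Correct:
<style type="text/css">
<![CDATA[
p { color: blue; }
]]>
</style>
Incorrect:
<style type="text/css">
<!--
p { color: blue; }
-->
</style>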
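In addition to the rules above, strict XHTML documents (XHTML 1.0 Strict and XHTML 1.1) should also follow these rules.
Text and inline elements cannot be placed directly in the body; they must be wrapped in a block-level element.
Correct: <body><p>Text</p></body>
Incorrect: <body>Text</body>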
This list enumerates the differences between XHTML 1.0 Strict and XHTML 1.1.
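The "lang" attribute has been removed in favor of "xml:lang".
Correct: <span xml:lang="en">Text</span>
Incorrect: <span lang="en" xml:lang="en">Text</span>
The "name" attribute of anchors has been removed in favor of "id".
Correct: <a id="bookmark1">Anchor</a>
Incorrect: <a name="bookmark1">Anchor</a>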
You can always validate your XHTML documents (like many other kinds of documents) to check that your hard work is 100% correct. You can do so at the W3C markup validation service, where you can choose to validate by URL, file upload or direct input. When the result is shown, the list of errors and warnings (if any) will let you know what to correct and where.