XHTML code


We'll go step by step through the rules for writing XHTML documents, showing which HTML syntax violates the XHTML standard. We'll also explain where XHTML comes from and why a change was needed. Finally, you'll learn to write XHTML code and check whether it is valid (i.e., whether it obeys the standard).

History and origin

HTML traces back to 1980, when Tim Berners-Lee started a project based on the concept of hypertext to help researchers share information in the form of documents. He proposed the system that grew into the Web in 1989 at CERN (European Organization for Nuclear Research), then the largest Internet node in Europe. From there, HTML began an evolution that is still not finished, going through versions 2.0, 3.2, 4.0 and 4.01, all of them based on SGML (a metalanguage used to define other markup languages).

On the other hand, XML is also a metalanguage (used to create other languages) and a subset of SGML, designed to be simpler to parse and process. Today XML is widely used to build documents and organize information (e.g., RSS and Atom), as it provides a standard way to do so that is easier to process than SGML.

In 2000, the World Wide Web Consortium (W3C) recommended XHTML as the new standard version of HTML, based on XML instead of SGML. We can therefore think of XHTML as the result of combining HTML and XML. HTML thus inherits the benefits of XML: it becomes easier to parse and process, and so can be supported on platforms with limited processing capacity (e.g., PDAs and cell phones).

Another motive for updating HTML, and for creating the W3C, was to restore HTML's original purpose as a semantic language. After HTML was first implemented, many browser vendors began to alter the standard to add functionality. This slowly turned it into a more visual than semantic language, which led the W3C to publish new standards intended to reverse the effect and return HTML to its semantic origin. XHTML 1.1 is the most recent of these updates, but there are more to come.

Creating an XHTML document

The rules for creating an XHTML document are simple. Because XHTML adapts HTML 4.01 (based on SGML) to the XML format (also based on SGML), most things didn't change. Only a few new rules were introduced to make documents XML compatible, along with some other changes intended to make the language more semantic.

XML declaration

The XML declaration is a single line that defines the XML version and the character encoding your document uses. It must come before anything else in your document, even before the document type declaration (HTML !DOCTYPE tag).

Code begin
<?xml version="1.0" encoding="UTF-8"?>
Code end
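To see why the position of the declaration matters, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. The helper name parses and the tiny sample page are assumptions made up for this illustration; the only difference between the two documents is a line break placed before the declaration.

```python
import xml.etree.ElementTree as ET

DECLARATION = b'<?xml version="1.0" encoding="UTF-8"?>'
# Hypothetical sample page, kept as small as possible.
PAGE = (
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b'<head><title>Sample</title></head>'
    b'<body><p>Text</p></body></html>'
)


def parses(document: bytes) -> bool:
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Declaration on the very first byte of the document.
declaration_first = DECLARATION + b"\n" + PAGE
# Same document, but a line break comes before the declaration.
declaration_late = b"\n" + DECLARATION + b"\n" + PAGE

print(parses(declaration_first))
print(parses(declaration_late))
```

The first document is accepted, while the second is rejected because something (even whitespace) precedes the declaration.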
 

Document Type Declaration (DOCTYPE)

The document type declaration is not specific to XHTML; it is used in other kinds of documents as well. It names the Document Type Definition (DTD) that the document follows, so in every case you should reference the correct DTD via the HTML !DOCTYPE tag declaration. There are four DTDs for XHTML documents, depending on the version you use:

XHTML 1.0: Strict, Transitional and Frameset

Code begin
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
Code end
 

XHTML 1.1

Code begin
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
Code end
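If you want to check which of these DTDs a document declares, the following sketch reads the declaration with Python's standard xml.parsers.expat module. The function name read_doctype and the sample page are assumptions for this example. Note that it only reads the declaration; it doesn't validate the document against the DTD.

```python
import xml.parsers.expat


def read_doctype(document: bytes):
    """Return (root_name, public_id, system_id) from the DOCTYPE, or None."""
    found = []

    # expat calls this handler once, when it reaches the DOCTYPE declaration.
    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.append((name, public_id, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)
    return found[0] if found else None


# Hypothetical sample page declaring the XHTML 1.0 Strict DTD.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    b'"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b'<head><title>Sample</title></head>'
    b'<body><p>Text</p></body></html>'
)

doctype = read_doctype(SAMPLE)
print(doctype)
```

The public identifier (the second item of the result) is the part that tells Strict, Transitional, Frameset and XHTML 1.1 apart.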
 

XML Namespace declaration

The XML namespace is identified by a simple URI, given as the value of the "xmlns" attribute of the html tag. For XHTML documents the value must be "http://www.w3.org/1999/xhtml".

Code begin
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
</html>
Code end
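The effect of the namespace declaration is easy to observe from a namespace-aware parser. This is a small sketch with Python's standard xml.etree.ElementTree; the variable names are only illustrative. The parser reports the html element under the XHTML namespace, and treats "xml:lang" and "lang" as two distinct attributes.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Namespace that the reserved "xml" prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

root = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"></html>'
)

# ElementTree writes qualified names as "{namespace}localname".
qualified_tag = root.tag
xml_lang = root.get("{%s}lang" % XML_NS)
plain_lang = root.get("lang")

print(qualified_tag)
print(xml_lang, plain_lang)
```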
 

General XHTML 1.0 rules

Treat this list of rules as a list of differences between HTML and XHTML. If you have never written HTML documents before, keep these recommendations in mind while you read the HTML tutorials and the HTML reference. Wherever possible, these recommendations are followed throughout this site.

  • Non-empty tags must always be closed. There is no optional closing in XHTML.
    • Valid: <p>Paragraph</p>
    • Invalid: <p>Paragraph
  • Empty tags must be correctly closed. You can either use a normal closing tag or close the element by putting a space and a slash at the end of the start tag.
    • Valid: <img src="button.jpg" alt="Button"></img> or <img src="button.jpg" alt="Button" />
    • Invalid: <img src="button.jpg" alt="Button">
  • Tag and attribute names must be lowercase, because XML is case-sensitive (except for the HTML !DOCTYPE tag).
    • Valid: <a href="http://www.htmlquick.com/tutorials.html">Anchor</a>
    • Invalid: <A Href="http://www.htmlquick.com/tutorials.html">Anchor</A>
  • The predefined values for some attributes must be lowercase due to the XML case-sensitivity.
    • Valid: <input type="submit" />
    • Invalid: <input type="SUBMIT" />
  • Attribute values must be enclosed in quotes (single or double). Quoting is not optional in XHTML.
    • Valid: <span id="id1" class='important'>Text</span>
    • Invalid: <span class=important>Text</span>
  • Boolean attributes cannot be abbreviated (written using only the attribute's name). You must give the attribute's name as its value.
    • Valid: <button id="button1" disabled="disabled">Execute</button>
    • Invalid: <button id="button1" disabled>Execute</button>
  • Elements must be properly nested: close them in the reverse of the order in which they were opened.
    • Valid: <span class="double"><b>Execute</b></span>
    • Invalid: <span class="double"><b>Execute</span></b>
  • Block-level elements cannot be declared as content of inline elements.
    • Valid: <div class="double"><b>Execute</b></div>
    • Invalid: <b><div class="double">Execute</div></b>
  • Some specific elements cannot be declared as content of other specific elements.
    • The "a" element must not contain other "a" elements.
    • The "pre" element must not contain the "img", "object", "big", "small", "sub" or "sup" elements.
    • The "button" element must not contain "input", "select", "textarea", "label", "button", "form", "fieldset", "iframe" or "isindex" elements.
    • The "label" element must not contain other "label" elements.
    • The "form" element must not contain other "form" elements.
  • Every ampersand must be written using its entity reference (&amp;), even in URLs.
    • Valid: <a href="buysell.php?id=1&amp;sub=2">Buy &amp; sell</a>
    • Invalid: <a href="buysell.php?id=1&sub=2">Buy & sell</a>
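  • Character references are case-sensitive because XML is: entity names must be written exactly as defined, and the "x" in a hexadecimal numeric reference must be lowercase.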
    • Valid: &#xE1; - &aacute; (for á)
    • Invalid: &#XE1; - &aAcuTe; (for á)
  • The "alt" attribute must always be present in the HTML img tag.
    • Valid: <img src="bird.jpg" alt="A bird flying"></img>
    • Invalid: <img src="bird.jpg"></img>
  • An XML parser may discard commented text completely, so wrapping scripts or style code in comments to "hide" them from old browsers can have the same effect as deleting them. In addition, any "&" or "<" character in the script or style code is interpreted by the XML parser as markup. To avoid both problems, either move the code to external files or wrap it in a CDATA block.
    • Valid:
      <style type="text/css">
      <![CDATA[
       p { color: blue; }
      ]]>
      </style>
    • Invalid:
      <style type="text/css">
      <!--
       p { color: blue; }
      -->
      </style>
  • The "name" attribute has been formally deprecated for the elements a, applet, form, frame, iframe, img, and map, and may be removed in future versions.
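
Several of the rules above are plain XML well-formedness rules, so any XML parser can catch those violations. The sketch below runs a few of the examples through Python's standard xml.etree.ElementTree. The function is_well_formed, the "wrapper" element and the run() call inside the script are assumptions invented for this illustration. Keep in mind that this is not full validation: rules that come from the DTD (lowercase names, the required "alt" attribute, forbidden nesting) are not detected this way and need a validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a dummy root."""
    try:
        # "wrapper" is a hypothetical element used only to give the
        # fragment a single root; it is not part of XHTML.
        ET.fromstring("<wrapper>" + fragment + "</wrapper>")
        return True
    except ET.ParseError:
        return False


# (fragment, expected result) pairs based on the rules above.
CASES = [
    ('<p>Paragraph</p>', True),
    ('<p>Paragraph', False),                                   # unclosed tag
    ('<img src="button.jpg" alt="Button" />', True),
    ('<img src="button.jpg" alt="Button">', False),            # unclosed empty tag
    ("<span id=\"id1\" class='important'>Text</span>", True),
    ('<span class=important>Text</span>', False),              # unquoted value
    ('<button id="button1" disabled="disabled">Execute</button>', True),
    ('<button id="button1" disabled>Execute</button>', False),  # minimized attribute
    ('<span class="double"><b>Execute</b></span>', True),
    ('<span class="double"><b>Execute</span></b>', False),     # bad nesting
    ('<a href="buysell.php?id=1&amp;sub=2">Buy &amp; sell</a>', True),
    ('<a href="buysell.php?id=1&sub=2">Buy & sell</a>', False),  # bare ampersand
    ('<script type="text/javascript"><![CDATA[ if (a < b && c) { run(); } ]]></script>', True),
    ('<script type="text/javascript"> if (a < b && c) { run(); } </script>', False),
    # Well-formed XML, but still invalid XHTML: uppercase names are only
    # caught by validating against the DTD.
    ('<A Href="http://www.htmlquick.com/tutorials.html">Anchor</A>', True),
]

for fragment, expected in CASES:
    print(is_well_formed(fragment), expected, fragment)
```

The last case is the important caveat: a document can pass this check and still violate the XHTML standard.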

XHTML rules for Strict DTDs

In addition to the rules above, strict XHTML documents (XHTML 1.0 Strict and XHTML 1.1) must also follow these rules.

  • Text must not be defined directly in the body of a document (HTML body tag). Instead insert it into a paragraph, div block, or other element.
    • Valid: <body><p>Text</p></body>
    • Invalid: <body>Text</body>
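
This rule can be checked approximately with a few lines of Python. The sketch below uses the standard xml.etree.ElementTree module and looks for non-whitespace text placed directly inside the body. The function name body_has_bare_text and the two sample pages are assumptions for this example, and the check covers only this one rule, not the whole Strict DTD.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"


def body_has_bare_text(document: str) -> bool:
    """Return True if non-whitespace text sits directly inside <body>."""
    root = ET.fromstring(document)
    body = root.find(XHTML + "body")
    if body is None:
        raise ValueError("no body element found")
    # Text directly inside body is either body.text (before the first
    # child) or the tail that follows each child element.
    chunks = [body.text] + [child.tail for child in body]
    return any(chunk and chunk.strip() for chunk in chunks)


# Hypothetical sample pages, identical except for the body content.
HEAD = '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Sample</title></head>'
wrapped = HEAD + "<body><p>Text</p></body></html>"
bare = HEAD + "<body>Text</body></html>"

print(body_has_bare_text(wrapped))
print(body_has_bare_text(bare))
```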

