Transitioning to XHTML Strict in 10 Steps
Author: Corrie Haffly, PixelMill

Transitioning to XHTML is the easiest step to understand, because it's a matter of simply following rules. When you run a site through the W3C HTML Validator, the validator doesn't care whether the site uses a tables-based or CSS-based layout, or whether your site is accessible. As long as the actual HTML code follows specific rules, your site will validate. You'll also be happy to know that you can transition to XHTML using FrontPage 2003 as well as Expression Web.

First, you should be aware that there are two "flavors" of XHTML: XHTML Strict, which is what you'll be learning here, and XHTML Transitional. XHTML Transitional is much looser than Strict and allows you to have older types of HTML code. But there are many good reasons for following the Strict rules. For example, updating your site design will take less time in the future, and your site will be more compatible with future improvements in technology.

Testing Your Website
To test your website and see how many rules it breaks, go to http://validator.w3.org/, type the URL of one of your web pages, and then click Validate. When you get to the report, you may need to revalidate after selecting the XHTML 1.0 Strict Doctype. Within Expression Web, you can also go to Tools > Compatibility Reports. Click the green arrow in the Compatibility Checker task pane, select XHTML 1.0 Strict, and then click the Check button. There may be a lot of errors listed! Don't worry: once you go through these next steps, you'll have (mostly) squeaky-clean XHTML.

10 Steps to an XHTML Strict Website
- Define a DOCTYPE
The first step is to make sure that you have the proper DOCTYPE defined. The DOCTYPE is a line of code that goes at the very top of your HTML code. It tells the browser which set of rules to follow.
- If your page does not already have a DOCTYPE, copy and paste this code into the very first line of your HTML code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- In Expression Web, you can set the program so that all new pages use the XHTML Strict DOCTYPE. Go to Tools > Page Editor Options. Click the Authoring tab and set the Document Type Declaration to XHTML 1.0 Strict in the drop-down list box.
- Also in Expression Web, you can replace an old DOCTYPE by highlighting it (or, if the page has none, by placing the cursor at the very top of the code) and pressing CTRL+ENTER to access the Code Snippets dialog box. Click dtx1s, and the correct DOCTYPE code will automatically be added.
- Nest Tags Properly
Another key rule is that your tags have to be properly "nested." For example, if you have text that is both bold (<b>...</b>) and italicized (<i>...</i>), you need to make sure that one set of tags contains the other, like layers of an onion. You can have either <b><i>bold on the outside</i></b> or <i><b>italic on the outside</b></i>, but you can't have <b><i>confused and improper nesting</b></i>.
- If you use FrontPage's formatting tools, this will most likely not be an issue.
- You only have to worry about this if you hand-code a lot and are in the habit of improperly nesting tags. You will have to run through your code and fix those tags. Unfortunately, the validator isn't smart enough to tell you when tags aren't nested properly; the error that is generated looks exactly the same as the one discussed in the next step.
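Because the validator's message for bad nesting is easy to misread, a strict XML parser can serve as a second opinion on a small piece of markup. The following is a minimal sketch in Python, not part of FrontPage or Expression Web; the function name nesting_problem is made up for illustration. It assumes the fragment contains no HTML-only entities such as &nbsp;, which an XML parser would also reject.

```python
import xml.etree.ElementTree as ET


def nesting_problem(fragment):
    """Return None if the fragment is well-formed, else the parser's message."""
    try:
        # Wrap in a dummy root so a fragment with several top-level tags parses.
        ET.fromstring("<root>" + fragment + "</root>")
    except ET.ParseError as err:
        return str(err)
    return None


# Properly nested: prints None.
print(nesting_problem("<b><i>bold on the outside</i></b>"))
# Improperly nested: prints the parser's complaint about the closing tag.
print(nesting_problem("<b><i>confused and improper nesting</b></i>"))
```

The parser reports the line and column where the unexpected closing tag appears, which is usually close to the tag pair that needs to be swapped.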
- Close HTML Tags
Next, you'll want to make sure that all of your HTML tags are properly "closed." There are two types of tags: ones that have both an opening and a closing tag with content in between (such as <p>…</p>, <h1>…</h1>, <b>…</b>) and ones that are "empty," without a closing tag (such as <img ...>, <br>, and <hr>). The rule for XHTML is that every opening tag must have a matching closing tag, and empty tags need a slash before the last bracket, like this: <img ... />. For example, the line-break tag should be written as <br /> instead of simply <br>.
- FrontPage has a bad habit of leaving the closing tag off sometimes, so you might find floating <p> tags without the </p> or floating <li> tags without the </li>.
- FrontPage also doesn't automatically add the closing slash to empty elements. Expression Web will add the closing slash if you add objects and elements from the task panes, but if you are used to hard-coding HTML without that closing slash, Expression Web won't add it for you.
- In the HTML Validator, this type of error is described like this:
end tag for "element name" omitted, but OMITTAG NO was specified.
- The great news is that FrontPage 2003 and Expression Web will fix these errors for you relatively easily. In the Code task pane, right-click, and then click Apply XML Formatting Rules. Those closing slashes and tags will be added for you! Run your page through the validator again and all of those errors should disappear.
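For pages edited outside FrontPage or Expression Web, the closing slash on empty tags can also be added with a short script. This is a rough Python sketch, not a replacement for the Apply XML Formatting Rules command. The helper name close_empty_tags is hypothetical, it handles only the three empty tags named in this step, and it assumes no attribute value contains a > character.

```python
import re

# Hypothetical helper pattern: covers only the empty tags named in this step.
EMPTY_TAG = re.compile(r"<(br|hr|img)\b([^>]*?)\s*/?>", re.IGNORECASE)


def close_empty_tags(html):
    """Rewrite <br>, <hr>, and <img ...> so that each ends with ' />'."""
    return EMPTY_TAG.sub(r"<\1\2 />", html)


print(close_empty_tags('Line one<br>Line two<img src="image.jpg" alt="Photo">'))
```

Tags that already end in a slash are left in the same form, so the function can be run on a page more than once. Missing closing tags such as </p> or </li> are not repaired by this sketch.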
- Write Code in Lowercase
Another XHTML rule is that all HTML tag names and attribute names have to be lowercase (<p>, not <P>), and attribute values must be enclosed in quotation marks. An attribute is a property that is coded within the tag. For example, this img tag has the src, width, height, and alt attributes defined:
<img src="image.jpg" width="300" height="200" alt="Photo" />
Both the tag name (img) and the attribute names (src, width, height, and alt) must be lowercase, and each attribute value ("image.jpg", "300", and so on) must be surrounded with quotation marks.
- Old FrontPage code is sometimes uppercase. In FrontPage 2003, you can make sure that the default code formatting is set properly by going to Tools > Page Options, clicking the Code Formatting tab, and checking "Tag names are lowercase" and "Attribute names are lowercase."
- In the HTML Validator, this type of error is described like this:
element "UPPERCASE ELEMENT NAME" undefined or attribute "UPPERCASE ATTRIBUTE NAME" undefined
- Option 1: Go through your site by hand and fix the tags and attributes to be lowercase.
- Option 2: Let FrontPage or Expression Web reformat the HTML for you. First make sure the default code formatting is set properly (see the first bullet point). Then, right-click in the HTML and choose Reformat HTML.
- Why would you ever choose Option 1? The Reformat HTML command doesn't just make uppercase tags and attributes lowercase; it also changes all of the indents and carriage returns to fit FrontPage's or Expression Web's idea of what organized code should look like. If you have a specific way of formatting your HTML code, choosing Option 2 could mess everything up! If you care about how your HTML code is formatted, choose Option 1. If you don't care, Option 2 is certainly less time-consuming.
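If you go the by-hand route, it helps to know where the uppercase tags are before you start. The following Python sketch only reports tag names that are not lowercase and leaves the file and its formatting untouched. The function name uppercase_tags is made up for illustration, and the sketch does not look at attribute names or at markup hidden inside comments or scripts.

```python
import re

# Matches the name in an opening or closing tag, e.g. "P" in <P> or </P>.
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def uppercase_tags(html):
    """Return, in order of appearance, the tag names that are not all lowercase."""
    return [name for name in TAG_NAME.findall(html) if name != name.lower()]


print(uppercase_tags("<P>Hello <b>world</b></P>"))
```

An empty list means that no uppercase tag names were found in the text that was passed in.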
- Don't Use FrontPage "bots"
Specifically, FrontPage link bars, FrontPage DHTML scripting, FrontPage forms processed by FrontPage Server Extensions, FrontPage hit counters, FrontPage Photo Galleries, and many other FrontPage components will not validate as XHTML Strict. (See the last article in this series for more information about alternatives to using FrontPage bots.)
- Add Image Descriptions
A very beneficial XHTML rule is that every image needs to have the alt description defined. This is usually a short phrase that describes the picture for viewers who have images turned off and for blind visitors who use a screen reader to navigate web pages. It can also help with search engine relevancy, as it allows your picture to be part of your content.
- In FrontPage, double-click on the picture to bring up the Picture Properties dialog box. Click the General tab and enter a short text description in the Alternative Representation > Text field.
- In Expression Web, double-click on the picture to bring up the Picture Properties dialog box. Enter a short text description in the Alternate Text box.
- You can hard-code a description in the HTML by adding the alt attribute inside an image tag:
<img src="image.jpg" width="200" height="100" alt="Alternative description goes here" />
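On a page with many pictures, the images that still lack a description can be listed with Python's standard html.parser module. This is a minimal sketch; the class name MissingAltFinder and the function images_missing_alt are invented for this example and are not part of any of the tools discussed here.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src"))


def images_missing_alt(html):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


page = '<p><img src="logo.jpg" alt="Company logo" /><img src="photo.jpg" /></p>'
print(images_missing_alt(page))
```

The sketch only checks that the attribute is present. Whether the description is actually useful to a visitor still needs a human eye.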
- There are certain tags that are not allowed to be in other tags.
- There are some tags that are called "block elements." These typically reserve a whole "row" for their content. For example, headings and paragraphs typically have some space before and after them. Some frequently used block elements include the headings (h1, h2, h3, h4, h5, h6), paragraphs (p), horizontal rules (hr), and block quotes (blockquote). Other tags are called "inline elements," which means that they show up in the same line as the content around them. For example, bold text doesn't create its own paragraph spacing above and below it, but shows up in the context of the rest of the text. Some inline elements include bold and strong text (b, strong), italicized and emphasized text (i, em), images (img), and links (a). The basic rule to remember here is that you cannot have block elements inside of inline elements. For example, you can't have <b><p>Bold paragraph</p></b>; you have to have <p><b>Bold paragraph</b></p>.
- There are also some common-sense combinations that can't be nested. For example, a link element (<a href="...">Link</a>) cannot contain another link element; it's impossible to have a "link inside a link." And a form element can't contain another form inside of it.
- Finally, you can't have inline elements or bare text directly in the main <body> of a page. So, you can't have:
<body>Text in a page</body>
or
<body><img src="photo.jpg" ... /></body>
Inline elements instead have to be inside a block-level element. Here are several perfectly valid possibilities:
<body><p>Text in a paragraph</p></body>
<body><div>Text in a div</div></body>
<body><h1>Text in a heading</h1></body>
- Use Character Encoding for Ampersands and Symbols
Ampersands have to be represented by &amp; in XHTML Strict. This includes ampersands that you want to show up on a page and ampersands inside of hyperlink URLs. It does not include ampersands that begin the codes for other "special characters," such as the &copy; code that displays the copyright symbol, or ampersands that are within an area of script.
- If you have already run the Apply XML Formatting Rules command, this problem has most likely already been fixed!
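The rule above (encode a bare ampersand, but leave codes such as &copy; alone) can be expressed as a single pattern. The following Python sketch is one way to do it under simple assumptions; the function name escape_ampersands is hypothetical, and the sketch does not know about script areas, so it should only be run on ordinary text and links.

```python
import re

# An ampersand that does NOT start a code such as &copy;, &#169;, or &#xA9;.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_ampersands(text):
    """Replace every bare ampersand with &amp; and leave existing codes alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


print(escape_ampersands("Tom & Jerry"))
print(escape_ampersands('<a href="page.asp?id=1&cat=2">Link</a>'))
print(escape_ampersands("&copy; 2008 Tom &amp; Jerry"))
```

Because ampersands that are already encoded are skipped, running the function twice gives the same result as running it once.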
- This next rule has to do with blocks of scripting or styles. If you have blocks of style sheet info or JavaScript in your site, the "old school" way was to use the comment tags (<!-- and -->) like this:
<style type="text/css">
<!--
stylesheet code went here
-->
</style>
<script type="text/javascript">
<!--
scripting went here
-->
</script>
In XHTML Strict, however, you should replace the comment tags with this instead:
<style type="text/css">
<![CDATA[
stylesheet code goes here
]]>
</style>
<script type="text/javascript">
<![CDATA[
scripting goes here
]]>
</script>
This may not work in certain browsers, however, because a browser that treats the page as ordinary HTML hands the CDATA markers to the style or script engine as if they were part of the code. Test your site thoroughly. It may be worth it to use only external style sheets and external JavaScript files (.css and .js files) to avoid the issue altogether.
- Here comes the most difficult part of XHTML 1.0 Strict compliance, and the big reason why people with older sites might choose to throw them out the window and start over from scratch (or with a template that is already XHTML-compliant). Basically, there are several attributes from old HTML that simply can't be used in XHTML Strict. Fixing these issues is sometimes easy, but most of the time you will need to integrate a style sheet in order to resolve the issue. Below, I've listed some of the more common issues that you may run into and a few ideas for how to fix them.
- Error: there is no attribute "language" or required attribute "type" not specified
This affects JavaScript tags.
- Wrong:
<script language="javascript">
- Right:
<script type="text/javascript">
- Error: there is no attribute "topmargin" (or "leftmargin", "marginwidth", "marginheight")
This is often found in older sites in the <body> tag.
- Wrong:
<body topmargin="0" marginwidth="0" marginheight="0" leftmargin="0">
- Right:
HTML code:
<body>
Style sheet code:
body { margin: 0px; padding: 0px; }
- Error: there is no attribute "border"
This affects the <img> tag.
- Wrong:
<img src="picture.jpg" border="0" width="10" height="10" alt="Picture" />
- Right:
HTML code:
<img src="picture.jpg" width="10" height="10" alt="Picture" />
Style sheet code:
img { border: 0px; }
- Error: there is no attribute "hspace" (or "vspace")
This also affects the <img> tag.
- Wrong:
<img src="picture.jpg" hspace="5" width="10" height="10" alt="Picture" />
- Right:
In the style sheet, you can create a class that adds margins, and then apply it to the image. For example, instead of using the hspace attribute, the style sheet can define a class with a margin to the left and right of the image:
.imagewithmargin { margin-left: 5px; margin-right: 5px; }
The HTML code can then apply the special class to the image:
<img src="picture.jpg" class="imagewithmargin" width="10" height="10" alt="Picture" />
- Error: there is no attribute "align"
The align attribute has in the past been used for paragraphs, headings, tables, images, and divs.
- Wrong:
<p align="right"> or <img align="left" ... />
- Right:
Create custom classes in the style sheet for text or image alignment and apply the classes in the HTML as needed. Below is the CSS code:
.textright { text-align: right; }
This aligns text to the right.
.floatleft { float: left; }
If applied to an image, the image will "stick" to the left and text will wrap around it.
The HTML code could then look like this instead:
<p class="textright">
<img class="floatleft" ... />
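- Error: there is no attribute "name"
This error usually points at a <form> tag. In XHTML Strict, the name attribute has been phased out on the form tag itself, and the id attribute should be used instead. (The fields inside the form, such as <input>, <select>, and <textarea>, still keep their name attributes.) This may require some modification of your JavaScript as well, if it refers to the form by name.
- Wrong:
<form name="loginform">
- Right:
<form id="loginform">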
- Error: there is no attribute "height" (or "width" or "background")
This error mainly applies to tables and table cells. Tables can have widths, but cannot have heights. Table cells can't have either widths or heights. Neither can use the background attribute to define a background image. (The bgcolor attribute belongs to XHTML Transitional only, so in Strict a background color has to come from a style as well.) However, you can use an inline style to set widths, heights, and background images of table cells. It is definitely not recommended to ever set the height for a table.
- Wrong:
<table width="100" height="200"> <tr> <td width="100" height="100" background="image.gif">Table cell</td> </tr> </table>
- Right:
<table width="100"> <tr> <td style="width: 100px; height: 100px; background: url('image.gif');">Table cell</td> </tr> </table>
- Recommended: Use style sheets and custom classes to format background colors, background images, and other attributes of tables and table cells.
Right now, it's okay to use cellpadding, cellspacing, and border to format tables. However, you can do all the same things using style sheets, so you might as well learn how to do it now so that it's easier to make modifications in the future. Here's how:
- Instead of:
<table border="1" bordercolor="#ff0000" cellspacing="0" cellpadding="3">
- Do this:
Add these custom classes in the style sheet:
.redborder { border: solid 1px #ff0000; border-spacing: 0px; border-collapse: collapse; }
This replaces cellspacing and borders.
.redborder td { padding: 3px; }
This replaces cellpadding.
Then in the HTML code, just add a class to the table:
<table class="redborder">
Doesn't that look cleaner?
- Error: there is no element "font"
The <font> tag, which has often been used to define text colors, fonts, and more, is officially Out. All text formatting should be done using a style sheet. Below is one example; how to create style sheets is covered in a separate article.
- Instead of:
<p><font color="#ff0000">This is red text!</font> This is not.</p>
- Do this:
Style sheet code has a custom class:
.red { color: #ff0000; }
HTML code references the class:
<p><span class="red">This is red text!</span> This is not.</p>
- This, by the way, means that you shouldn't use FrontPage's font-formatting toolbar for picking fonts, font sizes, or colors, as FrontPage 2003 and earlier versions will automatically generate <font> tags. Instead, use CSS and create a custom class with the properties that you're looking for. (Expression Web will create a custom class for you when you use the formatting toolbar, so that's okay!)
- This article doesn't cover every single error or problem that you may encounter, but you've certainly learned the basics and most frequent issues when moving to XHTML Strict!
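The attribute errors described in step 10 can also be gathered in one pass before you start fixing them. The Python sketch below checks only the tags and attributes mentioned in this article; it is not a substitute for the W3C validator, and the names FLAGGED, StrictAttributeScanner, and scan are made up for this example.

```python
from html.parser import HTMLParser

# Attributes this article flags in step 10; NOT a complete list for XHTML Strict.
FLAGGED = {
    "script": {"language"},
    "body": {"topmargin", "leftmargin", "marginwidth", "marginheight"},
    "img": {"border", "hspace", "vspace", "align"},
    "p": {"align"},
    "form": {"name"},
    "table": {"height", "background"},
    "td": {"width", "height", "background"},
}


class StrictAttributeScanner(HTMLParser):
    """Record (tag, attribute) pairs from FLAGGED, plus any <font> tag."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            # The whole element is disallowed, so no attribute is recorded.
            self.problems.append((tag, None))
        for name, _value in attrs:
            if name in FLAGGED.get(tag, set()):
                self.problems.append((tag, name))


def scan(html):
    scanner = StrictAttributeScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.problems


page = '<body topmargin="0"><img src="a.jpg" border="0" alt="A" /></body>'
print(scan(page))
```

Each entry in the result names the tag and the attribute to move into a style sheet, in the order they appear on the page.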
Good News for the Future
The good news is that Expression Web will generally help you create XHTML Strict code, so it's the fixing of old sites that can be time-consuming. If you accidentally leave off a closing tag, Expression Web will highlight code so that you realize that something is wrong. (Unfortunately it isn't always smart enough to highlight the right code, but the problem will usually be nearby.) Also, if you use an attribute that doesn't exist anymore in XHTML Strict, Expression Web will put a squiggly red underline beneath it, similar to how Microsoft Word points out spelling problems. PixelMill has many templates that have been pre-validated to be XHTML 1.0 Strict (at least the base layout part!). Applying what you've learned from this article should help you to keep them validated! In the next article, I'll explain what it means to have semantic markup, and why it's a good thing.

Next: Transitioning to Semantic Markup

Articles in this series
- Transitioning to XHTML Strict in 10 Steps [this article]
- Transitioning to Semantic Markup
- Transitioning to CSS
- Learning CSS
- Transitioning to Accessibility
- Transitioning to Expression Web