Semantic HTML

This topic explains how to use HTML that is semantically correct to develop accessible web applications.

  • Introduction
  • Roles
    • Links
    • Forms
    • Lists
    • Tables
    • Headings
  • Accessible Names
    • Tables
    • Forms
    • Images
    • Links
  • Conclusion

Introduction

As discussed in the Introduction to Web Accessibility topic, accessibility for web applications depends on using the correct element for the job. When you develop applications for Windows, you can write to the API to set the accessibility information on a control. The HTML developer has no such option: where the deployed browsers and assistive technology do not support Accessible Rich Internet Applications (ARIA), the developer is limited to the accessibility role information that is built into the element. This aspect of accessibility is read-only for the web developer.

For more information about ARIA, see Introduction to ARIA. For more information about how to work with a variety of browsers and assistive technology in your organization and how to support ARIA, see Accessibility and Legacy Browsers.

Elements' accessibility roles define what they are, such as tables, links, text boxes, or any number of other elements. Without the correct role, the assistive technology user may have no idea what particular elements are, or even that, for example, a link is on the web page.

Element roles are semantic or programmatic representations of what the visual user sees. For a sighted user, blue, underlined text indicates a link and large text starting a section of a document is a heading. This semantic information tells the assistive technology user the same things that the visual indications tell a sighted user: what the application does and how it's organized. The main HTML elements used to provide proper roles are discussed later in this topic.

There is a second, crucial semantic piece of information, the accessible name (accName). While the role tells assistive technology the type of the element, the name identifies particular elements. So, a check box that has the associated text of "show details" tells the user two things: (1) that it is a check box (role), and (2) that checking it causes the application to "show details" (name). The role is read-only and determined by the HTML element, but the name must be explicitly provided by the developer. The name is often, but not always, the same as the displayed text on a control.

Roles

Each HTML element's role is predefined by its definition, so specifying the correct role depends on using the appropriate element for the task. This is especially crucial for interactive elements, such as forms and links, and for structural elements, such as lists, tables, and headings.

It's important to emphasize that it is the type of HTML element that matters, not its visual appearance. When using Cascading Style Sheets (CSS) and script, it is possible to make a generic container element, such as a span, look and act just like an HTML a href link.

But no matter how much CSS and script is written to make the span look and respond to mouse and keyboard events like a link, it's still a span, and it's reported as a span to the accessibility API and assistive technology devices. Assistive technology users will not know it's a link. Either the correct core semantic HTML element must be used, regardless of how CSS and script make an element appear and act, or the correct semantics must be added by using ARIA where it is available.

Another benefit of using semantic HTML is that the element provides the appropriate intrinsic display characteristics without any additional markup. A link specified with the a href element automatically gets the look and feel of a link, as well as its behaviors such as hover, active, and visited states, so that the necessary programming is minimized. Also, the element gets the correct tab stop and tab order with no additional work required.

Semantic HTML lends itself much better to automation, such as for testing. For example, if all subheadings in the document are specified by using h2 and h3 elements, the developer just has to query the document for these elements or tag names—such as with document.getElementsByTagName("h2")—to get nothing other than subheadings.
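As a rough sketch of this kind of automation, the following page builds an outline of its own subheadings. The page content, the outline id, and the listSubheadings function name are assumptions made for illustration only.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Heading Query Sketch</title>
</head>
<body>
  <h1>Widget Project Proposal</h1>
  <h2>Project Summary</h2>
  <p>Summary text.</p>
  <h2>Delivery Schedule</h2>
  <p>Schedule text.</p>

  <!-- Hypothetical container that receives the collected heading text -->
  <ul id="outline"></ul>

  <script>
    // Because subheadings are marked up as h2 elements, a tag name query
    // returns the subheadings and nothing else.
    function listSubheadings() {
      var headings = document.getElementsByTagName("h2");
      var outline = document.getElementById("outline");
      for (var i = 0; i < headings.length; i++) {
        var item = document.createElement("li");
        item.appendChild(document.createTextNode(headings[i].textContent));
        outline.appendChild(item);
      }
    }
    listSubheadings();
  </script>
</body>
</html>
```

If the subheadings were instead styled div or span elements, the same query would need to rely on class names or other conventions that are easy to get out of sync with the content.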

Finally, using semantic HTML usually results in simpler HTML and smaller file sizes. Semantic elements need fewer CSS class names to tell them apart, and fewer CSS rules and properties, because the intrinsic display of the element provides most of the required functionality.

Let's now look at role information for the most important elements.

Links

A common interactive element in web applications, the link provides users with the ability to navigate from one page to another by clicking text or an image. Links must be specified by using the a href element, and not any other, such as span.

Figure 1: A link, and a span styled and scripted to look and behave like a link

Although the two "links" look and behave the same way for a mouse user, the second is not actually a link and will not work for a keyboard or screen reader user.

<p><a href="#" onclick="OnLinkClick()">I am a link</a></p>
<p><span class="link" tabindex="0" onclick="OnLinkClick()">I am a span</span></p>

For a working sample, see the Links Sample.
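Where a real link cannot be used and ARIA is supported, the span might be given link semantics as sketched below. This is one possible approach, not the method used in the Links Sample; the Details.htm target and the goToDetails and onSpanKeydown function names are hypothetical.

```html
<!-- Preferred: a real link. Role, tab stop, and keyboard activation are built in. -->
<p><a href="Details.htm">Show details</a></p>

<!-- Fallback sketch: role="link" reports the span as a link, tabindex="0"
     adds a tab stop, and the keydown handler activates it with Enter. -->
<p>
  <span class="link" role="link" tabindex="0"
    onclick="goToDetails()" onkeydown="onSpanKeydown(event)">Show details</span>
</p>

<script>
  // Hypothetical navigation target for the scripted span.
  function goToDetails() {
    window.location.href = "Details.htm";
  }

  // Links are activated with the Enter key, so mirror that for the span.
  function onSpanKeydown(e) {
    if (e.key === "Enter") {
      goToDetails();
    }
  }
</script>
```

Even in this small case, the span needs three extra attributes and a script to approximate what the a href element provides on its own.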

Forms

HTML forms are ubiquitous across web applications and are the primary means by which users interact with the application data. They range from a single text box and Submit button, to complex data entry forms with drop-down lists, check boxes, radio buttons, and multi-line text boxes.

Using semantic HTML for forms is crucial for allowing assistive technology users to interact with web applications. Each of the form elements has important accessibility information that is reported to assistive technology devices, and the form should always be specified with the form element. This wraps the form's elements in a single unit, which itself has intrinsic submission and keyboard behavior.

Single-line text boxes should be input type="text", and multi-line text boxes should be textarea. Radio buttons are input type="radio", and check boxes are input type="checkbox". Drop-down lists use the select element.

Forms should also always have a Submit button (input type="submit"). A Reset button (input type="reset") is often useful for setting the form element values back to their default state. Finally, there is a grouping element for forms, fieldset, which may be used to group related elements, such as a set of radio buttons.

<form>
<div>
  <label for="urg">Urgency</label>
  <select id="urg"><option>Not Urgent</option><option>Important</option><option>Critical</option></select>
</div>
<div>
  <input type="checkbox" id="spell" onclick="ToggleTest(this);" /><label for="spell">Spell Checker?</label>
</div>
<fieldset>
  <legend>Color Theme</legend>
  <input type="radio" name="theme" id="theme_red" checked="checked" /><label for="theme_red">Red</label>
  <input type="radio" name="theme" id="theme_white" /><label for="theme_white">White</label>
  <input type="radio" name="theme" id="theme_blue" /><label for="theme_blue">Blue</label>
</fieldset>
<div><label for="special">Special Instructions</label><input type="text" id="special"/></div>
<div><input type="submit" value="Submit" /> <input type="reset" /></div>
</form>

For a working sample, see the Forms Sample.

Lists

Lists are fairly common and quite useful HTML groupings. While one might usually think of lists in terms of bulleted lists or numbered lists, the most common definition of a list from a semantic perspective is any set of grouped, hierarchically-related pieces of information. With this definition, a toolbar may be a list of links, a menu can be a list that has (usually single) sublists, and a tree is a list that has any number and depth of sublists.

For a numbered list, use the ol (ordered list) element. For unnumbered lists, the ul (unordered list) element provides the appropriate role, as it both groups all the related elements together semantically, and also reports information about the depth and relative position of any item in that list.

Many assistive technology devices also provide special list navigation features that allow users to quickly move around a list, while keeping track of where they are, just as visual users would.
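The two list types, and the way nesting conveys depth, might look like the following sketch. The list content is invented for illustration.

```html
<!-- Ordered list: the sequence of the steps is meaningful. -->
<ol>
  <li>Open the file.</li>
  <li>Edit the text.</li>
  <li>Save your changes.</li>
</ol>

<!-- Unordered list with one sublist: nesting a ul inside an li
     expresses the hierarchy (depth) of the items. -->
<ul>
  <li>Fruit
    <ul>
      <li>Apples</li>
      <li>Pears</li>
    </ul>
  </li>
  <li>Vegetables</li>
</ul>
```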

HTML

The HTML for the toolbar introduces the model used in all list-based examples: a list element with a few child list items, each containing a link. This is a sound semantic approach to all flat or nested hierarchical groups of links, be they in a single list (toolbar), in popup child lists (menus), or in expandable lists (trees).

The toolbar uses one WAI-ARIA role and one attribute. The toolbar role identifies the ul element as a toolbar to assistive technology, and the aria-label attribute gives the toolbar an accessible name ("Navigation Bar", in this case) for assistive technology devices.

<ul class="toolbar" role="toolbar" aria-label="Navigation Bar"
  onkeydown="OnToolbarKeydown(event);" onclick="OnToolbarClick(event);">
  <li class="first"><a href="File.htm">File</a></li>
  <li><a href="View.htm">View</a></li>
  <li><a href="Settings.htm">Settings</a></li>
</ul>

CSS

The CSS for the toolbar turns off the list element's padding, margin, and list-style so that there are no bullets. The list item elements are floated left so that they appear horizontally rather than in the default vertical list layout. The display property on the links is set to "block" so that the entire toolbar item is clickable.

ul.toolbar {
  margin: 0;
  padding: 0;
  list-style-type: none;
}
ul.toolbar li {
  float: left;
  border: 1px solid;
  border-left: none;
  background-color: menu;
  color: menutext;
}
ul.toolbar li.first {
  border-left: 1px solid gray;
}
ul.toolbar li a {
  color: black;
  text-decoration: none;
  height: 100%;
  padding: 0.2em 2em 0.2em 0.5em;
  width: 8em;
  display: block;
}
ul.toolbar a:hover {
  text-decoration: underline;
}
ul.toolbar a:focus {
  color: highlighttext;
  background-color: highlight;
}

For a working sample, see the Tool Bar Sample.

The Accessibility and Legacy Browsers article includes additional examples of lists used for grouping.

Tables

Tables differ from lists in that there are multiple columns, and the data in each column is of the same data type and description. For lists of data with no column format information, an HTML ol or ul list should be used.

When there is column format data, the table element should be used with headings, whether column or row, identified using the th element. Table cells should be identified using the td element. Many assistive technology devices provide special table navigation functionality, allowing users to move about the table cells, with header information available for every cell.

<table>
<caption>Employee Information</caption>
<thead>
  <tr><th>Name</th><th>Hire Date</th></tr>
</thead>
<tbody>
  <tr><td>John Smith</td><td>09 Nov 2007</td></tr>
  <tr><td>Jane Doe</td><td>10 Nov 2007</td></tr>
</tbody>
</table>

For a working sample, see the Tables Sample.
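When a table has both column and row headers, the standard scope attribute on th can state which cells each header applies to. The following is a minimal sketch with invented data, not part of the Tables Sample.

```html
<table>
<caption>Quarterly Sales by Region</caption>
<thead>
  <!-- scope="col": each header names the cells in its column. -->
  <tr><th scope="col">Region</th><th scope="col">Q1</th><th scope="col">Q2</th></tr>
</thead>
<tbody>
  <!-- scope="row": the first cell of each row names the cells in that row. -->
  <tr><th scope="row">North</th><td>120</td><td>135</td></tr>
  <tr><th scope="row">South</th><td>98</td><td>110</td></tr>
</tbody>
</table>
```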

Headings

Headings demarcate and identify sections of a document or application. For the sighted user, headings are a critical means of quickly scanning a document's main sections, perhaps locating a specific section of interest, and jumping to it quickly, without having to read the entire document.

HTML provides the h1 to h6 elements, each number indicating a level of hierarchy. Just as visual users have visual cues such as font size and color to identify headings, assistive technology users rely on the semantic HTML to do the same thing.

Many assistive technology devices provide a list of navigable headings, in either a tree or flat format, so that users can quickly scan the document or application and jump to a section. Semantic headings are so useful that many search engines use them to list results.

For a working sample, see the Headings Sample.

HTML

<h1 role="heading" aria-level="1">Widget Project Proposal</h1>
<h2 role="heading" aria-level="2">Contoso Corporation</h2>
<h3 role="heading" aria-level="3">Project Summary</h3>
<h3 role="heading" aria-level="3">Delivery Schedule</h3>

CSS

div.demo h1 {
  background-color: silver;  
  color: gray;
  padding: 0.3em;
}
div.demo h2 {
  text-align: right;
  color: silver;
  background-color: gray;
  padding: 0.3em;
}
div.demo h3 {
  color: black;
  text-transform: uppercase;
  background-color: silver;
  padding: 0.3em;
}

Accessible Names

As mentioned previously, unlike the role, the accessible name is explicitly specified by the developer. While the role identifies what the element does, the name specifies which individual element is being referenced.

Missing or inappropriate names are one of the biggest sources of accessibility problems. A classic example is a missing name on an image link (a link composed of just an image, no text).

An image with no name provides no meaningful information about the image to assistive technology. Although, for example, the image may visually indicate "Purchase", in the absence of a name, that information is not available to assistive technology users. All they see is a "blank" link (or, in the case of some assistive technology that attempts to provide some information when the name is missing, a file path).
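The difference can be sketched as follows. The Purchase.htm target and the image path are hypothetical.

```html
<!-- Missing name: assistive technology has, at best, the file path to report. -->
<a href="Purchase.htm"><img src="images/cart.png" /></a>

<!-- Named: the alt text becomes the accessible name of the image link. -->
<a href="Purchase.htm"><img src="images/cart.png" alt="Purchase" /></a>
```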

The lack of a name on any element, especially any navigable element, should be considered a top-priority issue.

Before getting into some details, let's summarize how various elements have their name set. Text elements, such as links and headings, just take the inner text of the element itself. Table elements use the caption child element to set the name for the table, and table headers (th) get their name from the inner text.

For images, the alt attribute on the image sets the name. Form elements fall into three groups. Buttons get their name from the value attribute (input type="button") or from the inner text (button). The fieldset element uses its legend child element. All other form elements, such as check boxes, radio buttons, and text boxes, require a label element that is associated with the form element by the for attribute of the label.

The following sections provide some more specific guidelines for setting the name on a number of elements, along with a few other tips along the way.
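As a small illustration of the two button cases summarized above, both buttons in the following sketch should be exposed with the name "Show Details". The details region and the showDetails function are hypothetical and included only so that the fragment is self-contained.

```html
<!-- The name "Show Details" comes from the value attribute. -->
<input type="button" value="Show Details" onclick="showDetails()" />

<!-- The name "Show Details" comes from the inner text. -->
<button type="button" onclick="showDetails()">Show Details</button>

<!-- Hypothetical region that the buttons reveal. -->
<div id="details" style="display: none;">Order details go here.</div>

<script>
  // Hypothetical handler: reveals the details region.
  function showDetails() {
    document.getElementById("details").style.display = "block";
  }
</script>
```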

Tables

Make sure to use the caption element to give the table a name. If the design does not call for a visible heading for the table (and it probably should), the caption can be positioned off-screen using CSS. Make sure that you have column (or row, if appropriate) headers, using the th element, whose inner text gives the name to all the cells in that column (or row). An assistive technology user can query this header information from any cell, just as a visual user might look up (or across) to see it.

It's also best to avoid using empty cells for padding (CSS padding on the cells is a great alternative), to avoid nested tables, and to keep related content in the same table. 

Forms

Make sure that all form elements have names, especially those requiring a label for this. Missing name information on form elements can cause confusion, and many assistive technology devices will default to using the "closest" text to provide the name, which at times gives entirely the wrong name.

As with table captions, if there is no visible text in the design for an element, then the label can be positioned off-screen using CSS. The title attribute can be used if necessary, but see the note about title later in this topic.
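One common way to position a label off-screen is sketched below. The offscreen class name, the Search.htm target, and the field names are assumptions; the exact CSS values vary between projects.

```html
<style>
  /* Hypothetical class name. Moves the text out of view without using
     display: none, which would also hide it from assistive technology. */
  .offscreen {
    position: absolute;
    left: -10000px;
    width: 1px;
    height: 1px;
    overflow: hidden;
  }
</style>

<form action="Search.htm">
  <!-- Not visible, but still associated with the text box by the for attribute. -->
  <label for="query" class="offscreen">Search this site</label>
  <input type="text" id="query" name="q" />
  <input type="submit" value="Search" />
</form>
```

The text box keeps its accessible name ("Search this site") even though sighted users see only the box and the Search button.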

Wrapping the form elements in an HTML form tag, along with providing Submit and Reset buttons, gives the form the appropriate default navigation and keyboard support. If possible, avoid using auto-postbacks (submitting a form when a value changes), because this causes problems for a number of classes of assistive technology users.

Images

As mentioned earlier, missing names for images are a common source of accessibility problems. The alt attribute is required on every image specified using the img element, and the alt text should describe the image just as it appears visually. For logo images that have pictures of text in them, the alt text should be the same text as in the image.

Where an image is not contextually important, either because it's purely decorative or because it's in a link that also has text, the image may be "hidden" from assistive technology by setting the alt attribute to an empty string ("").

Another issue with images is the use of the CSS background-image property to add images that are not really backgrounds. Windows High Contrast mode and some assistive technology products remove backgrounds, including background-image, to improve readability.

It is important that images that are part of the meaning or functionality of the content are created with img elements that have appropriate alt attributes, rather than with the CSS background-image property. Examples include images used in links and buttons, icons, figures in a document, or photographs in a newspaper article.

Links

Like other text elements, such as headings, the name for a link comes automatically from its inner text. Because many assistive technology devices present lists of links apart from the document, it's important that links have names that can stand alone and that aren't redundant (such as a list of "details" links). If design requirements are such that "duplicate" link names are called for, the title attribute can be used to disambiguate these.

Where there is a link that includes an image and text, it's important to wrap both in a single a href element to avoid redundancy (imagine having to read the same link twice). If the image is not contextually important to the link and there is text, the alt attribute can be set to the empty string.
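The following sketch contrasts the redundant markup with the single-link form. The Cart.htm target and the image path are hypothetical.

```html
<!-- Redundant: two adjacent links with the same destination and the same name. -->
<a href="Cart.htm"><img src="images/cart.png" alt="View cart" /></a>
<a href="Cart.htm">View cart</a>

<!-- Preferred: one link. The text supplies the name, so the image's alt is empty. -->
<a href="Cart.htm"><img src="images/cart.png" alt="" /> View cart</a>
```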

A Note on TITLE

There are times when employing the title attribute is useful, such as when there is no visual text for a form element that would require a label element to set its name (and positioning the label off-screen is problematic). The title can be used here, but be aware that browsers render the title as a Tool Tip, so it might end up visually obscuring part of the control in question, or being generally undesirable aesthetically.

It's also important to distinguish between alt and title, although earlier versions of Windows Internet Explorer show both as a Tool Tip. The alt attribute provides the accessible name for an image, whereas title provides additional information about an element that is not important enough to display by default. The alt on the img element should not be replaced by title.

Think of title as a Tool Tip, and as an emergency accessible name for form elements when label isn't feasible.
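The two uses of title described in this note might look like the following sketch. The image path, text, and field name are invented for illustration.

```html
<!-- alt is the accessible name of the image; title only adds optional
     information that browsers typically show as a Tool Tip. -->
<img src="images/chart.png" alt="Sales by region, 2007"
  title="Data source: annual report" />

<!-- title as a fallback name for a text box that has no label element. -->
<input type="text" name="q" title="Search this site" />
```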

Windows and Frames

Windows (parent documents) and frames get their name from the title element in the head section of the document in the window or frame. Make sure that all HTML documents have an appropriate title set.

Conclusion

The building block of web accessibility is semantic HTML: using the right element for the job. A web application that is built on good HTML is well on its way to being accessible. Conversely, an application built on inappropriate HTML cannot be made accessible. Start with the right elements, and you're building on a solid foundation.

When planning the development of your application, it is good to think semantically and hierarchically from the start. Some questions to ask include: Where do natural sections and sub-sections (and hence sub-headings) occur? Where do groups of hierarchically-related items, such as toolbars, menus, trees, and lists, appear (use list elements for these)? What names are needed for each element and sub-element?

It might be useful to think of your application as a giant tree. Think structurally, and map UI controls to semantically-appropriate HTML. In addition to providing a solid foundation for accessibility, letting the intrinsic HTML do the work for you makes it easier to create and use templates, and gives you simpler and lighter CSS, better automation support, and lighter HTML.