MIND

Cutting Edge

Extending HTML with Custom Tags

Dino Esposito

HTML is a markup language that includes a number of tags, each with a predefined behavior. Because of its limitations and the higher deployment costs associated with technologies like ActiveX® controls and Java-language applets, more and more advanced navigational features are being built with a mix of Dynamic HTML (DHTML) and "smart" images. Consequently, complex objects are now being rendered using a sophisticated mixture of simpler tags. As you can imagine, this has a detrimental effect on both reusability and maintenance. Figure 1, for example, shows a Microsoft Excel-style tab strip rendered in HTML. If you think it's cool and want to incorporate it into your own applications, think again. It makes advanced use of namespaces, script, and XML data islands. It's definitely not easy to manage.
Figure 1 Microsoft Excel-style Tab Strip

      To improve the readability and manageability of complex HTML pages, let's take a look at the various approaches you can employ if you target Microsoft Internet Explorer 5.0.

Web Componentization

      XML and HTML are complementary technologies; XML is normally used for data services, while HTML is used for display and layout. The big difference between XML and HTML is that HTML has a predefined and fixed vocabulary of tags, each with a precise meaning to the browser. The <table> tag, for example, always defines a collection of items displayed within a rectangular grid.
      To add special functions to HTML pages you have two basic options: scriptlets written in DHTML or behaviors. Scriptlets have a larger browser audience since Internet Explorer 4.0 supports them (behaviors were introduced with Internet Explorer 5.0). Scriptlets are overlaid pages that remain a separate entity from the main HTML document. They don't display as part of the page and have difficulty inheriting all the settings of the host document. You should think of them as a sort of lightweight ActiveX control when you need a self-contained object to plug into your main page.
      While scriptlets could be used to provide improved versions of standard tags, they're a poor solution since they force you to change your programming style and adopt unnatural approaches. For example, if you want an image element that automatically switches to a different picture when the mouse passes over it, you need to treat it as an object with properties and methods. You have to replace the familiar <img> tag with an <object> tag and you still need to aggregate the <img> to the new element. Furthermore, you have to invoke a method to control its visibility. Therefore, scriptlets can be a rather inelegant approach to building better HTML.
      Internet Explorer 5.0 behaviors, on the other hand, were introduced to help customize and subclass the behavior of tags you're already familiar with. They were designed precisely to work around the scriptlet limitations I just mentioned. (See http://msdn.microsoft.com/library/officedev/odeopg/deovrunderstandingscriptletsbehaviors.htm for more information about DHTML scriptlets and behaviors.)
      By definition, the behavior is the code that Internet Explorer runs when it processes an HTML tag. The notion of behavior applies to all the tags in an HTML file, whether they're standard or custom tags. Internet Explorer 5.0 assigns a default behavior to all standard HTML tags, but also supports a new Cascading Style Sheets (CSS) attribute called behavior.
      The architecture behind behaviors is general and has a broad impact on the things you can do with HTML. Internet Explorer 5.0 is capable of handling the standard set of HTML tags, plus all the new tags for which the programmer provides a behavior. The following is working HTML code that simply outputs the string "Hello, world":
<html>
<body>
    <mytag>Hello, world</mytag>
</body>
</html>
This happens because Internet Explorer doesn't know how to handle the <mytag> tag. To avoid an incorrect interpretation, it simply ignores the tag, displaying only the text.
      How can you define a new behavior? Microsoft proposed a COM-based programming interface to define what an HTML tag behavior is supposed to do. As a result, you can write special components to extend or modify the default action Internet Explorer takes when it encounters a certain tag. The new behavior is assigned to tags through the familiar CSS syntax.
<style>
  .customStyle  {
      behavior: url(behaviorCode.htc)
  }
</style>
This style allows programmers to customize the way in which a given HTML tag works. To apply a behavior that's defined in an HTML Component (HTC) file, you just have to assign the name of the CSS class that contains the behavior to an HTML tag, like so:
<span class="customStyle">Hello, world</span>
      Scriptlets are a good solution if you want to implement self-contained, standalone components, and behaviors look great when you need to extend or modify a standard tag's behavior. But if you need both, a good option would be to use new HTML-based elements created through the combined use of existing tags.
      The Internet Explorer 5.0 behavior architecture described earlier provides this for free. You just have to define a custom HTML tag and bind it to a behavior. From that point, Internet Explorer 5.0 treats a standard HTML tag and a custom tag in the same way: both have a name and an associated behavior. Defining new custom HTML tags is an excellent way of grouping together pieces of script and HTML code, resulting in a more modular design and far more readable code.

Custom Tags and Properties

      Custom tags implemented through behaviors can be considered the HTML counterpart of subroutines; that is, a block of HTML code is referenced by name. This feature, in conjunction with the HTML custom properties introduced with Internet Explorer 4.0, provides maximum flexibility when it comes to writing HTML pages.
      A custom property is simply a nonstandard attribute that you set for a certain tag, whether in the source code of the page or through a script that accesses the DHTML object model. In the following example, myprop is a property defined on the fly.
<span myprop="I'm a custom property">
   Hello, world
</span>
      Internet Explorer 5.0 can tell the difference between defined custom tags and unknown tags. Both are markup tags (they're ignored as the document is parsed). However, Internet Explorer 5.0 support for HTML custom tags requires that a namespace be defined for the tag. You must define your own namespace to hold your custom tags. If the tag is not part of a namespace, it's considered an unknown tag and is ignored by the browser. <mytag> is not part of any namespace, nor is it part of the HTML vocabulary. Because it's an unknown tag, it can't be assigned a behavior.
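The rule just described can be sketched in a few lines of plain JavaScript. This is only a model of the decision, not code that Internet Explorer exposes: the classifyTag function, the tiny SAMPLE_STANDARD_TAGS list, and the treatment of an undeclared prefix as unknown are all assumptions made for illustration.

```javascript
// A deliberately tiny sample of the HTML vocabulary, for illustration only.
var SAMPLE_STANDARD_TAGS = ["html", "body", "span", "div", "table", "img"];

// Hypothetical helper: models how a tag name could be classified, given the
// list of prefixes declared through xmlns on the <html> tag.
function classifyTag(tagName, declaredPrefixes) {
    var name = tagName.toLowerCase();
    var colon = name.indexOf(":");
    var i;
    if (colon > 0) {
        var prefix = name.substring(0, colon);
        for (i = 0; i < declaredPrefixes.length; i++) {
            if (declaredPrefixes[i].toLowerCase() === prefix) {
                return "custom";   // namespaced: can be bound to a behavior
            }
        }
        return "unknown";          // assumption: undeclared prefix is unknown
    }
    for (i = 0; i < SAMPLE_STANDARD_TAGS.length; i++) {
        if (SAMPLE_STANDARD_TAGS[i] === name) {
            return "standard";     // part of the (sampled) HTML vocabulary
        }
    }
    return "unknown";              // e.g. <mytag>: no namespace, not HTML
}
```

With the dino prefix declared, classifyTag("dino:frame", ["dino"]) yields "custom", while classifyTag("mytag", ["dino"]) yields "unknown".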
      To comprehend the importance of custom tags within an HTML page, let's consider a very simple feature you might want to use in your pages: framed text. The only HTML tags that allow you to draw a frame around them are <img> and <table>. Unfortunately, an <img> tag cannot display text. Other tags such as <span> and <div> do not natively support the border attribute. If you must frame the text, you can embed it in a minimal <table> element with just one row and one column.
<table border="1" 
       cellspacing="0" 
       cellpadding="2" 
       bordercolor="#000000">
<tr><td>
Hello, World
</td></tr>
</table>
      Instead of using all the code that's wrapped by the two <table> tags, why not create a single, nonstandard HTML tag that renders framed text? You could create it on the fly through script code and the DHTML object model. You could also create a scriptlet. However, the best option is to use custom tags with an applied behavior.
<HTML xmlns:dino>
<HEAD>
<STYLE>
   @media all {
     dino\:frame {
       behavior:url(frame.htc);}
   }
</STYLE>
</HEAD>
<BODY>
<dino:frame border="2">
<b>Expoware</b> Soft
</dino:frame>
</BODY>
</HTML>
      The previous code snippet creates a new <frame> tag as a part of the dino namespace. The behavior of this tag is defined in the frame.htc component. Basically, frame.htc will simply read the <frame> tag and replace it with the <table>...</table> code block I examined earlier. When you look at the source code, however, it looks significantly clearer than before. Figure 2 shows the code in action, while Figure 3 shows the source code for this simple behavior.
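To give a rough idea of the replacement step, here is a minimal sketch in plain JavaScript. It is not the frame.htc source from Figure 3; the buildFrameHtml name and the fallback to a 1-pixel border are assumptions made for illustration.

```javascript
// Hypothetical helper: not the author's frame.htc, only a sketch of the idea.
// Wraps arbitrary inner HTML in a one-row, one-column table so that it
// appears framed, mirroring the <table> block shown earlier.
function buildFrameHtml(innerHtml, border) {
    // Assumption: fall back to a 1-pixel border when the attribute is
    // missing or is not a non-negative number.
    var width = parseInt(border, 10);
    if (isNaN(width) || width < 0) {
        width = 1;
    }
    return '<table border="' + width + '" cellspacing="0" cellpadding="2" ' +
           'bordercolor="#000000">' +
           '<tr><td>' + innerHtml + '</td></tr>' +
           '</table>';
}

// Markup for the <dino:frame border="2"> example above.
var html = buildFrameHtml("<b>Expoware</b> Soft", "2");
```

Inside a behavior, a string like this would typically be assigned to the innerHTML property of the element the behavior is attached to.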

Figure 2 <frame> Tag in Action

      Notice that a behavior doesn't necessarily need to be a piece of script code isolated in an HTC file. The behavior of a custom HTML tag can be more simply defined with a few CSS styles. For example, if you're writing a structured page where you often need to apply certain formatting styles, you could do this more easily through a custom tag. The code in Figure 4 defines a tag called Listing within the dino namespace. Figure 5 shows the resulting page.

Figure 5 Listing Tag in Action

Define a Namespace

      The Internet Explorer 5.0 namespace declaration mechanism derives from the W3C XML namespace specification (see http://www.w3.org/tr/rec-xml-names). It requires you to insert the xmlns attribute within the <html> tag. Note, however, that Internet Explorer 5.0 supports namespace declarations only within the <html> tag, while the XML namespace specification allows it for any tag.
      The syntax of the xmlns attribute takes the following form:
<html xmlns:prefix[=urn]>
You can omit the final portion of the prototype, which specifies the Uniform Resource Name (URN). Recall that a URN is a string that uniquely identifies the namespace. The prefix placeholder in the previous declaration should be replaced with the string you use to prefix any tag that belongs to your namespace. For example, I can declare the dino namespace with the tag:
<HTML xmlns:dino>
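The declaration syntax can be modeled with a short JavaScript function. This is only a sketch of the xmlns:prefix[=urn] form shown above: Internet Explorer performs this parsing internally, and the parseNamespaceDeclaration name, the accepted prefix characters, and the sample URN in the usage note are assumptions.

```javascript
// Hypothetical helper for illustration; Internet Explorer exposes no such
// function. Takes one attribute as written in the <html> tag.
function parseNamespaceDeclaration(attribute) {
    // Matches xmlns:prefix, optionally followed by ="urn" or =urn.
    var pattern = /^xmlns:([A-Za-z_][\w.\-]*)(?:=(?:"([^"]*)"|(\S+)))?$/;
    var match = pattern.exec(attribute);
    if (!match) {
        return null; // not a prefixed namespace declaration
    }
    // The URN is optional: use the quoted form, then the bare form, else null.
    var urn = match[2] !== undefined ? match[2] : match[3];
    return { prefix: match[1], urn: urn !== undefined ? urn : null };
}
```

For example, parseNamespaceDeclaration("xmlns:dino") returns a prefix of "dino" with a null URN, and parseNamespaceDeclaration('xmlns:dino="urn:example:dino"') also carries the made-up URN string.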
      You probably noticed a mysterious presence in the examples I've mentioned so far:
@media all {...}
There's nothing magical behind this code. The media directive simply defines the media types for a set of stylesheet rules. In fact, you could have different styles applying to different output devices such as the screen or a printer. By writing
@media all {
  dino\:frame {
    behavior:url(frame.htc);}
}
you're telling Internet Explorer to apply the enclosed stylesheet rules to every output device. At the moment, Internet Explorer supports two other media types in addition to all: screen and print.
@media screen {...}
@media print {...}

Custom Tags as HTML Subroutines

      You might be convinced by now that using custom HTML tags with Internet Explorer 5.0 is not only possible, but incredibly cool. To reinforce this idea, let me show you a couple of examples where there's an obvious similarity between custom tags and subroutines, as well as between namespaces and libraries.
      Figure 6 shows a screen shot from an MSDN™ Library CD. The titles of the sections are rendered as rounded tabs, and the captions store hyperlinks to documents that expand the information shown below them. Several months ago, I was asked to lay out a similar document. It had to highlight documents that a user had clicked to mark for later retrieval and display. I called this HTML entity msdnTab and defined it as the union of a caption bar, a hyperlink to an external file, and a separating line, plus some descriptive text.

Figure 6 MSDN Library CD

      Before starting my own coding, I took a look at the source code for the MSDN Library CD home page. What you see in Figure 6 is actually implemented through a couple of <table> tags surrounded by GIFs to give it that nice rounded look. The first <table> tag contains the two images and an <a> tag in a blue background. The second table element is two pixels high and covers the entire width of the screen. The description is wrapped by a <div> tag and set with a style that reduces the height of the font by 15 percent. Rather complex, isn't it? Well, complex or not, it definitely works and it looks great.
      However, the difference between programmers and HTML designers is that programmers always think in terms of reusability and code maintenance. The previous solution is not very flexible or reusable for a couple of reasons. For instance, what if you want to change the font size? In this case, the height of the central block will increase and you'll need to adapt the GIFs that round it off. More importantly, what if you want to move that code from the MSDN Library CD to reuse it in other applications? Where exactly does the code start and end?
      Once I was sure I understood the layout of the solution, I simply tried to rebuild it in my own pages. It didn't work. In particular, I was unable to create a colored line to underline the tab. Even cutting and pasting the original source code was tricky, because it was hard to extract the right piece of code and reproduce the behavior on a different page.
      In modifying the code, I decided to remove the rounded corners and, therefore, the GIFs. The HTML source code for my caption bar is shown in Figure 7, while the result is shown in Figure 8.

Figure 8 Caption Bar in Action

Notice that the line is obtained through a <td> tag that spans the available screen width. To maintain the height specified, say at two pixels, setting the style attribute

Height:2 
is not enough. In fact, even setting a height in the cell itself
<td height="2"></td>
doesn't work because the row has a default height determined by the current font. This height takes precedence over the stylesheet. In order to work around this, I used an empty <img> tag as the content of the <td> tag. This trick is also used in the MSDN Library CD source code.
<td width="100%" height="1"><img height="1"></td>
      Although the code in Figure 7 is not very long, it's too long to be reused easily across projects and documents. It needs to be encapsulated in an HTML subroutine so it can be invoked through a custom tag. In Figure 9 you can see the same piece of code at work in a slightly more general context.

Figure 9 Caption Bar Incorporation

The accompanying source code is shown in Figure 10. The tab is now hidden in a new tag called msdntab, part of the DINO namespace.

<DINO:msdntab label="Invoices">
Consult all my invoices.
</DINO:msdntab>
      This little code fragment shields you from all of the HTML code in Figure 7 and is much more readable. But what does the msdntab element really do? The answer is shown in Figure 11, which displays the entire source code for the msdntab behavior. Notice that the component defines a few properties that could be set through attributes, and handles the ondocumentready event to make sure it modifies the page in a safe manner. When this event is fired, the component simply builds an HTML string with the text seen in Figure 7 and replaces the content of the innerHTML property of the element.
element.innerHTML = htmlText;
In doing so, the element called msdntab is replaced on the fly by the combination of tables and empty images described previously.
      With a little more effort, you can create a new tag for a line of a given thickness and color. I've done this and called it msdnline (Figure 11 shows its implementation). These custom tags work just like standard tags: you can combine them with any other standard HTML tag. For example, you can use them to populate the cells of a table, or put them in a <div> area that you toggle on and off.
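The string-building step behind both tags might look roughly like the following JavaScript. This is a sketch, not the code from Figures 10 and 11: the function names, the navy bar color, and the exact markup are assumptions, and only the empty <img> trick and the label attribute come from the text above.

```javascript
// Hypothetical sketch; the colors and the markup layout are assumptions.

// Renders a horizontal line using the empty-<img> trick described earlier,
// so that the row height is not dictated by the current font.
function buildLineHtml(color, thickness) {
    return '<table width="100%" cellspacing="0" cellpadding="0">' +
           '<tr><td width="100%" height="' + thickness + '" bgcolor="' + color + '">' +
           '<img height="' + thickness + '"></td></tr>' +
           '</table>';
}

// Renders a caption bar, a two-pixel underline, and the descriptive text.
function buildMsdnTabHtml(label, description) {
    var barColor = "#000080"; // assumed color for the bar and the line
    return '<table cellspacing="0" cellpadding="2">' +
           '<tr><td bgcolor="' + barColor + '"><font color="#ffffff">' +
           label + '</font></td></tr></table>' +
           buildLineHtml(barColor, 2) +
           '<div>' + description + '</div>';
}

// Markup for the <DINO:msdntab label="Invoices"> example above.
var htmlText = buildMsdnTabHtml("Invoices", "Consult all my invoices.");
```

Keeping the line in its own function mirrors the split between msdntab and msdnline: the tab reuses the line instead of duplicating its markup.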

The Tabstrip Component

      Suppose you want to create a tabstrip component that renders HTML that looks like a Microsoft Excel workbook with a number of overlapping pages to choose from. Each page can be selected by clicking a tab, and each tabstrip tag lets you set name and hyperlink attributes that refer to the associated page.
      If you want to do this in pure HTML, you'll end up doing a huge amount of coding. However, if you think of a tabstrip as a self-contained component that could be easily wrapped by a scriptlet, and remember that a scriptlet is just an <object> tag within the host page, the task gets a bit easier. All the parameters you need to customize (the URL of the linked pages, the captions of the tabs, and so on) must be set via script code. A scriptlet might save you from having to write complex HTML code, but the price is lots of inscrutable script code surrounded by anonymous <object> tags. Compare that to the following code snippet:
<dino:tabstrip>
    <tabitem title="One"     src="one.htm" />    
    <tabitem title="Two"     src="two.asp" />    
    <tabitem title="Three"   src="three.asp?p=1" />    
</dino:tabstrip>
      Figure 12 shows what this tabstrip will look like. Each item named dino:tabstrip causes a collection of child tabs to appear on the top of the page or within the page cell where they are located. By selecting one of them, you switch to its referred URL, which is an HTML or ASP page. The tabstrip is simply a <table> with a single row and one column for each tabitem.

Figure 12 Tabstrip Example

      The behavior working behind the scenes of the dino:tabstrip custom tag creates the table on the fly as soon as Internet Explorer notifies it of the ondocumentready event. The table handles a few events on its own, saving you from doing it yourself in your page scripts. In particular, it is sensitive to onclick, onmouseover, and onmouseout events. The onclick event switches the page displayed below it. The latter two events add a bit of animation to the otherwise lifeless component by highlighting the tab's caption when the mouse moves over it.
      The tabstrip is a two-part element. An <iframe> tag embeds a child page below the table. During the onclick event, the source page for the frame is updated to point to the page associated with the selected tab. Figure 13 shows the full source code for the component. Don't forget to insert the following declaration for your custom tags within the <style> tag.

@media all
    {dino\:tabstrip {behavior:url(tabstrip.htc);}}
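The two-part layout (a one-row table plus an <iframe>) can be sketched in JavaScript as follows. This is not the tabstrip.htc source from Figure 13: the buildTabstripHtml name, the colors, and the tabid custom attribute are assumptions, and each tabitem is represented by a plain object with title and src fields.

```javascript
// Hypothetical sketch of the markup a tabstrip behavior could generate.
// items: array of { title, src } objects standing in for <tabitem> children.
// selectedIndex: zero-based index of the tab whose page is displayed.
function buildTabstripHtml(items, selectedIndex) {
    if (items.length === 0) {
        return ""; // nothing to render without tabs
    }
    var cells = "";
    for (var i = 0; i < items.length; i++) {
        // Assumed colors: white for the selected tab, gray for the others.
        var color = (i === selectedIndex) ? "#ffffff" : "#c0c0c0";
        // tabid is a custom attribute that a click handler could read back.
        cells += '<td bgcolor="' + color + '" tabid="' + i + '">' +
                 items[i].title + '</td>';
    }
    // One row, one column per tab, followed by the frame for the child page.
    return '<table cellspacing="0"><tr>' + cells + '</tr></table>' +
           '<iframe width="100%" src="' + items[selectedIndex].src + '"></iframe>';
}

var tabs = [
    { title: "One", src: "one.htm" },
    { title: "Two", src: "two.asp" },
    { title: "Three", src: "three.asp?p=1" }
];
var tabstripHtml = buildTabstripHtml(tabs, 0);
```

On a click, a behavior built this way would only need to set the src of the <iframe> to the src of the chosen item and recolor the cells.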
      The tabstrip component allows you to specify plenty of UI settings such as tab colors, font style, and margins. It also allows you to set a given tab as the default, and disable any of them. Here's a more advanced example; its output is shown in Figure 14.

Figure 14 Complex Tabstrip Usage

<dino:tabstrip label="Dino's <b>Tabstrip</b>">
  <tabitem title="One or more" src="one.htm" />    
  <tabitem title="Two" src="two.htm" selected/>    
  <tabitem title="Three" src="three.htm" grayed/>    
  <tabitem title="Four" src="four.htm"/>    
</dino:tabstrip>
Notice the use of custom attributes such as grayed and selected. The interesting possibility of custom attributes makes the use of custom tags even more attractive, and considerably increases the readability of your HTML code.
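How a behavior might interpret these two attributes can be sketched in JavaScript. The rules below are assumptions, not taken from the tabstrip source: the first selected tab that is not grayed becomes the default, the first enabled tab is used otherwise, and clicks on grayed tabs are ignored. The function names are hypothetical, and each tabitem is modeled as a plain object.

```javascript
// Hypothetical helpers; items are { title, src, selected, grayed } objects.

// Returns the index of the tab to show first, or -1 if every tab is grayed.
function initialTabIndex(items) {
    var firstEnabled = -1;
    for (var i = 0; i < items.length; i++) {
        if (items[i].grayed) {
            continue;               // disabled tabs can never be the default
        }
        if (items[i].selected) {
            return i;               // explicit default wins
        }
        if (firstEnabled < 0) {
            firstEnabled = i;       // remember the fallback
        }
    }
    return firstEnabled;
}

// Returns the index that should be active after a click on clickedIndex.
function nextTabIndex(items, currentIndex, clickedIndex) {
    var clicked = items[clickedIndex];
    if (!clicked || clicked.grayed) {
        return currentIndex;        // ignore clicks on missing or grayed tabs
    }
    return clickedIndex;
}

// The four tabs from the example above.
var advancedTabs = [
    { title: "One or more", src: "one.htm" },
    { title: "Two", src: "two.htm", selected: true },
    { title: "Three", src: "three.htm", grayed: true },
    { title: "Four", src: "four.htm" }
];
var startIndex = initialTabIndex(advancedTabs);
```

Under these assumed rules, the example starts on the tab titled Two, and a click on Three leaves the selection unchanged.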

Summary

      If your target browser is Internet Explorer 5.0 or higher, you can work with a made-to-measure version of HTML in several ways. You can bind a specific behavior to existing tags to customize or extend them. You can also create new tags that simply group together some common CSS styles and apply them to all the text they wrap. Finally, you can define new tags with an associated behavior. This is the most exciting possibility since it gives you the power to shape the language you're using in your components. In all of these cases, remember to define your own namespace to hold all the new tags. If you omit this, Internet Explorer will mark your tags as unknown and ignore them altogether.

Dino Esposito is a senior trainer and consultant based in Rome. He has recently written Windows Script Host Programmer's Reference (WROX, 1999). You can reach Dino at desposito@vb2themax.com.

From the May 2000 issue of MSDN Magazine.
