HTML5 Data Input Validation

Tim Kulp | December 10, 2012

One of the common challenges that developers have faced since the early days of HTML4 is how to validate the quality of data coming into their applications. The web has changed dramatically since then, and simple text input boxes from HTML4 have evolved in HTML5 to enable content-specific input such as telephone numbers, calendars and color pickers. Functionality that once required JavaScript—like making a field required—can now be embedded into the HTML markup you write. In this article, I’ll use the new form functionality in HTML5 to build a hotel reservation system that relies on content-specific fields and the new validation attributes. By the end of this article, you’ll understand how to constrain and validate input using the new features of HTML5.

Forms in HTML5

As HTML has transformed from one-way data presentation to an interactive tool, the way data is captured has changed as well. When you think of forms in HTML4, the input, textarea and select tags come to mind. These tags still exist, but what they can do has been dramatically updated to handle the complex data interactions performed by modern web users. Two significant changes have occurred in forms:

  • New content-specific types: An empty text box is a great tool to have, but sometimes you know what should be entered in that text box and want to constrain input to only that type of data. HTML5’s new input types (the MSDN documentation refers to them as states) allow you to specify content such as dates, telephone numbers, email addresses and a few others.
  • New input attributes: Using new attributes like required, min, max and novalidate, you can control how a form is processed and what is acceptable per input element.

Using these new types and attributes, you can build input validation into the markup of your HTML5 application, which allows the browser to handle simple scenarios such as ensuring that a field has a value. If you need to build custom validation scenarios (like comparing multiple controls to determine a valid value) or prefer the control of a DIY solution, you still need to use JavaScript. Regardless of your approach, always repeat the validation in your server-side code for any data that goes to the server.
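As a rough sketch of how these attributes sit in markup, the fragment below marks one field as required and shows novalidate switching off the browser's checks for a whole form. The form ids, the city field and the "Save draft" button are invented for illustration. The formnovalidate attribute is a standard HTML5 attribute that this article does not otherwise cover.

```html
<!-- The browser validates this form: submission is blocked while "city" is empty. -->
<form id="validated-form">
  <label for="city">City</label>
  <input id="city" name="city" type="text" required />
  <input type="submit" value="Search" />
  <!-- formnovalidate skips validation for this one button only. -->
  <input type="submit" value="Save draft" formnovalidate />
</form>

<!-- novalidate turns off the browser's validation for the entire form. -->
<form id="unvalidated-form" novalidate>
  <label for="city2">City</label>
  <input id="city2" name="city" type="text" required />
  <input type="submit" value="Search" />
</form>
```

Turning validation off in the browser does not remove the need for server-side checks.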

Our Sample Application: Contoso Hotel Finder

The best way to get a handle on all these changes is to build a sample application that implements some of the new features. As an example, I’ll demonstrate the hotel search form for Contoso Hotel Finder, an e-commerce travel site. The application captures a guest’s name, email address, phone number, check-in and check-out dates, how many guests will be staying and the person’s rewards number. I’ll begin with a basic fieldset with an ordered list of labels and input tags. Figure 1 shows the HTML we start with.

<form id="hotel-search">
  <fieldset>
    <legend>Hotel Search</legend>
    <ol>
      <li>
        <label for="name">Name</label>
        <input id="name" name="name" type="text" />
      </li>
      <li>
        <label for="email">Email</label>
        <input id="email" name="email" type="text" />
      </li>
      <li>
        <label for="phone">Phone</label>
        <input id="phone" name="phone" type="text" />
      </li>
      <li>
        <label for="checkin">Checkin Date</label>
        <input id="checkin" name="checkin" type="text" />
      </li>
      <li>
        <label for="checkout">Checkout Date</label>
        <input id="checkout" name="checkout" type="text" />
      </li>
      <li>
        <label for="numOfGuests">Number of Guests</label>
        <input id="numOfGuests" name="numOfGuests" type="text" value="0" />
      </li>
      <li>
        <label for="rewards">Rewards Number</label>
        <input id="rewards" name="rewards" type="text" />
      </li>
      <li>
        <input type="submit" value="Search" />
      </li>
    </ol>
  </fieldset>
</form>

Figure 1 HTML for the Basic Hotel Finder Form

Using the Required and Pattern Attributes

The first step is to determine which fields require input before the user can proceed. For the search application, we need check-in, check-out and number of guests. The rest of the data can be captured later in the hotel reservation process. To make the fields required, all you need to do is add the required attribute to the input tag. Like the readonly attribute, required is a Boolean attribute: it does not need a value and only has to be present to take effect. Here is the check-in input tag with the required attribute added:

<input id="checkin" name="checkin" type="text" required />

Because the tag is marked required, if a user leaves check-in blank, the browser blocks the form’s submission and directs the user to provide a value. When several required fields are empty, the browser reports the first one it encounters, typically by giving it focus and displaying a message that a value is needed. The exact wording and appearance of that message vary by browser. The required attribute can be applied to many of the input types as well as to the textarea and select elements.
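To illustrate required on elements other than input, here is a minimal sketch with a select and a textarea. The room type and special requests fields are hypothetical and are not part of the Contoso search form.

```html
<form id="required-demo">
  <label for="roomType">Room type</label>
  <!-- A required select is invalid while the chosen option has an empty value,
       so the first option acts as a placeholder. -->
  <select id="roomType" name="roomType" required>
    <option value="">Choose a room type</option>
    <option value="single">Single</option>
    <option value="double">Double</option>
  </select>

  <label for="requests">Special requests</label>
  <!-- A required textarea is invalid while it is empty. -->
  <textarea id="requests" name="requests" required></textarea>

  <input type="submit" value="Search" />
</form>
```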

Another extremely handy new attribute is pattern, which defines what counts as “good data” for the input field. The pattern attribute holds a regular expression that the entire value must match; otherwise, the value is invalid. For the rewards program number, we can define a regular expression that requires two uppercase letters followed by nine digits:

<input id="rewards" name="rewards" type="text" pattern="^[A-Z]{2}\d{9}$" />

Regular expressions have long been the weapon of choice for listing what is acceptable input. Using the pattern attribute allows you to define the format that data must be in when it’s entered. This is useful for input types like tel and email, which do not restrict data to one specific format. As an example, if you want to capture telephone numbers as (###) ###-####, you can use the pattern attribute to specify that exact format. With that pattern in place, ###-###-#### or ########## would be rejected.
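A pattern on its own gives the user no hint about the expected format. One common approach, sketched below, is to describe the format in the title attribute, which many browsers include in the validation message. The wording of the title and the sample number AB123456789 are made up for illustration.

```html
<form id="pattern-demo">
  <label for="rewards">Rewards Number</label>
  <!-- The pattern has to match the whole value, so the ^ and $ anchors are optional. -->
  <!-- An empty value skips the pattern check unless the field is also required. -->
  <input id="rewards" name="rewards" type="text"
    pattern="[A-Z]{2}\d{9}"
    title="Two uppercase letters followed by nine digits, for example AB123456789" />
  <input type="submit" value="Search" />
</form>
```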

Constraining Data Using Input Types

We are capturing specific data types in this application. We might cook up a series of regular expressions to validate what is entered against what we expect, or leave it to the browser and simplify what we need to code. Let’s take the email input element and change the type to be email:

<input id="email" name="email" type="email"
  placeholder="example@domain.com" />

By setting the type to email, we tell the browser that the data entered will be an email address and must conform to the format of a valid email address. An invalid email address is handled the same way as a missing value in a required field: on submission, the browser gives the input focus and prompts the user to change the entry to a valid email format. Notice that the placeholder attribute is also included as a visual cue to the user about what our system is expecting. Placeholder is a new HTML5 input attribute that works like a watermark, displaying in this case “example@domain.com” as text in the input field until the user enters a value.
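Keep in mind that the email type only checks values that are actually entered; an empty field still passes. The sketch below combines type=email with required for a case where the address is mandatory. That is an assumption made for the example, since the Contoso search form treats email as optional.

```html
<form id="email-demo">
  <label for="email">Email</label>
  <!-- Without required, an empty value is accepted;
       only a non-empty value is checked for email syntax. -->
  <input id="email" name="email" type="email" required
    placeholder="example@domain.com" />
  <input type="submit" value="Search" />
</form>
```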

Here are other input types implemented to constrain what data can be entered in the input tags:

Telephone

Identifies the field as a phone number. The tel type does not validate the value or force users to enter data in a specific format, because there are too many formats for domestic and international phone numbers. In the example, I also use the pattern attribute (described earlier) to limit what the user enters to a specific format.

<input id="phone" name="phone" type="tel"
  placeholder="Eg. (410) 555-5555" pattern="^\(\d{3}\) \d{3}-\d{4}$" />

Number

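Provides a numeric value that, when rendered, allows users to increment or decrement the number with up and down arrow buttons. The value is parsed as a floating-point number, but the default step is 1, so decimal values typically fail validation unless you set the step attribute (for example, step="0.5" or step="any").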

<input id="numOfGuests" name="numOfGuests"
  type="number" value="1" />
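For a guest count, you would probably want to rule out zero, negative and fractional values. A minimal sketch using min, max and step follows. The upper limit of 8 is an arbitrary assumption, not a Contoso requirement.

```html
<form id="guests-demo">
  <label for="numOfGuests">Number of Guests</label>
  <!-- min and max bound the value; step="1" (the default) restricts it to
       whole numbers counted up from min. -->
  <input id="numOfGuests" name="numOfGuests" type="number"
    value="1" min="1" max="8" step="1" required />
  <input type="submit" value="Search" />
</form>
```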

Date

Limits data that can be entered into the field to a date value. Browsers that support the date type typically also render a calendar control for users to select a date from. Date formats can vary depending on where you are in the world. For markup purposes, dates are written as YYYY-MM-DD (example: 2012-12-31). This format is meant for machines only; what the user sees is translated according to the user’s locality or preferences (according to the W3C draft). With number and date types, you can also specify min and max attributes to keep data within a specified range.

<input id="checkout" name="checkout" type="date" required
  max="2012-12-31" min="2012-07-01" />

Notice that the min and max attributes are defined in the date format YYYY-MM-DD. This does not mean that the user must enter a value in this format; it is simply the unambiguous format the browser expects in markup. Using these additional attributes allows you to further specify exactly what is good data coming from this input element.

And Many More. . .

There are more types to select from that do not apply to the sample application. Types like url, range, time and search are available to enhance your interface. While I don’t discuss these here, you can find them over at MSDN with examples of their usage.
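For a quick feel of these types, here is a brief sketch. None of these fields belong to the Contoso form; the names, labels and limits are invented for illustration.

```html
<form id="more-types-demo">
  <label for="website">Hotel website</label>
  <!-- url requires an absolute URL, including the scheme. -->
  <input id="website" name="website" type="url" placeholder="https://example.com" />

  <label for="rating">Minimum star rating</label>
  <!-- range renders as a slider in supporting browsers. -->
  <input id="rating" name="rating" type="range" min="1" max="5" step="1" value="3" />

  <label for="arrival">Estimated check-in time</label>
  <!-- Times are written as HH:MM in markup and also accept min and max. -->
  <input id="arrival" name="arrival" type="time" min="15:00" max="23:00" />

  <label for="keywords">Keywords</label>
  <!-- search behaves like text but may be styled as a search box. -->
  <input id="keywords" name="keywords" type="search" />

  <input type="submit" value="Search" />
</form>
```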

Is It All Sunshine and Rainbows?

No, definitely not. These new input types and attributes are a great leap forward in providing input validation through markup, but there are still a few challenges to face. As with anything related to browsers, implementations will vary. Right now, one of the challenges is the number of browsers (or lack thereof) that support this functionality. Over time, adoption will increase, and this issue will fade. Also, developers used to ASP.NET validation controls will miss the ability to define a control once and have the framework implement the validation on the server and client. HTML validation will need to be mirrored on the server and, depending on the implementation per browser, could lead to inconsistent validations between the client and server.

Always detect your browser’s capabilities to see whether markup validation is supported. If it isn’t, prepare to fall back to JavaScript and the more traditional input validation techniques on the client. As I stated at the beginning of this article, never forget to include server-side validation for any data going to the server.
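One widely used way to detect these capabilities is sketched below. It relies on the fact that a browser that does not recognize an input type treats the field as a text input. The function names are my own, and the console message stands in for whatever fallback your application uses.

```html
<script>
  // Returns true if the browser keeps the requested type instead of
  // falling back to "text", which is what unsupporting browsers do.
  function supportsInputType(type) {
    var input = document.createElement("input");
    input.setAttribute("type", type);
    return input.type === type;
  }

  // Returns true if forms expose the checkValidity method of the
  // constraint validation API.
  function supportsFormValidation() {
    return typeof document.createElement("form").checkValidity === "function";
  }

  if (!supportsInputType("date") || !supportsFormValidation()) {
    // Hypothetical hook: enable your own script-based validation here.
    console.log("Falling back to JavaScript validation");
  }
</script>
```

A browser can report that it supports a type without providing a dedicated control for it, so treat the result as a hint and test in the browsers you target.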

Client-Side Validation, No JavaScript Required

In this article, I examined some of the new attributes and input types in HTML5. Using these new features provides the client-side input validation your application needs while reducing the JavaScript you need to manage. I examined some of the basic functionality, but as you dive deeper into HTML5 validation, you can leverage CSS pseudo-classes such as :valid, :invalid and :required to style fields according to their state and look good while doing it. A rich scripting framework also surrounds HTML5 validation in Internet Explorer 10 that allows developers to dig in and build custom solutions around the new attributes and input types.
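The CSS hooks mentioned above are pseudo-classes that match a field according to its validation state. A minimal sketch follows; the colors are arbitrary choices.

```html
<style>
  /* Fields that currently fail their constraints. */
  input:invalid { border: 2px solid #c00; }
  /* Fields that currently satisfy their constraints. */
  input:valid { border: 2px solid #080; }
  /* Highlight required fields so users can spot them. */
  input:required { background-color: #ffd; }
</style>

<form id="styled-form">
  <label for="checkin">Checkin Date</label>
  <input id="checkin" name="checkin" type="date" required />
  <input type="submit" value="Search" />
</form>
```

Because these rules apply as soon as the page loads, an empty required field is styled as invalid before the user has typed anything. Decide whether that is the experience you want.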

From here, you can extend your knowledge of HTML5 validation by flexing your creativity to enhance the example with some of the other types and attribute combinations. Adding an estimated check-in time is a great place to start, or setting valid ranges on check-out to depend on check-in. HTML5 validation controls are not your entire input validation solution, but as you build client-side HTML5 applications, they are a critical element for ensuring that your application has clean, valid data that provides your users with an input-error-tolerant experience.
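As one possible starting point for making check-out depend on check-in, the sketch below uses setCustomValidity from the constraint validation API. It assumes the browser supports the date type, so that both values arrive as YYYY-MM-DD strings. The error message and the validateStay function name are my own.

```html
<form id="stay-dates">
  <label for="checkin">Checkin Date</label>
  <input id="checkin" name="checkin" type="date" required />
  <label for="checkout">Checkout Date</label>
  <input id="checkout" name="checkout" type="date" required />
  <input type="submit" value="Search" />
</form>

<script>
  var checkin = document.getElementById("checkin");
  var checkout = document.getElementById("checkout");

  function validateStay() {
    // YYYY-MM-DD strings sort chronologically, so comparing them as text works.
    if (checkin.value && checkout.value && checkout.value <= checkin.value) {
      // A non-empty message marks the field invalid and blocks submission.
      checkout.setCustomValidity("Check-out must be after check-in.");
    } else {
      // An empty string clears the custom error.
      checkout.setCustomValidity("");
    }
  }

  // Re-check whenever either date changes.
  checkin.addEventListener("change", validateStay);
  checkout.addEventListener("change", validateStay);
</script>
```

The same rule still has to be enforced on the server, because a client can bypass any check made in the browser.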

About the Author

Tim Kulp leads the development team at FrontierMEDEX in Baltimore, Maryland. You can find Tim on his blog or the Twitter feed @seccode, where he talks code, security and the Baltimore foodie scene.
