Test Run

Low-Level Web App UI Test Automation

James McCaffrey

Code download available at: TestRun0510.exe (166 KB)

Contents

The Web Application Under Test
The Test Automation
Adapting and Extending the Automation

As Web applications have become more complex, testing them has become more important. There are many testing techniques available to you. For example, in the April 2005 issue of MSDN® Magazine, I described a simple JScript®-based system that can test a Web app through its UI by using the Internet Explorer Document Object Model. That technique is effective but has several limitations. Some of my colleagues asked me if it was possible to use the .NET Framework to write more powerful but still lightweight automation that tests Web apps through their UI. In this month's column I'll show you how to do just that. The low-level technique involves calling directly into the mshtml.dll and shdocvw.dll libraries to access and manipulate HTML objects in the client area of Microsoft® Internet Explorer.

So let's begin with a screen shot, Figure 1, which shows that I am testing a dummy Web application that searches a data store of employee information. Users can filter by employee last name or first name. The app displays employee last name, first name, and date of birth for case-sensitive filter substring matches. Manually testing the Web application through its UI would be tedious, inefficient, and error-prone. A better approach is to write test automation. The automation launches an instance of Internet Explorer, attaches to the instance, loads the Web application under test, manipulates the app, and checks the app's state for correctness.

Figure 1 Example Test Run


Of course a real Web application will be much more complex, but the techniques I'll show you can be used to test any Web application hosted in Internet Explorer. In the sections that follow I will briefly describe the Web application under test so you'll know what I'm testing and how to test it. I'll also explain in detail the test scenario code that generated the image in Figure 1, and describe how you can adapt and extend the techniques presented here.

The Web Application Under Test

My Web application, WebForm1.aspx, is an ASP.NET app, but the techniques in this column will work with any type of Web application. My application has two radio button controls to tell the app's logic what field to search on, a text input control to accept the user's search term, a button control to initiate the search, and a text area to display results. Below the results text area is an initially hidden label that will display "Search complete." Two of the advantages of the technique I'm presenting are that I don't need to instrument the Web app under test in any way and I don't need to have access to the app's source code. I do, however, need to know the IDs of the various HTML elements on the Web app, but I can get these easily by doing a View | Source (this does require that the IDs be static rather than dynamic). For example, the button control has the ID "Button1" and the Last Name radio button control has the ID "RadioButtonList1_0". Of course, in situations where you do have access to the application's source code, you'll already have this control ID information.

I used Visual Studio® .NET to create the dummy Web application under test. In the Visual Studio .NET design view I added three label controls, a radiobuttonlist control, a textbox control, a button control, and a listbox control. For simplicity I accepted the default control names of Label1, TextBox1, and so forth. The relevant code is listed in Figure 2. I declare an Employee class and an ArrayList object to hold Employee objects. In the Page_Load method I populate the ArrayList with dummy employee data. In a real application your data is likely to come from a SQL Server™ database or XML file, but as far as UI test automation is concerned it doesn't matter where the data comes from.

Figure 2 Code for Web Application Under Test

public class WebForm1 : System.Web.UI.Page
{
    ... // controls declared here

    public class Employee
    {
      public string last;
      public string first;
      public string dob;

      public Employee(string last, string first, string dob)
      {
        this.last = last;
        this.first = first;
        this.dob = dob;
      }
    }

    private ArrayList al = new System.Collections.ArrayList();

    private void Page_Load(object sender, System.EventArgs e)
    { 
      Employee e1 = new Employee("Adams","Terry","01/01/1971");
      Employee e2 = new Employee("Burke","Brian","02/02/1972");
      Employee e3 = new Employee("Ciccu","Alice","03/03/1973");

      al.Add(e1);
      al.Add(e2);
      al.Add(e3);

      Label3.Visible = false;
    }

    ... // Web Form Designer generated code   

    private void Button1_Click(object sender, System.EventArgs e)
    {
      ListBox1.Items.Clear();
      string filter = TextBox1.Text;

      if (RadioButtonList1.SelectedValue == "Last Name")
      {
        foreach (Employee emp in al)
        {
          if (emp.last.IndexOf(filter) >= 0)
            ListBox1.Items.Add(
              emp.last + ", " + emp.first + ", " + emp.dob);
        }
      }
      else if (RadioButtonList1.SelectedValue == "First Name")
      {
        foreach (Employee emp in al)
        {
          if (emp.first.IndexOf(filter) >= 0)
            ListBox1.Items.Add(
              emp.last + ", " + emp.first + ", " + emp.dob);
        }
      }
  
      Label3.Visible = true;
    }
}

The Button1_Click method clears the listbox control, grabs the filter string from the textbox control, checks the RadioButtonList control to see whether to search by last name or first name, searches through the in-memory data store for matches, and displays information on matching employees. Let me strongly emphasize that I am purposely not using good coding techniques here so I can keep the application example short. This also approximates the relatively unrefined character of a pre-release application under test that you'll usually be dealing with. My Web app is obviously artificial, but the essence of testing any Web application through its UI is that the app's state changes with each HTTP request-response pair. In other words, even if the Web application you want to test accesses a SQL Server database or does very complex processing, it still just changes state, which will be reflected in the HTTP response and user interface.
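
Because the matching rule is so simple, a test harness could compute its own expected results instead of hardcoding them. The following is only a minimal sketch and is not part of the code download: the ExpectedResults class and its Search method are hypothetical names, and the employee data is copied by hand from Figure 2. It applies the same case-sensitive IndexOf check that Button1_Click uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not in the column's download): computes the rows
// the app should display, using the same substring rule as Button1_Click.
public static class ExpectedResults
{
    // Same dummy data that Page_Load adds to the ArrayList in Figure 2.
    static readonly string[][] Employees =
    {
        new[] { "Adams", "Terry", "01/01/1971" },
        new[] { "Burke", "Brian", "02/02/1972" },
        new[] { "Ciccu", "Alice", "03/03/1973" }
    };

    // field: 0 = search by last name, 1 = search by first name.
    public static List<string> Search(int field, string filter)
    {
        var rows = new List<string>();
        foreach (string[] emp in Employees)
        {
            // IndexOf(string) is case-sensitive, as in the app under test.
            if (emp[field].IndexOf(filter) >= 0)
                rows.Add(emp[0] + ", " + emp[1] + ", " + emp[2]);
        }
        return rows;
    }

    public static void Main()
    {
        foreach (string row in Search(0, "urk"))
            Console.WriteLine(row);                  // Burke, Brian, 02/02/1972
        Console.WriteLine(Search(0, "URK").Count);   // 0 (case-sensitive)
    }
}
```

A harness could then compare each returned row against the text it reads back from the list box.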

The Test Automation

The test scenario system consists of a single file. I decided to implement my test harness as a C# console application program, but as you'll see I could have used any .NET-compliant language (for example, Visual Basic® .NET) and the technique can be used from just about any type of program (for example, a Windows®-based application) or test framework (for example, NUnit). The overall structure of the scenario is shown in Figure 3. First, I add a reference to the "Microsoft Internet Controls" classic COM component. This is an alias for the shdocvw.dll module that owns the ability to manipulate Windows-based browsers including Internet Explorer and Windows Explorer. Next I add a reference to the Microsoft.mshtml .NET component. This is an alias for the mshtml.dll module that owns the ability to access HTML elements. I add "using" statements for the corresponding two namespaces so I don't have to fully qualify their classes. I also add "using" statements for System.Diagnostics so I can easily refer to its Process class, and for System.Threading so I can refer to the Thread.Sleep method to pause my automation when appropriate.

Figure 3 Test Scenario Structure

using System;
using System.Threading;
using System.Diagnostics;
using SHDocVw;
using mshtml;

namespace RunTest
{
  class Class1
  {
    static AutoResetEvent documentComplete = new AutoResetEvent(false);

    [STAThread]
    static void Main(string[] args)
    {
      try
      {
        // Launch IE
        // Attach InternetExplorer object
        // Establish DocumentComplete event handler
        // Load app under test
        // Manipulate the app
        // Check the app's state
        // Log 'pass' or 'fail'
        // Close IE
      }
      catch(Exception ex)
      {
        Console.WriteLine("Fatal error: " + ex.Message);
      }
    }

    private static void ie_DocumentComplete(object pDisp, ref object URL)
    {
      documentComplete.Set();
    }
  }
}

One of the keys to this technique is the ability to determine exactly when a Web page/document/app is fully loaded in Internet Explorer. I declare a class-scope AutoResetEvent object named documentComplete that I'll use to notify a waiting thread that a document is fully loaded:

static AutoResetEvent documentComplete = new AutoResetEvent(false);

I'll explain this in more detail in a moment. I start my test scenario by printing a status message to the command shell. Then I declare a Boolean variable "pass" and set it to false. I assume that the test scenario will fail, and if the final app state that I check is correct, I fix my assumption and set the pass variable to true. Next I declare an InternetExplorer object named "ie":

Console.WriteLine("\nStart test run");
bool pass = false;
InternetExplorer ie = null;

The InternetExplorer class is defined in the SHDocVw namespace. The class has many methods that manipulate an instance of Internet Explorer, but it's up to you to launch Internet Explorer and attach to it, as shown here:

// launch explorer
Console.WriteLine("\nLaunching an instance of IE");
Process p = Process.Start("iexplore.exe", "about:blank");
if (p == null) throw new Exception("Could not launch IE");
Console.WriteLine("Process handle = " + p.MainWindowHandle.ToString());

// find all active browsers
SHDocVw.ShellWindows allBrowsers = new SHDocVw.ShellWindows();
Console.WriteLine("Number active browsers = " + allBrowsers.Count);
if (allBrowsers.Count == 0) throw new Exception("Cannot find IE");

I use the static Start method of the Process class in the System.Diagnostics namespace to launch Internet Explorer (iexplore.exe) and load the empty page "about:blank"; Start returns a reference to a Process object for the created process. Next I instantiate a ShellWindows object named allBrowsers. This object holds references to all ShellWindow objects, or browsers, which include instances of Windows Explorer, the new instance of Internet Explorer that my test code just launched, and any previously launched instances of Internet Explorer. I use the Count property to display some diagnostic information about the current number of active browsers and to make sure that Internet Explorer launched successfully. The next phase of my test automation is to attach the new process to the InternetExplorer object:

Console.WriteLine("Attaching to IE");
for(int i=0; i < allBrowsers.Count && ie == null; i++)
{
  InternetExplorer e = (InternetExplorer)allBrowsers.Item(i);
  if (e.HWND == (int)p.MainWindowHandle) ie = e;
}
if (ie == null)  throw new Exception("Failed to attach to IE");

There may be several instances of Internet Explorer running, so I need to determine which one my test scenario launched so I can attach my InternetExplorer variable, ie, to the correct instance. Remember that I captured the test-launched Internet Explorer into a Process object, p. So I iterate through each of the ShellWindows objects checking to see if the handle/pointer matches the main window handle of the test-launched process. An alternative plan I sometimes use is to make the assumption that only my test Internet Explorer instance is allowed to be running. If there is more than one Internet Explorer instance running I throw an exception. This assumption allows me to attach the test Internet Explorer simply with the following line of code:

ie = (InternetExplorer)allBrowsers.Item(0);
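
Whichever approach you use, keep in mind that a newly started process may not have created its main window yet, so Process.MainWindowHandle can still be zero immediately after Process.Start returns. One way to guard against that is to poll for a short time before attaching. The sketch below is an assumption on my part rather than code from the download; the Poll class and its Until method are hypothetical names, and the demo uses a stand-in condition so it runs without a browser.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Poll
{
    // Hypothetical helper: calls condition() every intervalMs milliseconds
    // until it returns true or timeoutMs elapses. Returns the last result.
    public static bool Until(Func<bool> condition, int timeoutMs, int intervalMs)
    {
        Stopwatch watch = Stopwatch.StartNew();
        while (true)
        {
            if (condition()) return true;
            if (watch.ElapsedMilliseconds >= timeoutMs) return false;
            Thread.Sleep(intervalMs);
        }
    }

    public static void Main()
    {
        // Stand-in condition for the demo: becomes true on the third check.
        // In the harness the condition might instead be:
        //   () => { p.Refresh(); return p.MainWindowHandle != IntPtr.Zero; }
        int checks = 0;
        bool ok = Until(() => ++checks >= 3, 2000, 10);
        Console.WriteLine(ok + " after " + checks + " checks");
    }
}
```

If Until returns false, the harness could throw an exception in the same style as the "Could not launch IE" check.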

Exactly how you deal with this will depend on your particular testing situation. Now that I've established my test InternetExplorer object, I can register the DocumentComplete event handler that I mentioned earlier:

ie.DocumentComplete += new
  DWebBrowserEvents2_DocumentCompleteEventHandler(ie_DocumentComplete);

In essence I'm saying that when the InternetExplorer DocumentComplete event fires, call the user-defined ie_DocumentComplete method. If you refer back to the code listing in Figure 3 you'll see I defined that method as:

private static void ie_DocumentComplete(object pDisp, ref object URL)
{
  documentComplete.Set();
}

The ie_DocumentComplete method invokes the Set method of the AutoResetEvent object I declared earlier in my test class. In short, I now have the ability to pause my thread of execution until my InternetExplorer object is fully loaded. I'll show you exactly how to do that in a moment. Now I navigate to my Web application under test, waiting until the app is fully loaded:

Console.WriteLine("\nNavigating to the Web app");
object missing = Type.Missing;
ie.Navigate("https://localhost/LowLevelWebUIAutomationApp/WebForm1.aspx",
  ref missing, ref missing, ref missing, ref missing);
documentComplete.WaitOne();

I use the InternetExplorer.Navigate method to load my test Web app. Navigate accepts several optional arguments but in this case I don't need any of them. Notice that I call the WaitOne method of the documentComplete object that I prepared earlier. WaitOne will halt my thread of execution until the application is fully loaded into Internet Explorer. In this example I don't supply a timeout value so I could wait forever, but you'll probably want to pass WaitOne an integer value that represents the timeout in milliseconds. Next I set Internet Explorer to a fixed size and get a reference to the Web app's document:

Console.WriteLine("Setting IE to 525x420");
ie.Width = 525;
ie.Height = 420;
HTMLDocument theDoc = (HTMLDocument)ie.Document;

I declare an HTMLDocument variable and assign a value to it. The HTMLDocument interface is defined in the mshtml namespace. So just how did I know this? Figure 4 shows a screen shot of the Visual Studio .NET object browser. I expanded the mshtml interop assembly to see all of its interfaces, classes, events, and other objects.

Figure 4 Object Browser


Next I simulate checking the Last Name radio button and typing "urk" into the textbox control:

Console.WriteLine(
   "\nSelecting 'Last Name' radio button");
HTMLInputElement radioButton = 
   (HTMLInputElement)theDoc.getElementById("RadioButtonList1_0");
radioButton.@checked = true;

Console.WriteLine("Setting text box to 'urk'");
HTMLInputElement textBox = 
  (HTMLInputElement)theDoc.getElementById("TextBox1");
textBox.value = "urk";

These two blocks of code are similar and reasonably self-explanatory. I get a reference to an HTMLInputElement object using the getElementById method. After I have the object, I can manipulate it using one of its properties or methods. Here I use the checked property (because "checked" is a reserved word in C# I must use "@checked") for the radiobutton control and the value property of the textbox control. Clicking the Search button follows the same pattern, as you can see here:

Console.WriteLine("Clicking search button");
HTMLInputElement button =
  (HTMLInputElement)theDoc.getElementById("Button1");
button.click();
documentComplete.WaitOne();

In this case I need to invoke the WaitOne method to make sure that the page representing the search results is fully loaded. With a little bit of experimentation you'll find you can manipulate virtually any HTML element. For example, although I don't need to in this test scenario, I could simulate the selection of dropdown controls, the clicking of hyperlinks, and so forth. After I've manipulated the state of the Web application under test I must check the final state for correctness:

Console.WriteLine("\nSeeking 'Burke, Brian' in list box");
HTMLSelectElement selElement =
  (HTMLSelectElement)theDoc.getElementsByTagName(
    "select").item(0, null);
if (selElement.innerText.ToString().IndexOf("Burke, Brian") >= 0)
{
  Console.WriteLine("Found target string");
  pass = true;
}
else
{
  Console.WriteLine("*Target string not found*");
}

The general pattern is to get a reference to a collection of HTML elements by their common tag name, then get a specific element using the item method, and then get that element's innerText, which represents the string between the begin and end tags. Here I get a reference to all of the <select> elements, then use the collection's item method to get just the first <select> element, which is the only one on my Web page. This is the HTML generated by an ASP.NET ListBox control. The parameters to the item method are a bit tricky. The first parameter can be either an integer, in which case it's interpreted as a 0-based index value, or a string, in which case it's matched against the name or id of the elements in the collection. I pass null to the second parameter of the item method. This parameter is also an index value but is only used when the item method returns a collection instead of an atomic object. Sometimes you need to access values on the document body that are not part of any child HTML element. The following code snippet shows one way that you can do that:

Console.WriteLine("Seeking 'Search complete' in body");
HTMLBody body = (HTMLBody)theDoc.getElementsByTagName(
  "body").item(0, null);
if (body.createTextRange().findText("Search complete", 0, 0) == true)
{
  Console.WriteLine("Found target string");
  pass = true;
} 
else
{
  Console.WriteLine("*Target string not found*");
} 

I get a reference to the document body and use the findText method of the IHTMLTxtRange object returned from createTextRange to search for a target string. The two "0" arguments mean search from the beginning of the range and match partial strings. After launching Internet Explorer, loading the Web application under test, manipulating the app, and checking the application's state, all that remains to be done is to determine a pass or fail result and close Internet Explorer:

if (pass) Console.WriteLine("\nTest result = Pass\n");
else Console.WriteLine("\nTest result = *FAIL*\n");

Console.WriteLine("Closing IE in 3 seconds . . . ");
Thread.Sleep(3000);
ie.Quit();

Console.WriteLine("\nEnd test run");

In this case I simply log my test result to the command shell. You'll probably want to write your test results to a text file, XML file, or SQL Server database.
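
As a rough illustration of the XML option, the sketch below appends each result to a log file. It is not part of the code download: the ResultLog class, the element and attribute names, and the file name are all hypothetical, and it assumes a version of the .NET Framework that includes System.Xml.Linq.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class ResultLog
{
    // Appends one <result> element to the log file, creating the file
    // (with a <results> root element) if it does not exist yet.
    public static void Append(string path, string caseId, bool pass)
    {
        XDocument doc = File.Exists(path)
            ? XDocument.Load(path)
            : new XDocument(new XElement("results"));

        doc.Root.Add(new XElement("result",
            new XAttribute("id", caseId),
            new XAttribute("outcome", pass ? "Pass" : "FAIL"),
            new XAttribute("time", DateTime.Now.ToString("s"))));

        doc.Save(path);
    }

    public static void Main()
    {
        // Hypothetical log location and test case ID, for the demo only.
        string path = Path.Combine(Path.GetTempPath(), "testResults.xml");
        Append(path, "search-lastname-urk", true);
        Console.WriteLine(File.ReadAllText(path));
    }
}
```

In the scenario above, the call would take the place of the two Console.WriteLine statements that report Pass or FAIL.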

Adapting and Extending the Automation

The lightweight low-level Web application UI test automation technique I've presented here is available in the code download accompanying this column. You can extend and adapt the technique in several ways. One obvious enhancement is to make the test scenario fully automated. Because the test system creates an .exe file, you can easily schedule it to run without manual interaction, for example using the Windows Task Scheduler. You might also want to send test run result summaries via e-mail, using the System.Web.Mail namespace (or the System.Net.Mail namespace if you're using the .NET Framework 2.0). The test system I've presented here is a single program. This is simple and effective but makes harness reuse difficult. You may want to refactor the essential routines into a .NET class library.
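
To give a feel for that kind of refactoring, here is one possible way to wrap the document-wait logic in a small reusable class that also applies a timeout. This is a sketch under my own assumptions, not code from the download; the DocumentWaiter class and its Signal and Wait methods are hypothetical names, and the demo signals from a background thread in place of a real browser.

```csharp
using System;
using System.Threading;

// Hypothetical reusable wrapper around the column's AutoResetEvent.
public class DocumentWaiter
{
    private readonly AutoResetEvent loaded = new AutoResetEvent(false);

    // Call this from the DocumentComplete event handler.
    public void Signal()
    {
        loaded.Set();
    }

    // Blocks until Signal is called; throws if timeoutMs elapses first.
    public void Wait(int timeoutMs)
    {
        if (!loaded.WaitOne(timeoutMs))
            throw new TimeoutException(
                "Document did not finish loading within " + timeoutMs + " ms");
    }
}

public static class Demo
{
    public static void Main()
    {
        DocumentWaiter waiter = new DocumentWaiter();

        // Stand-in for the browser: signals from another thread after 100 ms.
        new Thread(() => { Thread.Sleep(100); waiter.Signal(); }).Start();
        waiter.Wait(5000);
        Console.WriteLine("Loaded");

        // Nothing signals this time, so the wait times out.
        try { waiter.Wait(200); }
        catch (TimeoutException ex) { Console.WriteLine(ex.Message); }
    }
}
```

With a wrapper like this, a page that never finishes loading produces an exception that the harness can log as a failure instead of hanging the test run.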

When I discuss a technique in this column, I almost always leave out most error checking, so you should write your own error-checking statements and add more granular try/catch/finally exception handling blocks to your test harness. In addition, all my input arguments have been hardcoded for clarity. You may want to parameterize the test harness to make it more flexible, or better yet use an XML file to drive the test.
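
One way an XML-driven harness might read its inputs is sketched below. The XML layout, the TestCase class, and the TestCaseLoader class are all hypothetical; only the control ID, search text, and expected string come from the scenario in this column. The sketch assumes a version of the .NET Framework that includes System.Xml.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical container for one test case's inputs and expected result.
public class TestCase
{
    public string Id;
    public string RadioButtonId;   // for example "RadioButtonList1_0"
    public string SearchText;
    public string Expected;
}

public static class TestCaseLoader
{
    // Parses a hypothetical <testcases> document into TestCase objects.
    public static List<TestCase> Parse(string xml)
    {
        var cases = new List<TestCase>();
        foreach (XElement e in XDocument.Parse(xml).Root.Elements("case"))
        {
            cases.Add(new TestCase
            {
                Id = (string)e.Attribute("id"),
                RadioButtonId = (string)e.Element("radio"),
                SearchText = (string)e.Element("text"),
                Expected = (string)e.Element("expected")
            });
        }
        return cases;
    }

    public static void Main()
    {
        string xml =
            "<testcases>" +
            "  <case id='001'>" +
            "    <radio>RadioButtonList1_0</radio>" +
            "    <text>urk</text>" +
            "    <expected>Burke, Brian</expected>" +
            "  </case>" +
            "</testcases>";

        foreach (TestCase tc in Parse(xml))
            Console.WriteLine(tc.Id + ": search '" + tc.SearchText +
                "', expect '" + tc.Expected + "'");
    }
}
```

The harness would then loop over the returned list, passing each case's values to getElementById and IndexOf in place of the hardcoded strings.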

Two nice features of the technique I've presented here are that you don't need to have access to the Web application's source code, and that because you're working at a low level you have full control over the application's UI. With Web application systems growing in complexity, testing your software is more important than ever before. The Web application UI testing I've described here can play an important part in your product testing effort.

Send your questions and comments for James to testrun@microsoft.com.

James McCaffrey works for Volt Information Sciences Inc., where he manages technical training for software engineers working at the Microsoft Redmond, Washington campus. He has worked on several Microsoft products including Internet Explorer and MSN Search. James can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.