Windows CE .NET Long-Haul Testing Scenarios and Results
Updated February 2003
Microsoft® Windows® CE .NET with Microsoft Platform Builder 4.0 and later
Summary: Learn about the importance of long-haul testing and its role in increasing overall quality, reliability, and stability of a Windows CE operating system (OS). The Windows CE QA team implemented a number of long-haul testing scenarios during the two previous product cycles and this article shows the goals, setups, and results of those tests. You can use these scenarios as examples for your own testing environment. We are continuously working to create new setups to exercise user scenarios identified by our customers. (10 printed pages)
Long-Haul Testing Scenarios
Residential Gateway Scenario
Externally-Exposed Residential Gateway Scenario
Web Server Scenario
Internet Appliance Scenario
Media Appliance Scenario
Long-haul testing is a method of improving software quality by running a set of tests, which represent common user scenarios, on stand-alone devices for as long as possible. This type of testing routinely uncovers bugs that might be missed by the functional, performance, and stress testing that are a normal part of any software release. Long-haul testing helps to find resource-leak, timing, hardware-related, and counter-overflow bugs.
Resource Leak Bugs
Long-haul testing is extremely effective in finding resource leaks, such as memory leaks, handle leaks, or critical section leaks, that are not easily found during development testing. For example, even though a 4-byte handle leak is negligible during one test cycle and usually does not affect the operation of a specified task, it can represent a significant memory drain when accumulated over thousands of test repetitions. In today's age of handheld devices and embedded systems, which are always on and tend to have much smaller memory footprints than desktop workstations, any memory leak is unacceptable.
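The arithmetic behind the 4-byte example can be sketched in a few lines. The following Python snippet is illustrative only: the function names, the 32 MB of free memory, and the rate of 10 requests per second are assumptions chosen for the example, not figures from our test setups.

```python
SECONDS_PER_DAY = 86400


def iterations_until_exhausted(free_bytes, leak_bytes_per_iteration):
    """Return how many iterations fit before the leaked bytes use up free memory."""
    if leak_bytes_per_iteration <= 0:
        raise ValueError("leak size must be positive")
    return free_bytes // leak_bytes_per_iteration


def days_until_exhausted(free_bytes, leak_bytes_per_iteration, iterations_per_second):
    """Convert the iteration budget into days of continuous running."""
    iterations = iterations_until_exhausted(free_bytes, leak_bytes_per_iteration)
    return iterations / iterations_per_second / SECONDS_PER_DAY


# Hypothetical device: 32 MB free, a 4-byte leak per request, 10 requests per second.
FREE_BYTES = 32 * 1024 * 1024
print(iterations_until_exhausted(FREE_BYTES, 4))           # 8388608
print(round(days_until_exhausted(FREE_BYTES, 4, 10), 1))   # 9.7
```

Under these assumed numbers, a leak that is invisible in a single test cycle would exhaust the device in under ten days, which is well inside the run lengths that long-haul testing targets.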
Long-haul testing also finds bugs that depend highly on timing or environment, and are periodic or random. For example, a race condition bug may occur when responses, good or bad, come from two or more servers in just the right sequence, or hardware interrupts occur at the same time. Statistically, you need a large number of test repetitions to hit just the right order of server responses or just the right thread creation or deletion times to reproduce this category of bug. Although these bugs are highly non-deterministic or random, they do tend to occur when tests are run over a long period of time.
Long-haul testing can find hardware-related bugs that occur when network traffic surges or power glitches cause hardware to fail in unexpected ways. Even though testers force failure conditions and try to have test cases for these bugs, they can occur in any code path and are difficult to test for using standard approaches.
Long-haul testing is well-positioned to find counter-overflow bugs. A counter-overflow bug occurs when a tracked value exceeds the size of the variable that is meant to store that value. For example, unsigned, byte-size variables can store up to 256 different values (0-255), so when an event occurs for the 256th time, the variable wraps around and stores a zero value. If this happens, timers would effectively wait forever, or counters would go backward.
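As a minimal sketch of the wraparound described above, the following Python snippet emulates an unsigned, byte-size counter by masking to 8 bits. The function name is hypothetical, and because Python integers do not overflow on their own, the mask stands in for the fixed-width variable.

```python
BYTE_MASK = 0xFF  # an unsigned byte holds the values 0 through 255


def increment_byte_counter(value):
    """Add one event to an 8-bit counter, wrapping the way a byte-size variable does."""
    return (value + 1) & BYTE_MASK


counter = 0
history = []
for event in range(1, 258):
    counter = increment_byte_counter(counter)
    history.append((event, counter))

print(history[254])  # (255, 255): the last value the byte can hold
print(history[255])  # (256, 0): the 256th event wraps the counter to zero
print(history[256])  # (257, 1): counting restarts, so the total appears to go backward
```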
To rectify this, a programmer can use a larger variable type, such as WORD, DWORD, or even QWORD, or account for the possibility of variable wraparound in the code. Counter or timer calculations can also mix signed and unsigned variables, which further complicates troubleshooting and testing. Outside of long-haul testing, a test would have to run long enough for each counter to reach its limit before these bugs appeared, which ordinary test passes rarely do.
To proactively catch these types of bugs, we set the initial millisecond counter of debug images to –200 seconds, so rollover happens very shortly after the device boots. Given the sheer number of counters and timers in an OS, however, it is not always possible to target individual ones this easily. For those, long-haul testing is the only available approach.
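The effect of presetting the counter can be illustrated with a short simulation. This is a sketch under stated assumptions, not the OS implementation: it assumes a 32-bit unsigned millisecond counter, and every function name is hypothetical. It contrasts a timeout check that breaks at rollover with one that uses modular subtraction.

```python
TICK_MASK = 0xFFFFFFFF          # assumed width: 32-bit unsigned millisecond counter
ROLLOVER_LEAD_MS = 200 * 1000   # counter starts 200 seconds before rollover


def initial_tick():
    """Boot value that places the rollover 200 seconds after startup."""
    return (-ROLLOVER_LEAD_MS) & TICK_MASK


def tick_after_boot(boot_tick, uptime_ms):
    """Counter value after uptime_ms milliseconds, wrapped to 32 bits."""
    return (boot_tick + uptime_ms) & TICK_MASK


def elapsed_ms(start_tick, now_tick):
    """Wraparound-safe elapsed time: subtract modulo 2**32."""
    return (now_tick - start_tick) & TICK_MASK


def naive_timeout_expired(start_tick, now_tick, timeout_ms):
    """Buggy pattern: the absolute deadline never wraps, so the timer waits forever."""
    return now_tick >= start_tick + timeout_ms


def safe_timeout_expired(start_tick, now_tick, timeout_ms):
    """Correct pattern: compare the wrapped difference against the timeout."""
    return elapsed_ms(start_tick, now_tick) >= timeout_ms


boot = initial_tick()
start = tick_after_boot(boot, 199_000)  # timer armed 1 second before rollover
now = tick_after_boot(boot, 205_000)    # checked 5 seconds after rollover

print(boot)                                     # 4294767296
print(elapsed_ms(start, now))                   # 6000
print(naive_timeout_expired(start, now, 5000))  # False
print(safe_timeout_expired(start, now, 5000))   # True
```

With the counter preset this way, a defect like the naive check shows up within minutes of boot instead of after weeks of uptime.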
Long-Haul Testing vs. Stress Testing
Long-haul testing should not be confused with stress testing. Stress testing intentionally puts a device under excessive load while possibly denying it resources to process the requests. The device may not ultimately process the requests, but the device is expected to fail gracefully, without corruption or loss of data. On the other hand, long-haul testing generally tries to approximate statistical usage of an average user against a device with enough resources to satisfy requests, repeated over a long period. While satisfying an individual test scenario or request might take seconds or minutes, long-haul testing retries it repeatedly over days and weeks, making sure that devices remain operable throughout the test period.
For more information about Windows CE testing approaches, see this Microsoft Web site.
Long-Haul Testing Scenarios
A long-haul testing scenario involves running a Windows CE–based embedded device to exercise OS features the way a typical user would during typical use.
We chose a number of real-world scenarios with the intent of matching targeted vertical platform descriptions of the shipping configurations in Platform Builder and adding variations based on possible user setups. Residential gateway devices, embedded Web servers, Internet appliances, and media playback devices emerged as likely devices for testing Windows CE user scenarios. We defined the following long-haul testing scenarios around them.
Long-term stability and survivability have become an important part of acceptance criteria for final sign-off and release of the Windows CE OS, increasing its overall reliability.
Residential Gateway Scenario
A typical residential gateway device provides a shared Internet connection and workgroup services to a networked small office or home. The device can provide services such as file sharing, Web hosting, and Internet connection sharing to the local network of computers, depending on the features enabled on the device. For example, a more advanced residential gateway device provides virtual private network (VPN) connections, authentication services, IPv6 routing, firewall protection, telnet protocol, and File Transfer Protocol (FTP).
The goal was to exercise as many features of the Residential Gateway configuration as possible. We completed one wireless (802.11) long-haul setup, and one wired (100 Mbps) long-haul setup.
The following table shows user actions that the residential gateway scenarios tested. These actions run concurrently across a Windows CE–based residential gateway device and exercise OS features the way a typical user would during typical use.
|User action||Description|
|File transfer||Servers copying real-world files, such as .txt, .htm, .exe, and .bin, back and forth between them in both directions at the same time. Files range in size from 1 byte to more than 10 MB. This file exchange produces both small and large packets, and both burst and sustained network loads.|
|Data transfer||Clients and servers creating connections, exchanging data, and disconnecting connections through a residential gateway device.|
|Internet radio||Home user listening to a radio station on the Internet through a residential gateway device.|
|Web session||Browsing client, which runs five simultaneous threads with random 2 to 15 second delays between requests, continuously hitting an on-device Web server.|
|FTP session||Users transferring files by using FTP from the corporate network to the home computer.|
|Remote Desktop Protocol (RDP) session||Home user connecting to corporate network.|
Note: Not all user actions were enabled for both the wired and wireless setups.
We tested the user actions through a wireless residential gateway device, over an 11 Mbps connection, using a Cisco PCX500 driver.
Figure 1. Wireless residential gateway setup
We tested the user actions through a wired residential gateway device, over a 100 Mbps connection, using a Realtek RTL8139 driver.
Figure 2. Wired residential gateway setup
Residential gateway results are monitored from release to release. The following table shows the test results for the Windows CE .NET 4.1 release.
|Wireless||More than 30 days, more than 600 MB transferred; more than 30 days, more than 1 terabyte transferred|
|Wired||More than 60 days, more than 30 terabytes transferred; more than 25 days, 12.8 terabytes transferred; more than 34 days, 17 terabytes transferred|
These are not maximum run times. They are, instead, the target times used to allow new images to be tested. The devices did not stop working; we stopped them so that new tests and images could be introduced.
In addition to the two formal residential gateway setups listed previously, we distributed residential gateway devices running Windows CE to team members for home use. The team members repeatedly reported that devices ran uninterrupted for months at a time (some as many as eleven months or longer). Scenarios in a home-based network include Web browsing from a home computer, file, program, and music downloads to a home computer, and remote access into the Microsoft corporate network either from a home computer or a laptop.
Externally-Exposed Residential Gateway Scenario
This scenario exposes Windows CE–based devices to an environment that resembles a typical home user's network, such as one or two machines connected to the Internet by using a digital subscriber line (DSL) bridge or cable modem. We want to configure and use this test environment as we would recommend our customers configure and use it.
The goal was to expose the device to all the perils of the Internet to allow further debugging and improvement. We attached monitoring mechanisms, so that if a device was compromised, the monitoring mechanisms' data could be used to isolate the problem and correct it before the device was reintroduced.
The externally-exposed residential gateway Web site ran on a Windows CE–based Internet appliance running the built-in Web server. We connected the Internet appliance to the Internet through a Windows CE–based home residential gateway device with Network Address Translation (NAT) enabled.
Both the Internet appliance and the residential gateway with NAT device ran on a Lanner Electronics EM-351 Single Board Computer (SBC).
Figure 3. Externally-exposed residential gateway setup
- Microsoft Internet Explorer has been running for more than 170 days.
- The residential gateway device has been running for more than 170 days.
As expected, these standalone devices experienced much higher run times than the networked devices from previous tests. In the past, we reset the networked devices mostly because of external events like power outages or network reconfigurations, or because we wanted to include the latest available features in our tests for the particular release.
Web Server Scenario
A number of applications in the computer industry that fit the Windows CE profile require a reliable Web server. For example, data collection or industrial controller applications can use Web servers to control devices remotely by using a Web browser. The Web-based sample interface to the Residential Gateway configuration that ships with Platform Builder is an example of this type of device control. Web servers can also collect data and put it online in real time.
The goal was to serve a mixture of test pages that exercise broad Web server functionality. Examples of these test pages were Active Server Pages (ASP) and .htm files that included .jpeg files, redirection links, server variables, and ISAPI scripts. To simulate real-world usage, a stress client running five simultaneous threads, each with random delays of 2 to 15 seconds between requests, continuously hit the Web server. This approximated users randomly browsing a complete Web content hierarchy.
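The shape of such a stress client can be sketched as follows. This is not the tool we used; it is a minimal Python illustration in which the function names, the `fetch` callback, and the `wait` parameter are all hypothetical. Only the thread count of five and the 2 to 15 second delay range come from the description above.

```python
import random
import threading


def browse_worker(urls, fetch, stop_event, rng, wait, min_delay=2.0, max_delay=15.0):
    """Request random pages until stop_event is set, pausing 2 to 15 s between requests."""
    while not stop_event.is_set():
        fetch(rng.choice(urls))
        wait(rng.uniform(min_delay, max_delay))


def start_stress_client(urls, fetch, stop_event, thread_count=5, wait=None, seed=None):
    """Start thread_count browsing threads and return them so the caller can join them."""
    # Event.wait doubles as an interruptible sleep: it returns early once the run is stopped.
    wait = wait or stop_event.wait
    threads = []
    for index in range(thread_count):
        # One random generator per thread keeps the delay sequences independent.
        rng = random.Random(None if seed is None else seed + index)
        thread = threading.Thread(
            target=browse_worker,
            args=(urls, fetch, stop_event, rng, wait),
            daemon=True,
        )
        thread.start()
        threads.append(thread)
    return threads
```

In a real run, `fetch` might wrap `urllib.request.urlopen` and record status codes and page counts, and the caller would set `stop_event` when the run ends. Passing `fetch` and `wait` as parameters is a design choice that lets the loop be exercised quickly with stand-ins before it is pointed at a device.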
Initially, we defined three setups. Each had different hardware, networking and Web server capabilities.
|Setup||Processor||Memory||Network adapter|
|Device 1||Intel Pentium III 600 MHz||32 MB||Realtek NE2000 10 Mbps PCMCIA|
|Device 2||Intel Pentium II 233 MHz||32 MB||Realtek NE2000 10 Mbps PCMCIA|
|Device 3||Intel Pentium III 750 MHz||64 MB||Realtek RTL8139 100 Mbps PCI|
Figure 4. Web server setup
The following table shows the results for tests run for the Windows CE .NET 4.0 and 4.1 releases.
|Device 1||24 days, more than 1,000,000 pages served; 30 days, more than 702,222 pages served|
|Device 2 (pages served over HTTPS)||55 days, more than 1,400,000 pages served; 35 days, more than 838,856 pages served|
|Device 3||57 days, more than 8,779,234 pages served; 24 days, more than 906,286 pages served; 9 days, more than 1,244,246 pages served*; 9 days, more than 1,210,390 pages served*|
*During the early test runs, we experimented with varying server loads and varying numbers of clients accessing the servers simultaneously. We stopped and refreshed images more often, usually after a trigger count was reached, for example when 1,000,000 pages had been served or a continuous run had lasted more than one week. We were able to then front-load and test a wider variety of possible client loads with the limited number of devices we had available. The final settings were then left in place and subsequent recorded runs showed the expected resiliency and reliability of the Web server.
All of these setups were running under the Platform Builder debugging environment, which affected the length of the run in about half of the cases. If the device loses contact with the Platform Builder debugger, the device image stops. This would not affect a production device, which does not have a connection to the debugger.
To address this issue, we added a fourth setup on a standalone device, as well as a standalone CEPC with a 1394 Web camera. We are monitoring these setups for future releases. See this Microsoft Web site for more information on how to set up a 1394 Web camera.
In addition to these Web server long-haul tests, a number of feature teams within Microsoft set up and administered Windows CE–based Web servers. These teams reported continuous runs of their particular configurations for months at a time, and under varying real user loads. We expect to track these results for future releases as well.
Internet Appliance Scenario
The Internet Appliance configuration provides functionality for non-mobile, browser-based, consumer devices. This long-haul testing scenario emphasizes using Microsoft Internet Explorer to browse the Internet for long periods of time.
The client browser program on each Windows CE–based device (see IE1 CEPC21127 and IE2 CEPC9525 in Figure 5) loaded randomly selected Web sites every 30 seconds. The Web sites were among the most heavily visited sites on the Internet. We instructed the browser to crawl links every five seconds after loading a Web site.
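The timing of this browsing loop can be sketched as a simple schedule. The following Python generator is an illustration only; the function name and the site names are hypothetical. It also assumes one reading of the description above: a new site loads at the start of each 30-second window, and links are followed at 5-second intervals for the rest of that window.

```python
import random


def browsing_schedule(sites, cycles, rng, site_interval=30, crawl_interval=5):
    """Yield (seconds_from_start, action, site) tuples for the browsing loop."""
    for cycle in range(cycles):
        start = cycle * site_interval
        site = rng.choice(sites)
        yield (start, "load", site)
        # Assumed interpretation: follow one link every crawl_interval seconds
        # until the next randomly selected site is due.
        for second in range(start + crawl_interval, start + site_interval, crawl_interval):
            yield (second, "crawl", site)


# Placeholder site names; the real list held heavily visited Web sites.
events = list(browsing_schedule(["site-a", "site-b", "site-c"], 2, random.Random(0)))
for seconds, action, site in events:
    print(seconds, action, site)
```

Under this reading, each 30-second window produces one page load and five link crawls, so a device issues roughly 12 requests per minute.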
We used Microsoft® Windows® CE .NET PC-based hardware development platforms (CEPCs) for the tests. Although we used the following hardware specifications for our CEPCs, the requirements are not stringent—all you need is a Windows CE–based device with network connectivity and a display.
|Network connectivity||Wired, 10 Mbps|
|Processor||Intel 400 MHz|
|Network card||Socket Low Power Ethernet CF Card|
|Video card||Tvia video card (flat driver)|
Figure 5. Internet appliance setup (two devices shown)
During the Windows CE .NET 4.1 release, the longest recorded run was approximately 20 days with more than 50,000 pages loaded. We stopped the test at that time to reload the updated image that contained fixes for detected memory leaks.
Media Appliance Scenario
The Media Appliance configuration provides functionality for a wide range of devices for which media delivery is the key feature. Media appliances include electronic book readers, electronic picture frames, audio devices, and media storage devices.
The goal was to play back a large number of MP3 files, averaging three minutes each, from a streaming server to a headless device. We played them continuously, one after the other, for as long as possible.
Figure 6. Media Appliance
The following table shows our test results for the Digital Audio Receiver, based on the final run on a released Windows CE .NET 4.1 OS image. Numbers from the previous release are listed alongside for comparison.
|Statistic||Windows CE .NET 4.1||Previous release|
|MP3 files played||6,602||5,797|
|Duration||20 days (480 hours)||17 days (408 hours)|
|Total bytes transferred||24.87 GB||22.44 GB|
|Average bytes/hour||57,461,788 bytes||not applicable|
During each run, statistics such as CPU load, memory use, and transfer counts were evaluated weekly, and devices were stopped for debugging as needed. We estimated that the memory leaks we detected and fixed were small enough to predict significantly longer runs. These runs were subsequently achieved. For example, the current run is more than 55 days and counting.
Long-haul testing has already proved to be an invaluable tool for increasing the overall quality of the Windows CE OS. Long-haul tests uncover bugs that would otherwise have been missed by the functionality, performance, or intense stress testing that are a normal part of any software release.
The majority of bugs that we uncovered were related to subtle memory leaks that did not show up during regular testing, or were masked by memory activity of other processes or modules in the system. But, once we put sustained scenarios in place and repeated them over longer periods of time, these bugs became easy for our team to discover. We used several tools that are available in Platform Builder, including CELog, lmemdebug, Kernel Tracker, and debug shell commands. Race condition bugs were rare and were highly dependent on timing and one-time events.
During the Windows CE .NET 4.0 and 4.1 test passes, we did not encounter counter-overflow bugs.
Since Windows CE .NET 4.1 was released, we have defined and configured additional feature-specific, long-haul testing scenarios that cover specific technology areas that could not be incorporated into the standard setups above. These long-haul testing scenarios are:
- IPv6 Web server
- Bluetooth data exchange
- Infrared data exchange
- Voice over IP (VoIP) sessions
- Point-to-Point Protocol (PPP) client/server connection and data exchange
- Point-to-Point Tunneling Protocol (PPTP) client/server connection and data exchange
- Point-to-Point Protocol over Ethernet (PPPoE) client/server connection and data exchange
- Remote Desktop Protocol (RDP) to Windows XP client connection and data exchange
- 1394 Web camera
- Shared printer server
- Shared file server
- Message Queuing data exchange
About the Author
Sergio Cherskov earned a BSEE in Telecommunications, with minors in Math and Physics, from the University of Split, Croatia. He has more than fifteen years of direct embedded systems programming, design, and testing experience, and has been a QA manager for five years on the Microsoft Windows CE Networking and Security (Windows CE QA) team. He has presented Microsoft testing methodology at the Windows Embedded Developers Conference, and real-time kernel testing at Quality Week.