Test Run

Randomness in Testing

Dr. James McCaffrey

Code download available at: TestRun2006_09.exe (162 KB)


Creating and using random test case data is an essential software testing skill. Although most test case data consists of specific inputs to the system under test and specific expected values/states, you will almost always want to subject your system to random test case inputs, too. Typically you do this to see if you can cause a crash or evoke an exception by throwing a large variety of inputs at the application. In this month's column, I'll explain four common tasks when dealing with random test case data in a Microsoft® .NET Framework environment:

- Generate pseudo-random numbers (Knuth algorithm)
- Analyze a pattern for randomness (Wald-Wolfowitz test)
- Shuffle a list of items (Fisher-Yates algorithm)
- Generate Gaussian numbers (Box-Muller algorithm)

Let's look at **Figure 1** for an example. The first part of the output shows the results of basic random number generation using the Random object of the .NET Framework. Even though you're probably familiar with this technique, I'll point out how to avoid common pitfalls. The second part of the output shows a very useful, but little-known, technique for analyzing a pattern of arbitrary symbols for randomness. This technique has a wide range of applications in software development in general, not just testing. The third part of **Figure 1** shows the result of shuffling a list of items; this is surprisingly tricky.

**Figure 1** Random Techniques Demonstration

I'll explain in detail why many shuffling implementations appear correct but, in fact, are quite wrong. The last part of the output in **Figure 1** shows the result of generating a set of numbers that are distributed according to the normal bell curve. Apart from being a highly useful technique, the implementation details of this algorithm are interesting in their own right and will be a valuable addition to your personal coding toolbox.

Generating Uniform Random Numbers

The most fundamental task in random test case generation is generating a random number—integer or floating point—in a particular range. This is often done via the System.Random class. Consider this code:

Random objRan = new Random(5);
int n = objRan.Next(7);
Console.WriteLine("A random int in the range [0,6] is " + n);
n = objRan.Next(3, 13);
Console.WriteLine("A random int in the range [3,12] is " + n);

I instantiate a Random object, passing in a seed value (5 in this example). The seed value sets the starting point for a sequence of numbers that exhibit many characteristics of true random numbers. Because the sequence is deterministic (the numbers are generated from a mathematical formula that uses as input the seed value or previous numbers in the sequence), the numbers generated by System.Random are technically pseudo-random numbers, but they're typically referred to as random numbers in informal usage or when the context is clear, as here. The seed value I chose is arbitrary. If I use the overloaded Random constructor that does not accept a seed value, a value derived from the system clock is used. If you need to recreate your sequence of random numbers on subsequent test runs, which is usually the case, you should supply a seed value. A discussion of pseudo-random number generator seed values is an important and complex topic and unfortunately outside the scope of this column.
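As a quick check of this determinism, two Random objects constructed with the same seed always produce the same sequence (a minimal sketch; the seed value 5 is arbitrary):

```csharp
using System;

class SeedDemo
{
    static void Main()
    {
        // Two generators with the same seed produce identical sequences
        Random r1 = new Random(5);
        Random r2 = new Random(5);
        for (int i = 0; i < 10; ++i)
        {
            int a = r1.Next(100);
            int b = r2.Next(100);
            if (a != b) throw new Exception("Sequences diverged");
        }
        Console.WriteLine("Same seed, same sequence");
    }
}
```

This is exactly why a supplied seed makes failing test runs reproducible: you can rerun the same "random" inputs on demand.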

The simplest way to generate a random integer is to call the Random.Next method, passing in a single integer argument. The return value is the next integer in the pseudo-random sequence, which is greater than or equal to 0 and strictly less than the argument. Therefore, the following call returns a number between 0 and 9 inclusive, not between 0 and 10 inclusive:

int n = objRan.Next(10);

An overload of the Random.Next method accepts two integer arguments and returns an integer greater than or equal to the first argument and strictly less than the second argument. If you were simulating test case data that corresponds to the roll of a normal die with six sides, to get a random number between 1 and 6 inclusive the call could look like this:

int roll = objRan.Next(1, 7);

It is easy to generate a randomly selected item from an array:

string[] items = new string[] { "alpha", "beta", "gamma", "delta" };
Console.WriteLine("A random member of { 'alpha', 'beta', " +
    "'gamma', 'delta' } is " + items[objRan.Next(items.Length)]);

If the size of an array is N, the call objRan.Next(N) will result in a return value that is an integer in the range [0, N-1], which corresponds exactly to the index values of the array. Notice this technique also works for ArrayList objects and, in fact, for any 0-based indexed collection.

Behind the scenes, the overloaded Random.Next methods are using the Knuth pseudo-random number-generating technique. This is also known as the "subtractive technique." Knuth published this algorithm in "Seminumerical Algorithms," Vol. 2 of *The Art of Computer Programming* (Addison-Wesley, 1981). Generating uniform pseudo-random numbers is quite difficult, but luckily the .NET Framework takes care of implementing the Knuth algorithm for you.

About the time the .NET Framework was first being developed, a new pseudo-random number-generating algorithm was discovered by Matsumoto and Nishimura. Their algorithm, commonly known as the Mersenne Twister technique, is quickly replacing older pseudo-random number-generating algorithms because of its generally superior performance and mathematical characteristics. See "Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator," *ACM Transactions on Modeling and Computer Simulation*, Vol. 8, No. 1, January 1998.

Generating a pseudo-random floating-point number is similar to generating a pseudo-random integer. Consider this code:

double x = (6.25 * objRan.NextDouble()) + 1.50;
Console.WriteLine("A random double in the range [1.50,7.75) is " +
    x.ToString("0.00"));

The call to Random.NextDouble returns a number greater than or equal to 0.0 and strictly less than 1.0. To generate a floating-point type number in the range [low, high) where the bracket-parens notation means greater than or equal to "low" and strictly less than "high," you can use the mini-formula:

double result = ((high-low) * objRan.NextDouble()) + low;

Because, with the exception of 0.0, most floating-point type values are approximations, you cannot technically generate a floating-point type number in the inclusive range [low, high] and must settle for an exclusive range [low, high). This formula is almost identical to the one used in the overload of Next that accepts an integral range, except with the result cast to an integer.
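To make that comparison concrete, here is a conceptual sketch of the integer version of the formula (this is only an illustration of the idea; it is not how System.Random actually implements the Next overload internally):

```csharp
using System;

class IntRangeDemo
{
    // Conceptual sketch: scale NextDouble() into [low, high) and
    // truncate to an integer, giving an int in [low, high - 1]
    static int NextInRange(Random r, int low, int high)
    {
        return low + (int)((high - low) * r.NextDouble());
    }

    static void Main()
    {
        Random objRan = new Random(5);
        for (int i = 0; i < 20; ++i)
        {
            int roll = NextInRange(objRan, 1, 7); // simulated die: 1..6
            if (roll < 1 || roll > 6) throw new Exception("Out of range");
            Console.Write(roll + " ");
        }
        Console.WriteLine();
    }
}
```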

Analyzing a Pattern for Randomness

Occasionally, you may need to examine a test case input or output pattern for evidence that it was generated randomly (for example, to test that a gambling simulation is, in fact, outputting random results). There are several statistical techniques for doing this, but the simplest is the Wald-Wolfowitz test. This test is sometimes called a one-sample runs test (where a "run" is a continuous series of the same number or character). The Wald-Wolfowitz test applies to a sequence of two symbols, for example:

ABBAAABBBABB

The principle is that for a given number of each of the two symbol types in a pattern, if the symbols are generated randomly (meaning, in this example, each position in the pattern has a 50 percent chance of being an A or a B), then you can calculate an expected number of runs in the pattern. A pattern run is a maximal sequence of consecutive identical symbols. So in the pattern just shown, there are six runs: A, BB, AAA, BBB, A, BB. If the actual number of runs in a pattern is either too large or too small, then there is evidence that the pattern was not generated randomly.
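As a quick sanity check, counting the runs in the example pattern (the one with runs A, BB, AAA, BBB, A, BB) is a one-line loop; a run boundary is simply a position where the symbol changes:

```csharp
using System;

class RunCountDemo
{
    static void Main()
    {
        string s = "ABBAAABBBABB"; // runs: A, BB, AAA, BBB, A, BB
        int runs = 1; // every non-empty pattern has at least one run
        for (int i = 0; i < s.Length - 1; ++i)
            if (s[i + 1] != s[i]) ++runs; // a symbol change starts a new run
        Console.WriteLine("Runs = " + runs); // 6
    }
}
```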

Although the Wald-Wolfowitz test applies only to patterns with just two symbol types, you can often map arbitrary patterns to a Wald-Wolfowitz form. For example, if you have a sequence of integers such as {7, 9, 3, 4, 1, 6}, you can compute their average (5.0) and then map the sequence to symbols H and L, where H represents a number higher than the average and L is a number lower than the average: H H L L L H. If you need to analyze more complex patterns, you can use tests such as the Chi-Square test or the Kolmogorov test.
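The mapping step described above can be sketched in a few lines (a minimal sketch; note that values exactly equal to the average are mapped to L here, an arbitrary tie-breaking choice the column does not specify):

```csharp
using System;

class MapDemo
{
    // Map a sequence of integers to an H/L pattern relative to the
    // average, so the Wald-Wolfowitz test can be applied to it
    static string ToHighLowPattern(int[] values)
    {
        double sum = 0.0;
        foreach (int v in values) sum += v;
        double avg = sum / values.Length;

        string pattern = "";
        foreach (int v in values)
            pattern += (v > avg) ? "H" : "L"; // ties map to L (arbitrary)
        return pattern;
    }

    static void Main()
    {
        int[] data = new int[] { 7, 9, 3, 4, 1, 6 }; // average = 5.0
        Console.WriteLine(ToHighLowPattern(data)); // HHLLLH
    }
}
```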

Now, let's assume that n1 is the number of the first type of symbol in a pattern and n2 is the number of the second kind of symbol. The expected number of runs, µ, in a randomly generated pattern is:

µ = 1 + ((2*n1*n2) / (n1 + n2))

and the variance, σ², is given by:

σ² = ((2*n1*n2) * (2*n1*n2 - n1 - n2)) / ((n1 + n2)² * (n1 + n2 - 1))
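To make the formulas concrete, here is a quick numeric check, assuming a pattern with ten of each symbol type (n1 = n2 = 10, so n1 + n2 = 20):

```csharp
using System;

class RunStatsDemo
{
    static void Main()
    {
        int n1 = 10, n2 = 10;
        int N = n1 + n2;

        // Expected runs: 1 + (2*10*10)/20 = 11
        double mu = 1 + ((2.0 * n1 * n2) / (n1 + n2));

        // Variance: (200 * (200 - 20)) / (400 * 19) = 36000/7600
        double variance = ((2.0 * n1 * n2) * (2.0 * n1 * n2 - N)) /
                          ((double)(N * N) * (N - 1));

        Console.WriteLine("Expected runs = " + mu);                       // 11
        Console.WriteLine("Variance      = " + variance.ToString("0.0000")); // 4.7368
    }
}
```

So for a random 20-symbol pattern with ten of each symbol, you would expect about 11 runs, give or take roughly two (the standard deviation is the square root of 4.7368, about 2.18).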

These two formulas come from probability theory. We can use the mean and variance to determine the probability that a particular pattern is the result of a random process. Suppose we start with a pattern of two symbols, expressed as a string:

string s = "XOOOXXXXOXXXXXXXOOOOXXXXXXXXXX";

Although I can manually count the number of Xs and Os, and the number of runs in the pattern, it's easier to let my program do it. First I determine the two symbol types in the pattern string:

char kind1 = s[0], kind2 = s[0];
for (int i = 0; i < s.Length && kind1 == kind2; ++i)
{
    if (s[i] != kind1) kind2 = s[i];
}

I assign the first character in the pattern string to the first kind of symbol, and then I scan through the pattern string until I find a second symbol type. Next I perform two simple error checks to make sure that I have at least two different characters in the pattern and that there aren't more than two characters in the pattern:

if (kind2 == kind1)
    throw new Exception("String must have two different types");
for (int i = 0; i < s.Length; ++i)
    if (s[i] != kind1 && s[i] != kind2)
        throw new Exception("String may only have two types");

If performance is a concern, you can recast these three passes through the pattern string into just one, at the expense of a little clarity.

Now I'm ready to count the number of each type of symbol in the pattern and the number of runs:

int n1 = 0, n2 = 0, runs = 1;
for (int i = 0; i < s.Length - 1; ++i)
{
    if (s[i] == kind1) ++n1;
    else if (s[i] == kind2) ++n2;
    if (s[i+1] != s[i]) ++runs;
}
if (s[s.Length-1] == kind1) ++n1;
else if (s[s.Length-1] == kind2) ++n2;

I scan the pattern, starting at the first character and continuing through the next-to-last character. If the current character matches one of the symbol types I determined earlier, I increment the appropriate counter. To calculate the number of runs in the pattern, I use the fact that a run boundary is marked by a change in symbol type: if the current character is different from the next character, I know I have another run, and I increment the runs counter. Because I stopped at the next-to-last character in the pattern string, I finish by examining the last character. I also start the runs counter at one instead of zero, since every pattern has at least one run by definition.

The Wald-Wolfowitz test is only valid if the number of each type of symbol in the pattern under analysis is eight or greater, so I perform that check:

if (n1 < 8 || n2 < 8)
    throw new Exception("n1 and n2 must both be 8 or greater for " +
        "this test to be meaningful");

At this point in the analysis, I have the number of each of the two symbol types and the actual number of runs in the pattern. Now I calculate the expected number of runs in the pattern if the two symbol types were generated randomly:

double expectedRuns = 1 + ((2.0*n1*n2) / (n1 + n2));

Then I calculate the variance of the number of runs if generated randomly, as shown here:

int N = n1 + n2;
double varianceNumerator = (2.0*n1*n2) * (2.0*n1*n2 - N);
double varianceDenominator = (double)((N*N) * (N-1));
double variance = varianceNumerator / varianceDenominator;

The next step in the analysis is to compute a normalized test statistic, z:

double z = (runs - expectedRuns) / Math.Sqrt(variance);

The z statistic is equal to the actual number of runs in the pattern, minus the expected number of runs in the pattern, divided by the standard deviation of the expected number of runs (which is just the square root of the variance). Interpreting the normalized number of runs in a pattern is easier than interpreting the actual number of runs. The interpretation code is simple, but somewhat subtle conceptually. It begins like this:

if (z < -2.580 || z > 2.580)
{
    Console.Write("Strong evidence (1%) pattern is not random: ");
    Console.WriteLine(z < -2.580 ? "Too few runs." : "Too many runs.");
}

I perform a so-called two-tailed test at the one percent significance level. If the normalized test statistic is greater in absolute magnitude than 2.580, then this means, loosely speaking, that there is less than a one percent chance that the pattern under analysis was generated by a random process. The value 2.580 comes from statistical tables. If the test statistic is negative, the actual number of runs is smaller than the expected number of runs, and vice versa. In **Figure 2**, I also look for weaker evidence that the pattern is not random. Notice you cannot ever state with certainty that a given pattern was generated randomly; you can only test to see if there is statistical evidence that the pattern was not generated randomly.

Figure 2 Evidence That a Pattern is Not Random

if (z < -2.580 || z > 2.580)
{
    Console.Write("Strong evidence (1%) pattern is not random: ");
    Console.WriteLine(z < -2.580 ? "Too few runs." : "Too many runs.");
}
else if (z < -1.960 || z > 1.960)
{
    Console.Write("Moderate evidence (5%) pattern is not random: ");
    Console.WriteLine(z < -1.960 ? "Too few runs." : "Too many runs.");
}
else if (z < -1.645 || z > 1.645)
{
    Console.Write("Mild evidence (10%) pattern is not random: ");
    Console.WriteLine(z < -1.645 ? "Too few runs." : "Too many runs.");
}
else
    Console.WriteLine("Insufficient evidence to suggest pattern " +
        "is not random");

Shuffling a List of Items

Let's take a look at shuffling a list of items. This is useful when you have a collection of test case inputs and you want to feed all of them to a system under test in a random order. You can think of shuffling a list as generating a random permutation of the items. This surprisingly tricky problem has been the subject of quite a bit of research. The best general-purpose shuffling algorithm is called the Fisher-Yates algorithm. It is also sometimes called the Knuth shuffle. This algorithm is extremely simple. Suppose I have an array of items:

string[] animals = new string[] { "ant", "bat", "cow", "dog", "elk", "fox" };

Shuffling that list of animals using the most common form of the Fisher-Yates algorithm will look like this:

for (int i = 0; i < animals.Length; i++)
{
    int r = objRan.Next(i, animals.Length);
    string temp = animals[r];
    animals[r] = animals[i];
    animals[i] = temp;
}

I iterate through each index of the array to be shuffled. I select a random location between the current index and the end of the array and swap the items at the current index and the random index. Incorrect shuffling algorithms are very common; the following attempt is particularly tricky:

for (int i = 0; i < animals.Length; i++)
{
    // int r = objRan.Next(i, animals.Length); // correct
    int r = objRan.Next(0, animals.Length); // incorrect
    string temp = animals[r];
    animals[r] = animals[i];
    animals[i] = temp;
}

This code will build and execute, but the resulting rearranged list will be biased towards certain arrangements of items. To illustrate, suppose the original list to be shuffled has just three items, {ABC}. The goal of a shuffle is to produce a random arrangement of the three items, where each arrangement is equally likely. The table in **Figure 3** shows all 27 possible results using the incorrect shuffling algorithm just shown.

Figure 3 Results of an Incorrect Shuffling Algorithm

1st Pass | 2nd Pass | 3rd Pass | Result
---|---|---|---
(i=0, r=0) | (i=1, r=0) | (i=2, r=0) | {CAB}
(i=0, r=0) | (i=1, r=0) | (i=2, r=1) | {BCA}
(i=0, r=0) | (i=1, r=0) | (i=2, r=2) | {BAC}
(i=0, r=0) | (i=1, r=1) | (i=2, r=0) | {CBA}
(i=0, r=0) | (i=1, r=1) | (i=2, r=1) | {ACB}
(i=0, r=0) | (i=1, r=1) | (i=2, r=2) | {ABC}
(i=0, r=0) | (i=1, r=2) | (i=2, r=0) | {BCA}
(i=0, r=0) | (i=1, r=2) | (i=2, r=1) | {ABC}
(i=0, r=0) | (i=1, r=2) | (i=2, r=2) | {ACB}
(i=0, r=1) | (i=1, r=0) | (i=2, r=0) | {CBA}
(i=0, r=1) | (i=1, r=0) | (i=2, r=1) | {ACB}
(i=0, r=1) | (i=1, r=0) | (i=2, r=2) | {ABC}
(i=0, r=1) | (i=1, r=1) | (i=2, r=0) | {CAB}
(i=0, r=1) | (i=1, r=1) | (i=2, r=1) | {BCA}
(i=0, r=1) | (i=1, r=1) | (i=2, r=2) | {BAC}
(i=0, r=1) | (i=1, r=2) | (i=2, r=0) | {ACB}
(i=0, r=1) | (i=1, r=2) | (i=2, r=1) | {BAC}
(i=0, r=1) | (i=1, r=2) | (i=2, r=2) | {BCA}
(i=0, r=2) | (i=1, r=0) | (i=2, r=0) | {ACB}
(i=0, r=2) | (i=1, r=0) | (i=2, r=1) | {BAC}
(i=0, r=2) | (i=1, r=0) | (i=2, r=2) | {BCA}
(i=0, r=2) | (i=1, r=1) | (i=2, r=0) | {ABC}
(i=0, r=2) | (i=1, r=1) | (i=2, r=1) | {CAB}
(i=0, r=2) | (i=1, r=1) | (i=2, r=2) | {CBA}
(i=0, r=2) | (i=1, r=2) | (i=2, r=0) | {BAC}
(i=0, r=2) | (i=1, r=2) | (i=2, r=1) | {CBA}
(i=0, r=2) | (i=1, r=2) | (i=2, r=2) | {CAB}

The first row of the table in **Figure 3** means that on the first iteration through the shuffling loop, the value of i is 0, and r, the random index, is 0. Because the initial list is {ABC}, after the swap, the list is still {ABC}. On the second iteration, i is 1 and the random index is 0. After the swap, the list is now {BAC}. On the last iteration, i is 2 and the random index is 0. After the swap, the final list ordering is {CAB}. There are six possible final arrangements of the three items. If you count how many times each result occurs in the table in **Figure 3**, you'll get:

{ABC} = 4 times
{ACB} = 5 times
{BAC} = 5 times
{BCA} = 5 times
{CAB} = 4 times
{CBA} = 4 times

Figure 4 Results of a Correct Shuffling Algorithm

1st Pass | 2nd Pass | 3rd Pass | Result
---|---|---|---
(i=0, r=0) | (i=1, r=1) | (i=2, r=2) | {ABC}
(i=0, r=0) | (i=1, r=2) | (i=2, r=2) | {ACB}
(i=0, r=1) | (i=1, r=1) | (i=2, r=2) | {BAC}
(i=0, r=1) | (i=1, r=2) | (i=2, r=2) | {BCA}
(i=0, r=2) | (i=1, r=1) | (i=2, r=2) | {CBA}
(i=0, r=2) | (i=1, r=2) | (i=2, r=2) | {CAB}

Or in other words, not all arrangements are equally likely. And notice that A occurs in the first position 9 times, B occurs in the first position 10 times, and C occurs in the first position 8 times. If you used this incorrect shuffling algorithm in a gambling game simulation, you would have a serious problem.

With the correct shuffling algorithm, you end up with the possible results shown in **Figure 4** (notice that an r value is never less than an i value). This time, the probability of achieving each of the six possible final arrangements of the three items is equal, as are the probabilities that a letter will appear in a particular position. Also note that the third pass doesn't need to be done, since it can only ever swap the last value in the array with itself. Hence, the loop in the correct algorithm can go to animals.Length - 1 rather than to animals.Length.
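Putting that together, here is a sketch of the correct shuffle with the shortened loop; a seeded Random is passed in so the shuffle is reproducible across test runs, and the check at the end confirms the result is still a permutation of the original items:

```csharp
using System;

class ShuffleDemo
{
    // Fisher-Yates shuffle; the loop stops at Length - 1 because the
    // final pass could only ever swap the last item with itself
    static void Shuffle(string[] a, Random r)
    {
        for (int i = 0; i < a.Length - 1; i++)
        {
            int j = r.Next(i, a.Length); // j is never less than i
            string temp = a[j];
            a[j] = a[i];
            a[i] = temp;
        }
    }

    static void Main()
    {
        string[] animals = { "ant", "bat", "cow", "dog", "elk", "fox" };
        Shuffle(animals, new Random(5));
        Console.WriteLine(string.Join(" ", animals));

        // Still a permutation: all six animals present exactly once
        Array.Sort(animals);
        if (string.Join(" ", animals) != "ant bat cow dog elk fox")
            throw new Exception("Not a permutation");
    }
}
```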

Generating Normal/Gaussian Numbers

The fourth technique I'll demonstrate in this month's column is generating numbers from a bell-shaped distribution, usually called the normal or Gaussian distribution.

Suppose you want to generate some test case input data that corresponds to the heights of a group of people. A very clever technique called the Box-Muller algorithm is one way to generate normally distributed pseudo-random numbers. The code that created the output shown in **Figure 1** begins with:

Gaussian g = new Gaussian();
double ht;
int outliers = 0;
Console.WriteLine("Generating 100 random heights from a " +
    "normal/Gaussian population with " +
    "mean = 68.0 in. and sd = 6.0 in. ->");

I instantiate a program-defined Gaussian object. This object does all the work and uses the Box-Muller algorithm. The variable ht will hold a normally distributed value. I also initialize a counter to keep track of outlier values, that is, those that are far above or below the mean height. Keeping track of outliers will give me a rudimentary way to verify that my random heights are, in fact, normally distributed.

**Figure 5** lists the code that generates and displays my random heights.

Figure 5 Counting the Outliers

for (int i = 0; i < 100; ++i)
{
    if (i % 10 == 0) Console.WriteLine();
    ht = g.NextGaussian2(68.0, 6.0);
    Console.Write(ht.ToString("0.00"));
    if (ht < 56.0 || ht > 80.0)
    {
        Console.Write("* ");
        ++outliers;
    }
    else
        Console.Write(" ");
}

I print a new line every 10 values by using the modulus (%) operator, just to keep my output tidy. The random height from a normal distribution with mean 68.0 inches and standard deviation 6.0 inches is returned by a call to the Gaussian.NextGaussian2 method, which I'll describe in detail in a moment. I keep track of outliers by watching for values that are less than 56.0 inches or greater than 80.0 inches. These are values that are two standard deviations (6.0 * 2 = 12.0 inches) above or below the mean of 68.0 inches. Statistically, there is about a five percent probability that a randomly generated value is outside of plus or minus two standard deviations of the mean. So if I generate 100 random heights, as I'm doing here, I expect about five outliers. If I get many more or many fewer than five, I need to double-check my code. Notice that in the sample run that is shown in **Figure 1**, I get exactly five outliers, which gives me some reassurance that my randomly generated height data points are in fact normally distributed.

The principle behind the Box-Muller algorithm is brilliant, but the result is fairly simple. If you have two uniform random numbers, U1 and U2, in the range [0,1) (as described in the first section of this column), then you can compute a normally distributed random number, Z, with either one of these two equations:

Z = R * cos( θ ) or Z = R * sin( θ )

where

R = sqrt(-2 * ln(U2)) and θ = 2 * π * U1

The normal value Z has a mean equal to 0 and a standard deviation equal to 1, and you can map Z to a statistic X with mean m and standard deviation sd using the equation:

X = m + (Z * sd)

The simplest way to implement a Gaussian class with a NextGaussian method is represented by the code in **Figure 6**.

Figure 6 Gaussian Implementation

class Gaussian
{
    private Random r = new Random();

    public double NextGaussian1(double mean, double sd) // least efficient
    {
        double u1 = r.NextDouble();
        double u2 = r.NextDouble();
        double left = Math.Cos(2.0 * Math.PI * u1);
        double right = Math.Sqrt(-2.0 * Math.Log(u2));
        double z = left * right;
        return mean + z * sd;
    }
}

I use the Math.Cos version, but I could just as easily have used the Math.Sin version. This implementation works but is quite inefficient. Because the Box-Muller algorithm can compute a normally distributed Z value using either the sin or cos function, I might as well compute two Z values at the same time, save the second Z value, and then on the next call to the NextGaussian method, retrieve the saved value. Such an implementation is presented in **Figure 7**.

Figure 7 A More Efficient Box-Muller Implementation

class Gaussian
{
    private Random r = new Random();
    private bool use_last_result = false; // flag for NextGaussian2()
    private double z2 = 0.0; // secondary result for NextGaussian2()

    public double NextGaussian2(double mean, double sd) // semi-efficient
    {
        if (use_last_result)
        {
            use_last_result = false;
            return mean + z2 * sd;
        }
        else
        {
            double u1 = r.NextDouble();
            double u2 = r.NextDouble();
            double left1 = Math.Cos(2.0 * Math.PI * u1);
            double left2 = Math.Sin(2.0 * Math.PI * u1);
            double right = Math.Sqrt(-2.0 * Math.Log(u2));
            double z1 = left1 * right;
            z2 = left2 * right;
            use_last_result = true;
            return mean + z1 * sd;
        }
    }
}

While this approach works well, it still has some inefficiency: calls to the Math.Sin, Math.Cos, and Math.Log methods are relatively expensive. A very clever efficiency improvement is to use a mathematical trick. If you review the definitions of R and θ, it turns out that they correspond to the polar coordinates of a random point inside a unit circle. The trick is to compute the coordinates of a random point inside the unit square (which avoids calls to the Math.Sin and Math.Cos methods) and determine whether that random point falls inside the unit circle. If so, we can use the coordinates; if not, we compute a new set of random coordinates and try again. About 78 percent of randomly generated points will fall inside the unit circle, and this provides better performance, obviously at the expense of clarity.

The unit square trick is illustrated in **Figure 8**. The basic Box-Muller algorithm selects a point with polar coordinates (R, θ) that is guaranteed to be inside the unit circle. Alternatively, you can choose rectangular coordinates inside the unit square that encompasses the unit circle; the point (x1, y1) is inside the unit circle, but the point (x2, y2) falls outside the unit circle. An implementation of the unit square approach is listed in **Figure 9**.

Figure 9 Unit Square Implementation

class Gaussian
{
    private Random r = new Random();
    private bool use_last_result = false; // flag for NextGaussian3()
    private double y2 = 0.0; // secondary result for NextGaussian3()

    public double NextGaussian3(double mean, double sd) // most efficient
    {
        double x1, x2, w, y1 = 0.0;
        if (use_last_result) // use answer from previous call?
        {
            y1 = y2;
            use_last_result = false;
        }
        else
        {
            do
            {
                x1 = 2.0 * r.NextDouble() - 1.0;
                x2 = 2.0 * r.NextDouble() - 1.0;
                w = (x1 * x1) + (x2 * x2);
            } while (w >= 1.0); // are x1 and x2 inside unit circle?
            w = Math.Sqrt((-2.0 * Math.Log(w)) / w);
            y1 = x1 * w;
            y2 = x2 * w;
            use_last_result = true;
        }
        return mean + y1 * sd;
    }
}

**Figure 8** Unit Square Trick

In software testing scenarios, performance is often not a major concern, so any one of the three implementations I presented will be acceptable. But in software development, particularly if you are performing a simulation, performance can be a major issue. Although the Box-Muller algorithm is both efficient and relatively simple to implement, there are alternative algorithms that generate normal/Gaussian pseudo-random numbers. One high-efficiency alternative is called the ziggurat method.

Wrapping Up

Let me briefly summarize. The ability to generate random test case input data is an essential software testing skill. The .NET Framework contains a System.Random class that you can use to generate uniformly distributed integer or floating point pseudo-random numbers in a particular range. The major pitfall to watch for is to make sure you specify the endpoints of the range properly.

The Wald-Wolfowitz test can be used to analyze a pattern that contains two symbols for evidence that the pattern was generated randomly. You can use this test to analyze your random test case input or perform an analysis of the output of a system under test.

The best general purpose shuffling algorithm is the Fisher-Yates algorithm. It is very simple and there is rarely any need to use a different technique. However, even a tiny deviation from the correct algorithm can produce an algorithm that appears correct but is fatally flawed.

You can use the Box-Muller algorithm to generate pseudo-random numbers with a normal distribution. The math behind the Box-Muller algorithm is deep but the implementation is surprisingly short. There are several ways to implement the Box-Muller algorithm, trading off efficiency for clarity.

Send your questions and comments for James to testrun@microsoft.com.

**Dr. James McCaffrey** works for Volt Information Sciences Inc., where he manages technical training for software engineers working at Microsoft. He has worked on several Microsoft products including Internet Explorer and MSN Search. James can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.