December 2014

Volume 29 Number 12

Test Run : Fireworks Algorithm Optimization

James McCaffrey

James McCaffreyMany machine learning software systems use numerical optimization. For example, in logistic regression classification, training the classifier is the process of finding a set of values for the weights associated with the input variables so that for a collection of training data, the computed output values closely match the known output variable values. In other words, the goal is to optimize (minimize) the error between computed and desired output values.

There are many different numerical optimization algorithms. Back-propagation is based on classical calculus techniques and is often used with neural networks. Particle swarm optimization mimics group behavior such as the flocking of birds. Evolutionary algorithm optimization, a form of genetic algorithm optimization, models the biological processes of chromosome replication.

This article describes a relatively new (first published in 2010) optimization technique called fireworks algorithm optimization (FAO). The technique doesn’t explicitly model or mimic the behavior of fireworks, but when the algorithm is visually displayed, the resulting image resembles the geometry of exploding fireworks.

The best way to see where this article is headed and to get an idea of what FAO is, is to take a look at the demo program in Figure 1. The goal of the demo program is to use FAO to find the minimum value of Ackley’s function with 10 variables. Ackley’s function is a standard benchmark for evaluating the effectiveness of numerical optimization algorithms. The demo version has a minimum value of 0.0 located at (0, 0, 0, 0, 0, 0, 0, 0, 0, 0). Ackley’s function is tricky because it has many local minimum solutions that can trap search algorithms before finding the one global minimum. A graph of Ackley’s function of two variables is shown in Figure 2.

Fireworks Algorithm Optimization in Action
Figure 1 Fireworks Algorithm Optimization in Action

Ackley’s Function of Two Variables
Figure 2 Ackley’s Function of Two Variables

The demo program sets the number of fireworks to 5. More fireworks increase the chance of finding a true optimal solution, at the expense of performance. FAO typically works well with 3 to 10 fireworks. FAO is an iterative process and requires a maximum loop counter value. A loop counter variable in machine learning optimization is often named “epoch” and the demo sets the maximum value to 1,000 iterations. This is a bit small, intended for demo purposes, and in realistic scenarios, values from 10,000 to 100,000 are typical.

In the demo run in Figure 1, the best (smallest) error associated with the best position found so far is displayed every 100 epochs. Notice that FAO starts converging very quickly. After 1,000 epochs, the best position found is (0.034 0.098 0.003 0.132 -0.054 0.181 -0.018 0.051 0.004 -0.023) and the associated error is 0.41. If the maximum number of epochs is increased to 10,000, then FAO will in fact find the optimal solution. FAO, like most numerical optimization algorithms, isn’t guaranteed to find an optimal solution in realistic scenarios where the optimal solution isn’t known.

This article assumes you have at least intermediate programming skills, but doesn’t assume you know anything about numerical optimization or FAO. The demo program is coded using C#, but you shouldn’t have too much difficulty refactoring the code to another language, such as JavaScript or Python.

The demo code is a bit too long to present in its entirety, but the complete source code is available in the code download that accompanies this article. The demo code has most normal error checking removed to keep the main ideas as clear as possible and the size of the code small.

Overall Program Structure

The overall program structure, with a few minor edits to save space, is presented in Figure 3. To create the demo, I launched Visual Studio and created a new C# console application named Fireworks­Algorithm. The demo has no significant .NET dependencies, so any recent version of Visual Studio will work.

Figure 3 Overall Program Structure

using System;
using System.Collections.Generic;
namespace FireworksAlgorithm
{
  class FireworksProgram
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nBegin fireworks algorithm demo\n");
      Console.WriteLine("Goal is to solve Ackley's function");
      Console.WriteLine("Function has min value = 0.0 at (0, .. ))");
      int dim = 10;
      int n = 5;    // Number of fireworks
      int maxEpochs = 10000;
      Console.WriteLine("\nSetting Ackley's dimension to " + dim);
      Console.WriteLine("Setting number fireworks to " + n);
      Console.WriteLine("Setting maxEpochs to " + maxEpochs);
      Console.WriteLine("\nBegin algorithm\n");
      double[] bestPosition = Solve(dim, n, maxEpochs);
      Console.WriteLine("\nAlgorithm complete");
      Console.WriteLine("\nBest solution found: ");
      for (int i = 0; i < dim; ++i)
        Console.Write(bestPosition[i].ToString("F3") + " ");
      Console.WriteLine();
      double error = Error(bestPosition);
      Console.WriteLine("\nError of best solution found = " +
        error.ToString("F5"));
      Console.WriteLine("\nEnd fireworks algorithm demo\n");
      Console.ReadLine();
    } // Main
    public static double Error(double[] position) { . . }
    public static double[] Solve(int dim, int n, int maxEpochs) { . . }
    private static int[] PickDimensions(int dim, int z, int seed) { . . }
    private static double YMax(Info[] fireworks) { . . }
    private static double YMin(Info[] fireworks) { . . }
    private static int[] NumberSparks(Info[] fireworks, int m,
      double a, double b) { . . }
    private static double[] Amplitudes(Info[] fireworks, int A,
      int epoch, int maxEpochs, double minX, double maxX) { . . }
    private static double MinAmplitude(int epoch, int maxEpochs,
      double minX, double maxX) { . . }
    private static void AddGaussianSparks(Info[] fireworks,
      List<Info>[] sparksList, int dim, int mHat, int epoch, double minX,
      double maxX, double[] bestPosition, ref double bestError, Random rnd)
  }
  public class Info
  {
    public double[] position;
    public double error;
  }
} // ns

After the template code loaded into the Visual Studio editor, in the Solution Explorer window I renamed file Program.cs to the more descriptive FireworksProgram.cs and Visual Studio automatically renamed class Program for me. At the top of the source code, I deleted all using statements that pointed to unneeded namespaces, leaving just the references to System and Collections.Generic.

I coded the demo using a static-method approach rather than an object-oriented programming (OOP) approach. The demo has all the control logic in the Main method, which calls two public methods, Solve and Error. The essential calling statements are:

int dim = 10;
int n = 5;
int maxEpochs = 1000;
double[] bestPosition = Solve(dim, n, maxEpochs)
double error = Error(bestPosition);

The variable dim holds the number of problem dimensions. In the case of machine learning training, this would normally be the number of model weights to determine. Variable n is the number of fireworks. I use as much as possible the rather terse variable names, such as n, that are used in the research papers on FAO so you can use those papers as a resource more easily. The FAO technique is contained in method Solve. Method Error accepts a position (an array of 10 values), computes the value of Ackley’s function for those values, and returns the average of the sum of squared differences between the computed outputs and the known minimum of 0.0.

The demo program has a utility class named Info that encap­sulates a position and its associated error. The class has no associated methods, such as an explicit constructor, in order to make it easier for you to refactor the demo code to languages with limited OOP support.

Understanding the Algorithm

The fireworks algorithm process is illustrated in the graph in Figure 4. The image represents a simplified dummy minimization problem, not the Ackley’s function of the demo program. The goal of the dummy problem is to minimize some arbitrary function that has two input variables, X and Y, where the function has a minimum value of 0.0 located at X = 0 and Y = 0.

The Fireworks Algorithm
Figure 4 The Fireworks Algorithm

The image in Figure 4 uses three fireworks. Each firework has so-called sparks. There are two kinds of sparks, regular and Gaussian. Figure 5 shows the fireworks algorithm in very high-level pseudo-code.

Figure 5 The Fireworks Algorithm in High-Level Pseudo-Code

initialize 3 fireworks to random positions
loop until done
  for each firework
    calculate the amplitude of the firework
    calculate the number of regular sparks
    generate the regular sparks
  end for
  generate special Gaussian sparks
  evaluate each spark
  from the list of sparks,
    select 3 to act as positions of new fireworks
  create 3 new fireworks
end loop
return the position of the best spark found

For the image in Figure 4, the positions of the three fireworks are indicated by the colored dot markers at (-6, 2), (3, 4), and (3, -3). Notice that the first firework is the worst because it’s farthest away from the target position of (0, 0) and that the third firework is the best because it’s closest. The amplitudes are indicated by the dashed circles around each firework. Good fireworks have smaller amplitudes and bad fireworks have larger amplitudes.

Each firework will generate a different number of regular (non-Gaussian) sparks. Bad fireworks will generate fewer sparks than good fireworks. In Figure 4, firework[0] generates three sparks, firework[1] generates four sparks, and firework[2] generates five sparks. Each regular spark will be located within its parent firework’s amplitude. Because good fireworks have small amplitude and relatively many sparks, the algorithm search will focus near good fireworks. Bad fireworks shouldn’t be entirely neglected because they could lead to an optimal solution that’s located out of range of the current fireworks. The Gaussian sparks are generated in such a way that they tend to be located between a firework and the best-known position of any spark encountered during the search.

After all regular and Gaussian sparks have been generated, each is evaluated. New fireworks for the next round of explosions are selected from the current sparks. The position of the best spark is retained as one of the new fireworks to intensify search in good locations. The position of the worst spark is retained to maintain search diversity and prevent the algorithm from quickly collapsing onto a solution that may not be optimal. The positions of the remaining fireworks, just one firework in the simple example in Figure 4, are randomly selected from the current sparks.

The process of generating fireworks and then sparks is iterated until some stopping condition is met. The stopping condition is just a maximum loop counter value in the case of the demo program. When using FAO for machine learning training, the stopping condition might also include an error threshold so that when error drops below some small specified value that depends on the particular problem being solved, the processing loop terminates.

Implementing the Fireworks Algorithm

The definition of method Solve begins as:

public static double[] Solve(int dim, int n, int maxEpochs)
{
  int m = n * 10;
  int mHat = 5;
  double a = 0.04;
  double b = 0.8;
  int A = 40;
  double minX = -10.0;
  double maxX = 10.0;
...

Variable m is the total number of regular sparks, which will be allocated to the n fireworks. Variable mHat (so named because research papers use a lowercase m with a carat over it) is the number of special Gaussian sparks to generate. Variables a and b control the minimum and maximum number of sparks for any particular firework. Variable A is the maximum amplitude for any firework. Variables minX and maxX hold the smallest and largest values for any single value in an Info object’s position array.

Method Solve creates n initial fireworks, like so:

Random rnd = new Random(3);
Info[] fireworks = new Info[n];
for (int i = 0; i < n; ++i)
{
  fireworks[i] = new Info();
  fireworks[i].position = new double[dim];
  for (int j = 0; j < dim; ++j)
    fireworks[i].position[j] = (maxX - minX) * rnd.NextDouble() + minX;
  fireworks[i].error = Error(fireworks[i].position);
}

Random object rnd is initially using a seed value of 3 only because that value gave a representative demo run. Each of the n fireworks are stored as Info objects in an array. The initialization code picks random values between minX and maxX for each cell of the position array. For some specific machine learning training scenarios, it’s preferable to initialize the initial fireworks so they’re far apart from each other.

Method Solve continues by establishing variables to hold the best position found by any spark, and that position’s associated error. Unlike fireworks, which have a fixed number, the number of sparks per firework will vary in each iteration of the main processing loop, so sparks are stored in a List collection rather than an array:

double[] bestPosition = new double[dim];
for (int k = 0; k < dim; ++k)
  bestPosition[k] = fireworks[0].position[k];
double bestError = fireworks[0].error; // arbitrary
List<Info>[] sparksList = new List<Info>[n];
for (int i = 0; i < n; ++i)
  sparksList[i] = new List<Info>();

The main processing loop begins:

int epoch = 0;
while (epoch < maxEpochs)
{
  if (epoch % 100 == 0) // Show progress every 100 iterations
  {
    Console.Write("epoch = " + epoch);
    Console.WriteLine(" error at best known position = " +
      bestError.ToString("F4"));
  }
...

Here, progress is displayed every 100 epochs. You might want to consider passing a Boolean flag variable named “progress” to control whether progress messages are displayed:

int[] numberSparks = NumberSparks(fireworks, m, a, b);
double[] amplitudes = Amplitudes(fireworks, A, epoch, 
  maxEpochs, minX, maxX);
for (int i = 0; i < n; ++i)
  sparksList[i].Clear(); // Number of sparks changed

Next, the number of sparks for each firework, and the maximum amplitude for each firework, are calculated using helper methods NumberSparks and Amplitudes. The number of sparks for a firework is proportional to the error of the firework so that good fireworks receive a larger proportion of the m total regular sparks. Similarly, each amplitude is proportional to the error, so that good fireworks have smaller amplitudes.

Next, each spark is instantiated:

for (int i = 0; i < n; ++i)
{
  double amp = amplitudes[i]; // Amplitude for curr firework
  int ns = numberSparks[i];   // Number of sparks for curr firework
  for (int j = 0; j < ns; ++j) // Each spark for curr firework
  {
    Info spark = new Info(); // A spark has a position and error
    spark.position = new double[dim]; // Allocate space (ctor doesn't)
    for (int k = 0; k < dim; ++k) // Spark position based on its parent firework
      spark.position[k] = fireworks[i].position[k];

A spark’s position is based on its parent firework’s position. Next, a few (z) of the dimensions of the position array are randomly selected, and a random displacement is added to the position array:

int z = (int)Math.Round(dim * rnd.NextDouble());
int[] dimensions = PickDimensions(dim, z, epoch);
for (int ii = 0; ii < dimensions.Length; ++ii)
{
  double h = amp * 2 * rnd.NextDouble() - 1;
  int k = dimensions[ii]; // convenience
    spark.position[k] += h; // displace from parent
  if (spark.position[k] < minX || spark.position[k] > maxX)
    spark.position[k] = (maxX - minX) * rnd.NextDouble() + minX;
}
spark.error = Error(spar.position);
sparksList[i].Add(spark);

After generating a spark, method Solve checks to see if the new spark has the best position found so far:

if (spark.error < bestError)
    {
      bestError = spark.error;
      for (int k = 0; k < dim; ++k)
        bestPosition[k] = spark.position[k];
    }
  } // Each new regular spark
} // Each firework

Next, special Gaussian sparks are generated:

AddGaussianSparks(fireworks, sparksList, dim, mHat,
  epoch, minX, maxX, bestPosition, ref bestError, rnd);

Helper method AddGaussianSparks generates special sparks so that their positions tend to be between a randomly selected firework and the best-known position. Method Solve concludes by finding the best and worst spark generated and using their positions for two of the new fireworks for the next iteration. The remaining n-2 fireworks are created using randomly selected sparks:

...
    // Find best, worst spark
    // Create 2 new fireworks
    // Create remaining n-2 fireworks
    ++epoch;
  } // main loop
  return bestPosition;
} // Solve

See the code download for implementation details.

A Few Comments

The original paper that presented the fireworks algorithm is “Fireworks Algorithm for Optimization,” by Y. Tan and Y. Zhu (2010). That paper contained several errors and factors that made the algorithm impractical for real-life optimization. These flaws were quickly noted by other researchers. My article is based primarily on the 2013 research paper, “Enhanced Fireworks Algorithm,” by S. Zheng, A. Janecek and Y. Tan. You can find both papers in several locations on the Internet. I have made quite a few simplifications and minor modifications to the algorithms presented in both research papers. In my opinion, there isn’t enough research evidence yet to answer the question of whether fireworks optimization is better than or worse than other optimization algorithms. But it sure is interesting.


Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products including Internet Explorer and Bing. Dr. McCaffrey can be reached at jammc@microsoft.com.

Thanks to the following Microsoft technical experts for reviewing this article: Todd Bello, Kenneth Griffin, Yong Liu, Brian Mellinger and Esin Saka