Add Support for Digital Ink to Your Windows Applications
Power to the Pen: The Pen is Mightier with GDI+ and the Tablet PC Real-Time Stylus

Improving a Graphics Application with Tablet APIs

 

Stephen Toub
Microsoft Corporation

February 2007

Applies to:
   Microsoft .NET Framework 2.0
   Windows XP Tablet PC Edition

Summary: Demonstrates how to write an application for colorizing digital photos using the inking support provided by Tablet PC APIs. (18 printed pages)

Note Microsoft.Ink.dll is included in the sample code for this project. Therefore, the sample download may be compiled without having the Windows SDK installed. However, this is purely a design-time aid; without the Tablet PC Platform APIs properly installed, the application will not be able to take advantage of additional Tablet PC functionality. Click here to download source code for this article.

Contents

Overview
   Sample Application Inputs and Outputs
Creating the UI
   Adding Controls to the ToolStrip
   Loading and Saving Images
      Drag-and-Drop Support
      Delayed Computation
Backend Image Manipulation
   Grayscale Conversion
   Colorizing the Image
      Overhead Reduction
      Better Colorizing
      Partial Desaturation
Adding Ink Support
   UI Changes Accommodating Ink
      Initializing Ink
      Receiving Strokes
   Backend Changes Accommodating Regions
      Converting To GraphicsPaths
      Image Scaling
   Colorizing Regions
      Reducing Overhead
      A Better Selection Algorithm
Conclusions

Overview

I'm enamored with digital photography and the types of manipulations that can be made to an image after it's been taken. One of my favorite features on my camera allows me to isolate a specific color. Every picture taken using this feature is transformed to black and white, except for the portions of the image in the selected color: these portions remain in color (if you've seen the movie Pleasantville and the partial colorization it employs throughout, you have a good understanding of this feature).

Unfortunately, the usefulness of this feature is limited due to the timing of color selection. My camera requires me to select a color filter and then take a picture, in that order. Ideally, I'd like to apply the filter at a later point in time, trying out different colors and settings with previously snapped photos. While I could probably accomplish this with a few mouse clicks in software like Microsoft Digital Image Pro, as a developer, that feels a bit like cheating. In this article, I'll show you how to write your own Windows Forms colorizing application to accomplish this task. I'll then show you how to enhance it with the Tablet PC APIs.

Sample Application Inputs and Outputs

To give you a sense of what this application will be capable of, Figure 1 shows an image used as input. Clicking the green grass completely desaturates all colors in the image except for the greens (see Figure 2), while clicking my blue shirt desaturates all colors in the image except for the blues (see Figure 3).

Figure 1. Original image loaded into ImageColorizer

Figure 2. Greens isolated in image

Figure 3. Blues isolated in image

Creating the UI

My first course of action in building an application like this is to create the user interface, or at least a bare bones version of one. Having a user interface makes it easier to immediately test and visualize the results of the backend code as I write it.

The UI for the Image Colorizer application is fairly simple. Figure 4 shows the document outline view for the main form.

Figure 4. Document outline for MainForm

As shown in Figure 4, MainForm contains a ToolStripContainer. Its content panel houses a PictureBox named pbImage, which displays the image being manipulated. This PictureBox has its SizeMode property set to StretchImage, which means that the image will always be stretched to fill the PictureBox, regardless of its original size and aspect ratio. The ToolStrip (named toolStripMain) in the ToolStripContainer contains several buttons for loading and saving images, a TrackBar controlling colorization settings (specifically the hue variation setting, to be explained shortly), and a ProgressBar showing the progress of recomputation each time an image is manipulated.

Adding Controls to the ToolStrip

A ToolStrip can be populated with ToolStripItem types such as the ToolStripButton and ToolStripLabel. The list of ToolStripItem types included with Windows Forms is short, but it's possible to add arbitrary controls to the ToolStrip using the ToolStripControlHost.

For example, to host a TrackBar in the ToolStrip, I created the following ToolStripTrackBar class, based on code from Jessica Fosler's excellent blog (http://go.microsoft.com/fwlink/?LinkID=80098).

[System.ComponentModel.DesignerCategory("code")]
[ToolStripItemDesignerAvailability(
    ToolStripItemDesignerAvailability.ToolStrip | 
    ToolStripItemDesignerAvailability.StatusStrip)]
internal partial class ToolStripTrackBar : ToolStripControlHost
{
    public ToolStripTrackBar() : 
        base(CreateControlInstance()) {}

    private static Control CreateControlInstance()
    {
        TrackBar t = new TrackBar();
        t.AutoSize = false;
        t.Height = 16;
        t.TickStyle = TickStyle.None;
        t.Minimum = 0;
        t.Maximum = 100;
        t.Value = 0;
        return t;
    }

    public TrackBar TrackBar 
    { 
        get { return Control as TrackBar; } 
    }

    [DefaultValue(0)]
    public int Value 
    { 
        get { return TrackBar.Value; } 
        set { TrackBar.Value = value; } 
    }

    [DefaultValue(0)]
    public int Minimum 
    { 
        get { return TrackBar.Minimum; } 
        set { TrackBar.Minimum = value; } 
    }

    [DefaultValue(100)]
    public int Maximum 
    { 
        get { return TrackBar.Maximum; } 
        set { TrackBar.Maximum = value; } 
    }

    protected override void OnSubscribeControlEvents(
        Control control)
    {
        base.OnSubscribeControlEvents(control);
        ((TrackBar)control).ValueChanged += 
            trackBar_ValueChanged;
    }

    protected override void OnUnsubscribeControlEvents(
        Control control)
    {
        base.OnUnsubscribeControlEvents(control);
        ((TrackBar)control).ValueChanged -= 
            trackBar_ValueChanged;
    }

    void trackBar_ValueChanged(object sender, EventArgs e)
    {
        if (ValueChanged != null) ValueChanged(sender, e);
    }

    public event EventHandler ValueChanged;

    protected override Size DefaultSize 
    { 
        get { return new Size(200, 16); } 
    }
}

Loading and Saving Images

The application maintains images in two form variables of type Bitmap. The first variable, _originalImage, stores the original image as loaded by the user, and the second variable, _colorizedImage, stores the output result of colorizing the image.

Loading an image into the application is straightforward. When a user clicks the Load Image button in the UI, the btnLoadImage_Click method is called. This method presents an OpenFileDialog to users, prompting them to pick an input image. The path to this image is then passed to the LoadImage helper method. LoadImage creates a new Bitmap from the image, stores it into _originalImage, and displays it in the PictureBox on the form.

private void btnLoadImage_Click(object sender, EventArgs e)
{
    if (_ofd == null)
    {
        _ofd = new OpenFileDialog();
        _ofd.Filter = "Image files (*.jpg, *.bmp, *.png, *.gif)|" +
            "*.jpg;*.bmp;*.png;*.gif";
        _ofd.InitialDirectory = Environment.GetFolderPath(
             Environment.SpecialFolder.MyPictures);
    }
    if (_ofd.ShowDialog() == DialogResult.OK) 
        LoadImage(_ofd.FileName);
}

private void LoadImage(string path)
{
    _originalImage = new Bitmap(path);
    pbImage.Image = _originalImage;
    ...
}

When the user clicks the Save Image button, the btnSaveImage_Click method opens a SaveFileDialog box. The method then passes the user's selected target file path and _colorizedImage to the SaveImage method. SaveImage saves the image to disk.

private void btnSaveImage_Click(object sender, EventArgs e)
{
    if (_colorizedImage != null)
    {
        SaveFileDialog sfd = new SaveFileDialog();
        sfd.Filter = "Image files (*.jpg, *.bmp, *.png, *.gif)|" +
            "*.jpg;*.bmp;*.png;*.gif|" +
            "All files (*.*)|*.*";
        sfd.DefaultExt = ".jpg";
        if (sfd.ShowDialog(this) == DialogResult.OK)
        {
            SaveImage(_colorizedImage, sfd.FileName, 100);
        }
    }
}

private static void SaveImage(
    Bitmap bmp, string path, long quality)
{
    if (bmp == null) throw new ArgumentNullException("bmp");
    if (path == null) throw new ArgumentNullException("path");
    if (quality < 1 || quality > 100) 
        throw new ArgumentOutOfRangeException(
            "quality", quality, "Quality out of range.");

    switch (Path.GetExtension(path).ToUpperInvariant())
    {
        default:
        case ".BMP": bmp.Save(path, ImageFormat.Bmp); break;
        case ".PNG": bmp.Save(path, ImageFormat.Png); break;
        case ".GIF": bmp.Save(path, ImageFormat.Gif); break;
        case ".JPG":
            ImageCodecInfo jpegCodec = Array.Find(
                ImageCodecInfo.GetImageEncoders(),
                delegate(ImageCodecInfo ici) { 
                    return ici.MimeType == "image/jpeg"; 
                });
            using (EncoderParameters codecParams = 
                new EncoderParameters(1))
            using (EncoderParameter ratio = new EncoderParameter(
                Encoder.Quality, quality))
            {
                codecParams.Param[0] = ratio;
                bmp.Save(path, jpegCodec, codecParams);
            }
            break;
    }
}

If the selected file type is BMP, PNG, or GIF, the SaveImage method calls an overload of the Bitmap.Save method that accepts both the file path and the appropriate ImageFormat value. I could use the same approach when saving JPG files, but I prefer to allow the caller to specify a JPEG quality value, where 0 is worst quality but greatest compression and 100 is best quality but least compression. In order to specify a quality value, I use another overload of Bitmap.Save, one that accepts an ImageCodecInfo and an EncoderParameters. The EncoderParameters collection details the specifics of how an image is saved.

First, I retrieve the appropriate ImageCodecInfo instance, examining codecs returned by ImageCodecInfo.GetImageEncoders for one with a MimeType equal to "image/jpeg".

Note The new generics-based Array.Find<T> method is very handy here.

Next, I create an EncoderParameter, which specifies the target quality of the image, and I wrap the EncoderParameter in an EncoderParameters collection. I supply this collection, along with the JPG ImageCodecInfo, to the Bitmap.Save method.

Drag-and-Drop Support

The user should be able to load images in two ways: by clicking a load button on the ToolStrip, or by dragging an image into the application's PictureBox. Since loading support has already been factored out into the LoadImage method, image drag-and-drop can be supported easily with the addition of two event handlers: one for the PictureBox's DragEnter event and one for its DragDrop event.

In the DragEnter event handler, I determine whether the data being dragged warrants changing the mouse cursor (thus signaling to users that the drag-and-drop operation is supported). To do so, I check whether the data (accessed through the DragEventArgs.Data property) contains DataFormats.FileDrop data. If GetDataPresent for FileDrop returns true, I retrieve the data, ensuring that only one file is being dragged. If both of those conditions are true, I set the DragEventArgs.Effect property to DragDropEffects.Copy; this allows the drag-and-drop operation to continue, changing the mouse cursor accordingly.

private void pbImage_DragEnter(object sender, DragEventArgs e)
{
    if (e.Data.GetDataPresent(DataFormats.FileDrop) &&
        ((string[])e.Data.GetData(
            DataFormats.FileDrop)).Length == 1)
    {
        e.Effect = DragDropEffects.Copy;
    }
}

The second event handler is for the DragDrop event, which is raised when the user releases the mouse button to complete the drag-and-drop operation. For good measure, I double-check the same constraints I checked for in DragEnter. Assuming all is well, I pass the data retrieved from GetData into the same LoadImage method used earlier.

private void pbImage_DragDrop(object sender, DragEventArgs e)
{
    if (e.Data.GetDataPresent(DataFormats.FileDrop) &&
        e.Effect == DragDropEffects.Copy)
    {
        string[] paths = (string[])
            e.Data.GetData(DataFormats.FileDrop);
        if (paths.Length == 1) LoadImage(paths[0]);
    }
}

Delayed Computation

When a user clicks a pixel in the displayed image, the Image Colorizer application should remember that pixel's color. Next, the loaded image should be colorized on a background thread, using that selected color. While the colorization operation is in progress, the UI should be disabled so that no other operation can start. The BackgroundWorker component on the MainForm, named bwColorize (you can see it in the document outline in Figure 4), is used to provide this background operation support.

However, the image shouldn't regenerate as soon as something changes. A user might make multiple changes in quick succession, and only after all changes have been entered should the image be recomputed. That's why all of the UI interaction event handlers (clicking on a pixel, for example) start a Timer (named tmRefresh in Figure 4) that expires after one second. When that timer expires, the application regenerates the image. If a user makes a change and the timer has already started, the application restarts the timer. In other words, all color manipulations are delayed for at least a second, and the image won't be regenerated unless no changes have happened for at least a second. This way, users can make more than one change without triggering (and waiting for) image regeneration.
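The restart-on-change pattern described above can be sketched as follows; OnColorSelectionChanged is a hypothetical name standing in for the various UI event handlers, while tmRefresh and bwColorize are the components named above. The exact wiring in the sample download may differ.

```csharp
// Illustrative sketch of the delayed-recomputation pattern; any UI
// event that changes the color selection funnels through a handler
// like this one.
private void OnColorSelectionChanged()
{
    tmRefresh.Stop();  // cancel any countdown already in progress
    tmRefresh.Start(); // restart the one-second countdown
    // Only when a full second passes with no further changes does
    // tmRefresh's Tick handler kick off colorization via bwColorize.
}
```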

Backend Image Manipulation

Now that our application framework is defined, we can implement the image coloring algorithm. The basic algorithm is quite straightforward. We create a new image the same size as the original one and iterate over every pixel, copying each from the source to the destination, making a color change if necessary. If a particular pixel is a different color than the selected color, it is converted to grayscale. Otherwise, its original color is retained.

Grayscale Conversion

The simplest way I know to convert a System.Drawing.Color structure to grayscale is to add the red, green, and blue components of the color, and divide the result by three. A new Color is then created using this average value for all three components.

private static Color ToGrayscale(Color c)
{
    int gray = (c.R + c.G + c.B) / 3;
    return Color.FromArgb(gray, gray, gray);
}

This works fine, but a better grayscale image can be achieved by using the color's luminance rather than the average of its three color components. Luminance favors a color's green component the most, followed by the red, then the blue. A generally accepted formula for computing luminance from an RGB value is: luminance = .299*r + .587*g + .114*b. I re-implement the ToGrayscale method using this new formula as follows:

private static Color ToGrayscale(Color c)
{
    int luminance = (int)
        (0.299 * c.R + 0.587 * c.G + 0.114 * c.B);
    return Color.FromArgb(luminance, luminance, luminance);
}

Colorizing the Image

The Colorize method that follows implements the previously described colorization algorithm using the ToGrayscale method.

internal class ImageManipulation
{
    public Bitmap Colorize(Bitmap original, Color selectedColor)
    {
        Bitmap newImage = new Bitmap(
            original.Width, original.Height);
        for(int y=0; y<original.Height; y++)
        {
            for(int x=0; x<original.Width; x++)
            {
                Color c = original.GetPixel(x, y);
                if (c != selectedColor) c = ToGrayscale(c);
                newImage.SetPixel(x, y, c);
            }
        }
        return newImage;
    }
    ...
}

Unfortunately, this implementation provides relatively poor results, both in terms of efficiency and in terms of output quality. Therefore, I recommend a number of improvements.

Overhead Reduction

The Bitmap.GetPixel and Bitmap.SetPixel methods are relatively slow. For starters, every call to these methods does a bounds check to ensure that the coordinates provided are within the bounds of the image. This adds overhead: you are calling these methods over and over again with coordinates you know to be in bounds. Second, although Bitmap itself is a managed class, it is a wrapper around unmanaged GDI+ functions exposed by gdiplus.dll. As a result, every call to GetPixel and SetPixel results in a P/Invoke interop transition to the unmanaged world and back. Because these methods are called once for every pixel in the image, the overhead for this can be quite significant and very noticeable. When taking pictures, I try to use the highest resolution my camera is capable of, which typically results in over 7 million pixels per image. Thus, an overhead of more than 14 million interop calls is possible if we use our current algorithm to read colors from one image and write them to another. That's an unacceptable expense for an application of this nature.

There are several ways to address this. One approach is to let GDI+ handle the entire transformation using a ColorMatrix. A ColorMatrix defines a linear transformation that is applied to all pixels in the image simultaneously. This transformation is done with only a handful of interop transitions and desaturates an image in a fraction of the time required to iterate over every pixel using GetPixel and SetPixel.

ColorMatrix cm = new ColorMatrix(new float[][]{
    new float[]{.299f, .299f, .299f, 0, 0},
    new float[]{.587f, .587f, .587f, 0, 0},
    new float[]{.114f, .114f, .114f, 0, 0},
    new float[]{0, 0, 0, 1, 0},
    new float[]{0, 0, 0, 0, 1}});
using (ImageAttributes ia = new ImageAttributes())
{
    ia.SetColorMatrix(cm);
    using (Graphics g = Graphics.FromImage(colorizedImage))
    {
        g.DrawImage(original, new Rectangle(0, 0, 
                colorizedImage.Width, colorizedImage.Height),
            0, 0, original.Width, original.Height, 
            GraphicsUnit.Pixel, ia);
    }
}

Unfortunately, this converts the entire image to grayscale, rather than allowing select pixels to remain in color.

A better way to solve our problem is to directly access the data comprising the pixels in the image. Using a few GDI+ calls, we can retrieve the underlying memory buffer that stores the image data. This allows us to index directly into that buffer in order to retrieve and set the colors for individual pixels, an approach Eric Gunnerson uses in his article Unsafe Image Processing.

To encapsulate this approach, I created the FastBitmap class. The revised Colorize method below uses FastBitmap instead of directly using GetPixel and SetPixel, and as a result it yields dramatically improved performance with only a few minor code modifications:

public Bitmap Colorize(Bitmap original, Color selectedColor)
{
    int width=original.Width, height=original.Height;
    Bitmap newImage = new Bitmap(width, height);
    using(FastBitmap fastOriginal = new FastBitmap(original))
    using(FastBitmap fastNew = new FastBitmap(newImage))
    {
        for(int y=0; y<height; y++)
        {
            for(int x=0; x<width; x++)
            {
                Color c = fastOriginal[x, y];
                if (c != selectedColor) c = ToGrayscale(c);
                fastNew[x, y] = c;
            }
        }
    }
    return newImage;
}
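The FastBitmap implementation itself ships with the sample download and, as noted, is based on the direct buffer access techniques from Eric Gunnerson's article. As a rough managed-only illustration of the idea (using Bitmap.LockBits and Marshal rather than unsafe pointers, and assuming 32-bit ARGB data), a minimal version might look like this; the real class differs in its details and will be faster:

```csharp
// Minimal sketch of a FastBitmap-style wrapper. LockBits pins the
// pixel buffer once, so the per-pixel indexer involves no GDI+ call.
internal sealed class FastBitmap : IDisposable
{
    private readonly Bitmap _bitmap;
    private readonly BitmapData _data;

    public FastBitmap(Bitmap bitmap)
    {
        _bitmap = bitmap;
        _data = bitmap.LockBits(
            new Rectangle(0, 0, bitmap.Width, bitmap.Height),
            ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
    }

    public Color this[int x, int y]
    {
        get
        {
            // Each row is Stride bytes; each 32bppArgb pixel is 4 bytes.
            int argb = Marshal.ReadInt32(
                _data.Scan0, y * _data.Stride + x * 4);
            return Color.FromArgb(argb);
        }
        set
        {
            Marshal.WriteInt32(
                _data.Scan0, y * _data.Stride + x * 4, value.ToArgb());
        }
    }

    public void Dispose() { _bitmap.UnlockBits(_data); }
}
```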

Better Colorizing

As it stands, the output from the application is less than ideal. The algorithm is looking for one specific RGB color value, but in photographs large swatches of a color are rarely homogenous (for example, a large patch of green grass is almost certainly not a uniform shade of green). Ideally, our application should isolate and retain the set of colors similar in hue to the selected color. For example, if the selected color is a specific shade of green, it should be possible to isolate and retain other similar shades of green. Such comparisons are difficult to do in the RGB color space, but are easy to do in other color spaces, such as HSB (hue, saturation, brightness). The Hue portion of HSB represents a pure color, or in slightly more scientific terms, a specific position on the visible portion of the electromagnetic spectrum. Rather than comparing the RGB values of the target color with the color for each pixel in the image, we're better off comparing the hue values of those colors. A color's hue is easily accessible from the Color.GetHue method. (GetSaturation and GetBrightness methods are also available). For our application, this means that pixels should retain their color if their hue is within some epsilon value of the target pixel's hue—and that epsilon value should be user configurable.

public Bitmap Colorize(
    Bitmap original, Color selectedColor, int epsilon)
{
    int width=original.Width, height=original.Height;
    float selectedHue = selectedColor.GetHue();

    Bitmap newImage = new Bitmap(width, height);
    using(FastBitmap fastOriginal = new FastBitmap(original))
    using(FastBitmap fastNew = new FastBitmap(newImage))
    {
        for(int y=0; y<height; y++)
        {
            for(int x=0; x<width; x++)
            {
                Color c = fastOriginal[x, y];
                float pixelHue = c.GetHue();

                float distance = 
                    Math.Abs(pixelHue - selectedHue);
                if (distance > 180) distance = 360 - distance;

                if (distance > epsilon) c = ToGrayscale(c);
                fastNew[x, y] = c;
            }
        }
    }
    return newImage;
}

Our user interface enables users to broaden their color selection by sliding a hue epsilon track bar. (For reference, the images in Figure 2 and Figure 3 were colorized using an epsilon value of approximately 20).

Note A hue wheel is 360 degrees (such that the values 0 and 360 represent the exact same hue). If we pick two points on a wheel, there are two ways to measure the distance between the points, depending on which way we move around the wheel. The two distances are different, unless the points are exactly opposite from each other on the wheel, such that both distances equal 180 degrees. We always want to use the smaller of the two measurements, since the smaller measurement more accurately depicts the distance between the two values. So, we need to take the absolute value of one of the differences between the two hues. If that value is 180 degrees, it doesn't matter which measurement we use, since by definition they're both 180 degrees. If that value is less than 180 degrees, we know we are using the smaller of the two distances. If that value is greater than 180 degrees, we know we are using the longer distance. In that case, we convert the longer distance to the one we need by subtracting it from 360.
Example   Two hue values being compared are 359 and 2. The distance between them is computed as |359–2| = 357 (alternatively, distance is computed as |2–359| = 357, providing the same results). Since 357 is greater than 180, the distance is converted to 360–357 = 3. This is the value we expect, as there are 3 integral steps on the wheel between 359 and 2 (from 359 to 0, from 0 to 1, and from 1 to 2).

Additionally, for better functionality, users should be able to select multiple colors (for example, if they want to retain both the greens and the blues in the image). Our user interface enables users to hold down the SHIFT key as they click pixels in the image, thereby selecting a set of colors for color retention. These colors are stored in a List<Color>, which is then provided to a revised Colorize method that can handle multiple color selections rather than just one.
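The multiple-selection check might look roughly like the following sketch; the actual revised Colorize overload in the sample download may be structured differently. Note that the selected colors' hues can be computed once, before the per-pixel loop, rather than on every iteration:

```csharp
// Hypothetical helper: returns true if the pixel's hue is within
// epsilon degrees of any of the precomputed selected hues.
private static bool MatchesAnySelectedHue(
    Color pixel, List<float> selectedHues, int epsilon)
{
    float pixelHue = pixel.GetHue();
    foreach (float selectedHue in selectedHues)
    {
        // Same wrap-around distance computation as the single-color case.
        float distance = Math.Abs(pixelHue - selectedHue);
        if (distance > 180) distance = 360 - distance;
        if (distance <= epsilon) return true;
    }
    return false;
}
```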

Partial Desaturation

Because some pixels' colors just miss the epsilon cutoff for retaining their color, we can improve our algorithm by desaturating those pixels only partially, easing the transition from color to black and white. We accomplish this by obtaining the HSB value for the RGB color, lowering the saturation (the S in the HSB), and converting the color back to RGB.

Note   While the Color class provides methods to retrieve the HSB values from an RGB color, it does not provide the reverse methods necessary to go from an HSB value to an RGB value. A reverse algorithm (largely based on code from Chris Jackson's blog at http://blogs.msdn.com/cjacks/archive/2006/04/12/575476.aspx) is available for download as part of the complete application.
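For illustration, one common formulation of the HSL-to-RGB conversion, pairing with the values returned by Color.GetHue, GetSaturation, and GetBrightness, is sketched below. This is not the sample download's implementation, which may differ in details such as rounding:

```csharp
// Standard HSL-to-RGB conversion: hue in [0, 360), saturation and
// lightness in [0, 1].
private static Color HslToRgb(float hue, float saturation, float lightness)
{
    double c = (1 - Math.Abs(2 * lightness - 1)) * saturation; // chroma
    double x = c * (1 - Math.Abs(hue / 60.0 % 2 - 1));
    double m = lightness - c / 2;
    double r = 0, g = 0, b = 0;
    if (hue < 60)       { r = c; g = x; }
    else if (hue < 120) { r = x; g = c; }
    else if (hue < 180) { g = c; b = x; }
    else if (hue < 240) { g = x; b = c; }
    else if (hue < 300) { r = x; b = c; }
    else                { r = c; b = x; }
    return Color.FromArgb(
        (int)Math.Round((r + m) * 255),
        (int)Math.Round((g + m) * 255),
        (int)Math.Round((b + m) * 255));
}
```

With a helper like this, partially desaturating a color is just a matter of scaling its saturation before converting back, for example HslToRgb(c.GetHue(), c.GetSaturation() * 0.5f, c.GetBrightness()).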

Adding Ink Support

Ideally, a user should be able to select a particular region within the image for colorization, leaving everything outside of the region (or regions) black and white, even pixels outside of the regions containing the target hues. While a mouse could be used to select a region, why not take advantage of the Tablet PC's inking capabilities?

UI Changes Accommodating Ink

Tablet PC users should be able to use their tablet pen to select regions of an image for colorization. To support this, when the MainForm for the application loads, the application creates an InkOverlay and associates it with the PictureBox on the form:

private void MainForm_Load(object sender, EventArgs e)
{
    if (PlatformDetection.SupportsInk) InitializeInk();
    ...
}

Note This application supports computers that do not have the Tablet PC APIs available: before initializing ink support, the application checks whether the current platform supports ink by using the PlatformDetection class that I created for the MSDN article Microsoft Sudoku: Optimizing UMPC Applications for Touch and Ink.

Initializing Ink

The InitializeInk method used from MainForm_Load instantiates an InkOverlay control over the PictureBox. It configures the look and behavior of ink on the InkOverlay and registers event handlers to detect when new ink has been created:

[MethodImpl(MethodImplOptions.NoInlining)]
private void InitializeInk()
{
    _overlay = new InkOverlay(pbImage, true);
    _overlay.DefaultDrawingAttributes.Width = 1;
    _overlay.DefaultDrawingAttributes.Color = Color.Red;
    _overlay.DefaultDrawingAttributes.IgnorePressure = true;
    _overlay.Stroke += delegate { StartRefreshTimer(); };
    _overlay.NewPackets += delegate { tmRefresh.Stop(); };
}

Receiving Strokes

When the application receives a new Stroke, the same timer discussed earlier delays colorization. Thus, the user is free to input several strokes without triggering (and waiting for) image regeneration between each stroke.

_overlay.Stroke += delegate { StartRefreshTimer(); }; 
...
private void StartRefreshTimer()
{
    if (_originalImage != null &&
        _selectedPixels.Count > 0 && _lastEpsilon >= 0 &&
        !bwColorize.IsBusy)
    {
        btnLoadImage.Enabled = false;
        tmRefresh.Stop();
        tmRefresh.Start();
    }
}

When the overlay detects that new packets are being received (that is, while a stroke is being drawn), the timer stops until the user completes the stroke.

_overlay.NewPackets += delegate { tmRefresh.Stop(); };

When the stroke is completed, the timer is restarted, and when the timer expires, the event handler registered with the timer's Tick event calls the StartColorizeImage method:

private void tmRefresh_Tick(object sender, EventArgs e)
{
    StartColorizeImage();
}

StartColorizeImage calls the BackgroundWorker's RunWorkerAsync method, and the worker's DoWork handler calls the ImageManipulation.Colorize method on a background thread to colorize the image.
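The BackgroundWorker wiring might look roughly like the following sketch; the exact arguments passed to Colorize, and the field name _selectedPixels, are details of the sample download that may differ from this illustration:

```csharp
// Rough sketch of the bwColorize event handlers.
private void bwColorize_DoWork(object sender, DoWorkEventArgs e)
{
    // Runs on a thread-pool thread: no UI access is allowed here.
    e.Result = new ImageManipulation().Colorize(
        _originalImage, _selectedPixels, _lastEpsilon);
}

private void bwColorize_RunWorkerCompleted(
    object sender, RunWorkerCompletedEventArgs e)
{
    // Back on the UI thread: display the result and re-enable the UI.
    _colorizedImage = (Bitmap)e.Result;
    pbImage.Image = _colorizedImage;
    btnLoadImage.Enabled = true;
}
```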

Backend Changes Accommodating Regions

As currently implemented, the ImageManipulation class does not support regions, and thus it needs to be modified to accept the selection information from the UI. The InkOverlay provides a Strokes collection, where each Stroke in the Strokes collection represents a user-selected region. However, for a couple of reasons I don't want to pass the Strokes collection directly to the ImageManipulation class:

  • As a backend component, the ImageManipulation class should not depend on Tablet PC APIs, which are inherently focused on UI.
  • Tablet PC APIs are not generic graphics APIs. They don't offer capabilities such as hit testing a pixel against a region to see whether the pixel is contained within the region. (They do have hit testing functionality against Stroke instances, for determining whether Stroke objects overlap certain points, but that won't help meet this application's needs.)

Converting To GraphicsPaths

Instead of basing it on the Tablet PC APIs, I've built my ImageManipulation class around GDI+ and the System.Drawing namespace. Specifically, the GraphicsPath class provides all of the functionality in which I'm interested. A GraphicsPath can be created from a sequence of points in order to create a polygon representing a region. Methods on GraphicsPath can then be used to test whether a particular point is contained within the region.

Unfortunately, since the UI is using a Stroke to represent a selected region and the backend is using a GraphicsPath for the same purpose, I now need to be able to convert between the two. My InkToGraphicsPaths method performs that conversion.

private List<GraphicsPath> InkToGraphicsPaths()
{
    Renderer renderer = _overlay.Renderer;
    Strokes strokes = _overlay.Ink.Strokes;

    if (strokes.Count > 0)
    {
        using (Graphics g = this.CreateGraphics())
        {
            List<GraphicsPath> paths = 
                new List<GraphicsPath>(strokes.Count);
            foreach (Stroke stroke in strokes)
            {
                Point[] points = stroke.GetPoints();
                for (int i = 0; i < points.Length; i++)
                {
                    renderer.InkSpaceToPixel(g, ref points[i]);
                    ...
                }
                GraphicsPath path = new GraphicsPath();
                path.AddPolygon(points);
                path.CloseFigure();
                paths.Add(path);
            }
            return paths;
        }
    }
    return null;
}

The InkToGraphicsPaths method loops through the Strokes collection provided by the InkOverlay class. The application retrieves the Point values that make up each Stroke by using the Stroke.GetPoints method. Each Point value is converted from ink-space coordinates to pixel-space coordinates by using the Renderer object associated with the InkOverlay along with a Graphics object derived from the MainForm.

Once the array of Point values for the Stroke is converted to pixel-space, I instantiate a GraphicsPath and use its AddPolygon method to create a path from the supplied points. The GraphicsPath's CloseFigure method completes the polygon, connecting the last point back to the first. This instance of the GraphicsPath is added to a List<GraphicsPath> of paths returned to the caller, where each path in the list represents a Stroke in the original Strokes collection.

Image Scaling

As mentioned earlier, our application's PictureBox stretches the target image to match the size of the PictureBox. Because the ink is drawn over the stretched image, the Point values returned by Stroke.GetPoints are in display space and must be scaled by the inverse of the stretch factor to map back onto the original image (for example, if the PictureBox displays the image at quarter size, the x and y coordinates must be multiplied by a factor of 4). To account for this, the InkToGraphicsPaths method compares the size of the original image with its displayed size, computing scaling factors for the x and y dimensions.

float scaleX = _originalImage.Width / (float)pbImage.Width;
float scaleY = _originalImage.Height / (float)pbImage.Height;

These scaling factors are used to modify each Point comprising a Stroke after the Point is converted from ink-space to pixel-space:

renderer.InkSpaceToPixel(g, ref points[i]);
if (scalePath)
{
    points[i] = new Point(
        (int)(scaleX * points[i].X),
        (int)(scaleY * points[i].Y));
}

Colorizing Regions

The Colorize method described earlier must now incorporate ink-based selection of regions into its colorization algorithm. If a pixel lies outside of all selected regions, it should be converted to black and white. If a pixel lies within a selected region, its hue should be compared to the user-selected hue. As such, our algorithm must first determine whether each pixel in the image is contained in a GraphicsPath representing a selected region. This is easily accomplished using the GraphicsPath.IsVisible method:

pixelInSelectedRegion = false;
Point p = new Point(x, y);
foreach (GraphicsPath path in paths)
{
    if (path.IsVisible(p))
    {
        pixelInSelectedRegion = true;
        break;
    }
}

GraphicsPath.IsVisible accepts a Point (created using the current x and y loop variables), returning true if the Point lies within the GraphicsPath region, and false otherwise. After looping through all of the GraphicsPath instances in the List<GraphicsPath> paths list and calling IsVisible on each, our algorithm knows whether the pixel falls within any of the user-defined regions.

Figure 5. Isolating a region

As currently implemented, however, our algorithm is incredibly slow. Its performance declines significantly, both as image size increases and as the user selects additional regions. There are two primary reasons for this:

  • The algorithm that determines whether a point lies within an arbitrary polygon must examine every edge of the polygon, and a GraphicsPath created from a Stroke may contain hundreds of vertices, if not more.
  • As mentioned earlier, the System.Drawing namespace is implemented as P/Invoke wrappers around the native GDI+ implementation in Windows, and every call to GraphicsPath.IsVisible results in a P/Invoke call to unmanaged code. For an image with several million pixels, all of which need to be checked for region inclusion, that results in several million interop calls.
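To see the first cost concretely, here is a sketch of the classic ray-crossing point-in-polygon test. This is my own illustrative version, not GDI+'s actual implementation, but any general routine must do comparable per-edge work:

```csharp
using System.Drawing;

static class HitTest
{
    // Classic ray-crossing test (an illustrative sketch; GDI+'s internal
    // algorithm may differ): cast a horizontal ray from p and count how
    // many polygon edges it crosses. An odd count means p is inside.
    // Every query walks all N vertices, so testing millions of pixels
    // against a stroke with hundreds of vertices multiplies quickly.
    public static bool Contains(Point[] polygon, Point p)
    {
        bool inside = false;
        for (int i = 0, j = polygon.Length - 1; i < polygon.Length; j = i++)
        {
            Point a = polygon[i], b = polygon[j];
            // Does edge (a,b) straddle the ray's y coordinate?
            bool crossesY = (a.Y > p.Y) != (b.Y > p.Y);
            if (crossesY &&
                p.X < (b.X - a.X) * (float)(p.Y - a.Y) / (b.Y - a.Y) + a.X)
            {
                inside = !inside;
            }
        }
        return inside;
    }
}
```

Each call is O(N) in the number of vertices, which is why a per-pixel IsVisible check over a multi-megapixel image is so expensive.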

Reducing Overhead

If the Colorize method can quickly determine that a point does not lie within a particular GraphicsPath, then it needn't call the high-overhead GraphicsPath.IsVisible method. Why not use a bounding box (the smallest rectangular region that completely surrounds all elements of the GraphicsPath) to approximate the GraphicsPath? Hit testing against a rectangle is fast, because it involves only a handful of comparisons and additions. Moreover, although we could code hit testing ourselves, System.Drawing.Rectangle does it for us with code similar to the following:

public struct Rectangle
{
    public int X, Y, Width, Height;

    public bool Contains(int x, int y)
    {
        return x >= this.X && 
               y >= this.Y && 
               x < this.X + this.Width && 
               y < this.Y + this.Height;
    }

    public bool Contains(Point p) { return Contains(p.X, p.Y); }
    ...
}

If the bounding Rectangle for a particular GraphicsPath contains the pixel, we follow up with a call to IsVisible. However, if the selected regions are relatively small, most of the pixels in the image will fall outside the bounding rectangles. If a pixel is outside of the bounding rectangle, it's definitely outside of the GraphicsPath, so we don't need to call IsVisible at all. This avoids many unnecessary interop calls.

Taking this a step further, for a particular GraphicsPath, the bounding rectangle won't change. Rather than getting the bounding Rectangle each time a pixel is examined (that would be even slower than just calling IsVisible), we can compute and store bounding Rectangles for all GraphicsPath instances at the beginning of the Colorize method:

Rectangle[] pathsBounds = null;
if (paths != null && paths.Count > 0)
{
    pathsBounds = new Rectangle[paths.Count];
    for (int i = 0; i < pathsBounds.Length; i++)
    {
        pathsBounds[i] = Rectangle.Ceiling(paths[i].GetBounds());
    }
}

Point p = new Point(x, y);
for (int i = 0; i < paths.Count; i++)
{
    GraphicsPath path = paths[i];
    if (pathsBounds[i].Contains(p) && path.IsVisible(p))
    {
        pixelInSelectedRegion = true;
        break;
    }
}

This makes a huge difference, especially if the size of the selected regions is relatively small when compared to the size of the image, as most of the image's pixels will fall outside of the bounding rectangles. Unfortunately, if the bounding rectangles are big, we don't get much savings from this, as we'll still need to call GraphicsPath.IsVisible for a large number of pixels. To improve upon this, we can address the second performance bottleneck: that of needing P/Invokes to unmanaged GDI+.

A Better Selection Algorithm

We know that hit testing against a rectangle is fast. Why not approximate a GraphicsPath region with a bunch of rectangles, avoiding any use of unmanaged code? For this task, we can take advantage of the Region class and its GetRegionScans method. We pass the GraphicsPath to the Region constructor, and then call GetRegionScans to retrieve an array of RectangleF values approximating the region. For example, the GraphicsPath shown in Figure 6 was approximated using the 199 (randomly colored) rectangles in Figure 7.

Figure 6. Sample GraphicsPath created from a Stroke

Figure 7. Approximation of GraphicsPath using rectangles

As with the bounding boxes, before looping through all of the pixels in the image we can retrieve rectangle-based approximations for each GraphicsPath:

List<RectangleF[]> compositions = null;
if (paths != null && paths.Count > 0)
{
    compositions = new List<RectangleF[]>(paths.Count);
    using (Matrix m = new Matrix())
    {
        for(int i=0; i<paths.Count; i++)
        {
            using (Region r = new Region(paths[i])) 
            {
                compositions.Add(r.GetRegionScans(m));
            }
        }
    }
}

With the rectangular approximations computed, we can use them to determine whether a pixel lies within the GraphicsPath. Notice that we no longer need IsVisible at all:

Point p = new Point(x, y);
for (int i = 0; 
     i < pathsBounds.Length && !pixelInSelectedRegion; 
     ++i)
{
    if (pathsBounds[i].Contains(p))
    {
        foreach (RectangleF bound in compositions[i])
        {
            if (bound.Contains(x, y))
            {
                pixelInSelectedRegion = true;
                break;
            }
        }
    }
}

In my tests, the code using rectangular approximation is over 30 times faster than the code using IsVisible. And if clever data structures are used to limit the number of rectangles examined, more gains in processing speed are possible. However, with this approach accuracy may be compromised, because the rectangles returned from GetRegionScans are truly an approximation. As a result, pixel transformations near the GraphicsPath may not be 100 percent accurate, as shown in Figure 8 (note the pixels close to the red line). For my purposes, this is an acceptable tradeoff.
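As one example of such a data structure (a hypothetical refinement, not part of the sample), the rectangles returned by GetRegionScans form non-overlapping horizontal bands, so sorting them by Top and binary searching on a pixel's y coordinate limits each lookup to the few strips that span that scan line:

```csharp
using System;
using System.Drawing;

// Hypothetical helper: indexes the scan rectangles from GetRegionScans
// by their Top edge so that a pixel lookup examines only strips whose
// vertical band could contain the pixel, instead of every strip.
sealed class ScanIndex
{
    private readonly RectangleF[] _strips; // sorted by Top

    public ScanIndex(RectangleF[] strips)
    {
        _strips = (RectangleF[])strips.Clone();
        Array.Sort(_strips, (a, b) => a.Top.CompareTo(b.Top));
    }

    public bool Contains(float x, float y)
    {
        // Binary search for the first strip whose Top is greater than y;
        // only strips before that index can contain the pixel.
        int lo = 0, hi = _strips.Length;
        while (lo < hi)
        {
            int mid = (lo + hi) / 2;
            if (_strips[mid].Top <= y) lo = mid + 1; else hi = mid;
        }
        // Walk backwards through the candidates. Because region scans
        // form non-overlapping vertical bands, we can stop at the first
        // strip that ends at or above y.
        for (int i = lo - 1; i >= 0 && _strips[i].Bottom > y; i--)
        {
            if (x >= _strips[i].Left && x < _strips[i].Right)
                return true;
        }
        return false;
    }
}
```

Whether this beats the simple linear scan depends on the strip count; for a few hundred rectangles the difference is modest, but it shows how the per-pixel cost can be driven toward O(log n) rather than O(n) in the number of rectangles.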

Figure 8. GetRegionScans returns only an approximation

Conclusions

There are many additional features you could add to this application, including ones that make use of the Tablet PC APIs. You could also port the core image manipulation code to an add-in for existing imaging applications. I look forward to seeing what you come up with. Enjoy!

 

About the author

Stephen Toub is a Technical Lead on the MSDN team at Microsoft. He is the Technical Editor for MSDN Magazine and is the author of its .NET Matters column.
