Easily Write Custom Gesture Recognizers for Your Tablet PC Applications

 

Scott Swigart
Swigart Consulting, LLC

November 2005

Applies to:
   Microsoft Windows XP Tablet PC Edition Platform SDK 1.7

Summary: Learn how to easily write a custom gesture recognizer with the Simple Gesture Recognition library. Readers should be familiar with the Microsoft Windows XP Tablet PC Edition Platform SDK 1.7. (15 printed pages)

Contents

Overview
Siger - At a Glance
   Siger Project
   SigerUnitTests Project
   InkTest Project
Building Custom Recognizers
   Vector Strings and Regular Expressions
   The Gesture, the Whole Gesture, and Nothing But the Gesture
   Comparing Statistics
   Open and Shut
   Plugging It In
   Velocity Matters
   Flip-Flop
Supported Languages
Gesture Usability
Conclusion
Additional Resources
Biography

Overview

Most applications are designed for interaction through a mouse or keyboard. However, when designing an application for Tablet PC, it's critical that you keep pen input and pen interaction at the forefront of your thinking. Though the pen provides direct interaction with the screen, traditional user interface metaphors, such as toolbar buttons and scroll bars, are typically small and sometimes difficult to target with the pen.

For this reason, Tablet PC supports gestures. With gestures, specific pen motions can initiate commands. For example, a quick up or down flick of the pen may cause a document to scroll in the indicated direction. A check-mark gesture may select the nearest item, rather than forcing the user to accurately point at and tap in a tiny check box.

The Tablet PC SDK ships with the ability to recognize a number of pre-defined application gestures. However, you may want to develop custom gestures that are specific to your application. For example, you might want a question mark gesture to display online Help, or a bracket gesture to group selected application items together.

Figure 1. Possible custom gestures

I wanted to construct a library that made it extremely simple to add custom gestures to an application. Ideally, a new gesture would only require a few lines of code, and not require the developer to use any complex algorithms or do any image processing.

The result is Siger—the Simple Gesture Recognition library. As you will see, with this library you can use simple regular expressions as the basis of your gesture recognition. The Siger solution is available as an open-source project from SourceForge.net. The solution is under the BSD license, so you are free to use and distribute it even in closed-source, commercial applications.

The first part of this article will orient you to the Siger project and show you what it includes. The second part of this article will walk you through building custom gesture recognizers for two gestures, the question mark and right bracket, equipping you with the information you need to build your own custom gesture recognizer using Siger.

Siger - At a Glance

The latest Siger source can be downloaded as a .zip file from SourceForge.net. When you extract the .zip file, you will find a root solution file called Siger.sln. When you open the solution file, you will see three projects: Siger, SigerUnitTests, and InkTest.

Figure 2. Siger solution structure

Siger Project

The main recognition engine is in the Siger project. This project contains the classes needed to break down an ink stroke, perform analysis, and apply various recognizers to it. To ensure the robustness of the Siger engine, I built sample recognizers that mimic the functionality of some of the Tablet PC built-in recognizers. Specifically, Siger can recognize the check mark, circle, curlicue, double-circle, double-curlicue, square, star, and triangle gestures. In addition, Siger includes a few gestures that are not part of the Tablet PC SDK: the rectangle, question mark, and right-bracket gestures.

SigerUnitTests Project

The unit testing project runs pre-saved ink strokes through the recognizer engine, checking two important pieces of functionality. First, it ensures that the recognizer correctly detects all shapes of a particular type (for example, the triangle recognizer returns "true" for all the triangle shapes). Second, and just as important, the unit tests ensure that the recognizer returns no false positives. For example, they ensure that the star recognizer doesn't also return "true" for a triangle gesture. In the development of this project, unit testing became critically important as more recognizers were created and the code was refactored. Unit testing made it instantly apparent when a recognizer was not correctly detecting shapes or was returning false positives for new shapes. Unit testing also made it apparent when refactoring broke previously working code.

Note   The unit tests are currently built for use with NUnit, a unit testing framework for all .NET languages. If you want to run the unit tests, you will need to download and install NUnit from www.NUnit.org. However, NUnit is not needed to build recognizers using the Siger engine or to use the InkTest project. In addition, with the release of Visual Studio 2005 Team System, the Siger unit tests will likely be converted to use the Team System unit testing functionality.

InkTest Project

When building custom gestures, it's useful to have a test bench where you can test gestures and determine whether they are correctly recognized. The InkTest project is such a test bench.

Figure 3. Simple Gesture Recognition Test Bench

With InkTest, you can do a number of things. First, when you test a gesture in the top-left pane, the application will attempt to recognize it using the existing Siger recognizers. For comparison, you can make gestures in the lower-left pane and the application will attempt to recognize them using the Tablet PC built-in recognizers.

When a gesture is made, statistical information about the gesture appears in the properties pane at the bottom right corner. When creating a recognizer for a new gesture, this statistical information is useful in determining your recognition strategy (in other words, figuring out what makes this gesture different from other gestures). This statistical information will be covered in detail later in this article.

In addition, with InkTest, you can load and save gestures from the File menu. This is useful when you've made a gesture that should be recognized, but for some reason is not recognized. You can work on your code and just keep re-loading the gesture until your recognizer is working correctly. You can also use saved gestures as part of the unit testing suite.

Building Custom Recognizers

Now that you know what functionality Siger makes available to you, you can examine two custom recognizers and see how they use specific gesture characteristics to aid in accurately recognizing each gesture.

Vector Strings and Regular Expressions

Assume that you want to create an application gesture that will display online Help when the user makes a question mark motion. This is not one of the gestures natively recognized by the Tablet PC, so you must build a custom recognizer. You will see that this is quite simple to do.

Writing a recognizer for the actual question mark shape bitmap would be relatively complex. Generally, this involves sending the pixels into a neural network that must be trained to recognize them. This process makes the development of custom gestures overly complex, so Siger takes a different approach: it converts the stroke into a vector string.

When a user makes a question mark stroke or gesture, the pen moves along a certain well-known path:

Figure 4. Question mark pen path

The pen starts off moving generally up (maybe up and to the right, maybe up and to the left, but generally up), then the pen moves over to the right, then down, and then to the left, and finally down again. To get at this directional information, Siger converts the points of the stroke into a series of vectors. Because a regular expression will eventually be used to match the stroke, the vectors are just encoded as a simple string. The following listing is an example of the question mark gesture vector string:

Listing 1. Vector string

LU,U,U,U,U,U,RU,RU,RU,RU,RU,RU,R,R,R,R,R,R,R,R,RD,RD,RD,
RD,D,D,D,D,LD,LD,LD,LD,L,L,LD,LD,LD,D,D,D,D,D,D,D,D,D,L

Table 1. Key to vector string

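Vector    Vector direction
L         Left
R         Right
U         Up
D         Down

Two-letter codes combine two directions to indicate a diagonal. For example, LU is left-up and RD is right-down.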

In this string, you can see that the first vector in the stroke indicates left-up, followed by a series of vectors indicating up, then right (including right-up, right, and right-down), then down, then left-down, and then down. While recognition of the pixel image would be difficult, recognition of this pattern of strings is easy.
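One way to picture the conversion from pen points to a vector string is the sketch below. It is not Siger's actual code: the module name, the ToVectorString function, and the rule of bucketing each segment into one of eight 45-degree sectors are all assumptions made for illustration. A real implementation would probably also smooth or resample the pen points first.

```vbnet
Imports System
Imports System.Collections.Generic

Module VectorStringSketch

    ' Eight direction codes, counter-clockwise, starting at "right".
    Private ReadOnly Codes() As String = {"R", "RU", "U", "LU", "L", "LD", "D", "RD"}

    ' Converts consecutive points into a comma-separated vector string.
    ' Screen coordinates are assumed, so Y grows downward.
    Public Function ToVectorString(ByVal xs() As Integer, ByVal ys() As Integer) As String
        Dim parts As New List(Of String)
        For i As Integer = 1 To xs.Length - 1
            Dim dx As Integer = xs(i) - xs(i - 1)
            Dim dy As Integer = ys(i - 1) - ys(i)   ' negated so that "up" is positive
            If dx = 0 AndAlso dy = 0 Then Continue For
            Dim angle As Double = Math.Atan2(dy, dx) * 180.0 / Math.PI
            If angle < 0 Then angle += 360.0
            ' Each sector is 45 degrees wide and centered on its direction.
            Dim sector As Integer = CInt(Math.Floor((angle + 22.5) / 45.0)) Mod 8
            parts.Add(Codes(sector))
        Next
        Return String.Join(",", parts.ToArray())
    End Function

    Sub Main()
        Dim xs() As Integer = {0, 0, 10, 20, 20}
        Dim ys() As Integer = {20, 10, 0, 0, 10}
        Console.WriteLine(ToVectorString(xs, ys))   ' U,RU,R,D
    End Sub
End Module
```

The sample points move up, then up and to the right, then right, then down, which is the opening of the question mark path described above.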

To generate the vector string, create an instance of the Siger.StrokeInfo class, and pass a Tablet PC Stroke object to the constructor. StrokeInfo decomposes the ink stroke into a vector string. The StrokeInfo class also exposes an IsMatch function that you can use to determine if the stroke matches a given regular expression pattern. The code to match a question mark is as follows:

Listing 2. Code to match a question mark

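' Decompose the first ink stroke into a vector string.
Dim strokeInfo As New StrokeInfo(InkPicture1.Ink.Strokes(0))

' Match the general path of a question mark: up, right, down, left, down.
If strokeInfo.IsMatch(Vectors.Ups & Vectors.Rights & _
    Vectors.Downs & Vectors.Lefts & Vectors.Downs) Then

    MessageBox.Show("Question Mark")

End If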

The Vectors class simplifies the process of building the regular expressions. Generally, you are only concerned about matching eight possible pen directions: right, up, left, down, right-up, right-down, left-up, and left-down. Vectors.Ups will match anything that is generally up, so it will match up, right-up, and left-up. Using the Vectors class makes it quick and easy to build the regular expression and makes your code readable. However, if the Vectors class limits you in any way, you can create any arbitrarily complex regular expression and pass it directly to the IsMatch function.
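The article does not show what the Vectors fragments expand to, so the following is only a guess at the idea. It assumes each fragment is a regular expression that matches one or more comma-terminated vectors pointing in a general direction. The constant values and the LooksLikeQuestionMark function are hypothetical, and the sketch relies only on System.Text.RegularExpressions.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module VectorPatternSketch

    ' Hypothetical fragments. Each matches one or more vectors in a general
    ' direction, and every vector is expected to be followed by a comma.
    Public Const Ups As String = "(?:(?:LU|U|RU),)+"
    Public Const Rights As String = "(?:(?:RU|R|RD),)+"
    Public Const Downs As String = "(?:(?:LD|D|RD),)+"
    Public Const Lefts As String = "(?:(?:LU|L|LD),)+"

    Public Function LooksLikeQuestionMark(ByVal vectorString As String) As Boolean
        ' The leading group keeps the match aligned to whole vectors.
        Dim pattern As String = "(?:^|,)" & Ups & Rights & Downs & Lefts & Downs
        ' Append a comma so the last vector is terminated like all the others.
        Return Regex.IsMatch(vectorString & ",", pattern)
    End Function

    Sub Main()
        Dim stroke As String = "LU,U,U,RU,RU,R,R,R,RD,RD,D,D,LD,LD,L,L,LD,D,D,D,L"
        Console.WriteLine(LooksLikeQuestionMark(stroke))          ' True
        Console.WriteLine(LooksLikeQuestionMark("R,R,R,D,D,D"))   ' False
    End Sub
End Module
```

Because the diagonal codes appear in two fragments (RU is both an "up" and a "right"), the regular expression engine is free to decide where one direction ends and the next begins, which is what makes the pattern tolerant of sloppy strokes.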

The Gesture, the Whole Gesture, and Nothing But the Gesture

The previous code will work, but it's not entirely robust. It will match a question mark pattern anywhere in the stroke, so even the following stroke would be matched as a question mark gesture:

Figure 5. Gesture that shouldn't match, but does

Therefore, it is important to match the entire vector string, not just a portion of it. This, however, introduces another problem. If you look closely at the question mark in Figure 4, you notice that at the bottom of the question mark, the line jags suddenly to the left. It is common for the pen to make little tick marks like this when the pen touches down, or when the pen is lifted up. These ticks must be ignored to correctly recognize the stroke. To facilitate this, a couple of regular expression fragments are available from the Vectors class that match the beginning and the end of the stroke, and trim off any tick marks. With them, the following code does a much better job of matching a question mark.

Listing 3. Better question mark matching

Dim strokeInfo As New StrokeInfo(InkPicture1.Ink.Strokes(0))

' StartTick and EndTick anchor the pattern to the whole stroke
' and absorb any small tick at either end.
If strokeInfo.IsMatch(Vectors.StartTick & Vectors.Ups & _
    Vectors.Rights & Vectors.Downs & Vectors.Lefts & _
    Vectors.Downs & Vectors.EndTick) Then

    MessageBox.Show("Question Mark")

End If

Notice the inclusion of Vectors.StartTick and Vectors.EndTick, which match the beginning and end of the vector string and trim off the ticks if present.
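To make the anchoring idea concrete, here is a minimal sketch in which the start and end fragments pin the pattern to the ends of the vector string and tolerate up to two stray vectors at each end. The fragment definitions and the limit of two vectors are assumptions; the article does not show how Siger defines StartTick and EndTick.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WholeStrokeSketch

    ' All fragments below are assumed shapes, not Siger's real constants.
    Private Const AnyVector As String = "(?:[LR]?[UD]|[LR]),"
    Private Const StartTick As String = "^(?:" & AnyVector & "){0,2}?"   ' anchor, skip up to 2 vectors
    Private Const EndTick As String = "(?:" & AnyVector & "){0,2}$"      ' skip up to 2 vectors, anchor
    Private Const Ups As String = "(?:(?:LU|U|RU),)+"
    Private Const Rights As String = "(?:(?:RU|R|RD),)+"
    Private Const Downs As String = "(?:(?:LD|D|RD),)+"
    Private Const Lefts As String = "(?:(?:LU|L|LD),)+"

    Public Function IsWholeQuestionMark(ByVal vectorString As String) As Boolean
        Dim pattern As String = StartTick & Ups & Rights & Downs & Lefts & Downs & EndTick
        ' Append a comma so the last vector is terminated like all the others.
        Return Regex.IsMatch(vectorString & ",", pattern)
    End Function

    Sub Main()
        ' A question mark that ends with a small tick to the left.
        Dim question As String = "LU,U,U,RU,R,R,RD,D,D,LD,L,LD,D,D,D,L"
        Console.WriteLine(IsWholeQuestionMark(question))                        ' True
        ' The same path preceded by a long right-then-down lead-in.
        Console.WriteLine(IsWholeQuestionMark("R,R,R,R,D,D,D,D," & question))   ' False
    End Sub
End Module
```

The second call fails because the lead-in is longer than the two vectors the start fragment is allowed to skip, so the stroke as a whole no longer reads as a question mark.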

Comparing Statistics

In some cases, it's difficult to accurately recognize a gesture using just a regular expression. Consider a square and a circle. For both, the pen moves generally right, down, left, and up. However, when a circle is parsed into the vector string, the vector string will contain many more diagonal vectors (such as right-down, left-up, and so on) than a square would. This is where statistical information, mentioned previously, can improve recognition. The StrokeInfo class exposes additional properties that let you quickly examine these statistical characteristics of the stroke:

Table 2. Properties exposed by the StrokeInfo class

Property           Description
Right              Percentage of the stroke that consists of the pen moving to the right.
Left               Percentage of the stroke that consists of the pen moving to the left.
Up                 Percentage of the stroke that consists of the pen moving up.
Down               Percentage of the stroke that consists of the pen moving down.
RightUp            Percentage of the stroke that consists of the pen moving diagonally up and to the right.
RightDown          Percentage of the stroke that consists of the pen moving diagonally down and to the right.
LeftUp             Percentage of the stroke that consists of the pen moving diagonally up and to the left.
LeftDown           Percentage of the stroke that consists of the pen moving diagonally down and to the left.
Straight           Percentage of the stroke that is straight (that is, the sum of Right, Left, Up, and Down).
Diagonal           Percentage of the stroke that is made of diagonal lines (that is, the sum of RightUp, RightDown, LeftUp, and LeftDown).
StartEndProximity  Distance between the start point of the stroke and the end point of the stroke. Used to determine if the stroke is a closed shape, like a circle, or an open shape, like a check mark.
StopPoints         Essentially, the number of corners in the shape, counting the start point and end point as corners.

While a circle and square may both be matched by the same regular expression, a square should have a high value for the Straight property. For a circle, on the other hand, both the Straight and Diagonal properties should be roughly fifty percent.
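As a rough illustration of how such a statistic separates the two shapes, the sketch below computes the diagonal share of a vector string by counting two-letter codes. This is an assumption about the calculation, not Siger's code: the real library may weight each vector by its length, and the DiagonalFraction name is hypothetical.

```vbnet
Imports System

Module StrokeStatsSketch

    ' Returns the fraction (0.0 to 1.0) of vectors that are diagonal,
    ' that is, the vectors written with a two-letter code.
    Public Function DiagonalFraction(ByVal vectorString As String) As Double
        Dim codes() As String = vectorString.Split(","c)
        Dim diagonal As Integer = 0
        For Each code As String In codes
            If code.Length = 2 Then diagonal += 1
        Next
        ' The / operator always performs floating-point division in Visual Basic.
        Return diagonal / codes.Length
    End Function

    Sub Main()
        Dim squarish As String = "R,R,R,RD,D,D,D,LD,L,L,L,LU,U,U,U,RU"
        Dim roundish As String = "R,RD,RD,D,D,LD,LD,L,L,LU,LU,U,U,RU,RU,R"
        Console.WriteLine(DiagonalFraction(squarish))   ' 4 of 16 vectors are diagonal
        Console.WriteLine(DiagonalFraction(roundish))   ' 8 of 16 vectors are diagonal
    End Sub
End Module
```

Both sample strings move right, down, left, and up, so a direction-only regular expression cannot tell them apart, but their diagonal shares differ by a factor of two.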

Open and Shut

Circles and squares are closed shapes, while question marks and brackets are open shapes. It is easy to determine whether a stroke is closed or open, and this characteristic is another key to achieving accurate gesture recognition. The StartEndProximity property provides the distance between the stroke start point and the stroke end point. If it falls below a certain threshold, which is defined by the CLOSED_PROXIMITY constant, then the shape is considered closed.

Listing 4. Ensuring that the stroke is an open shape

If StrokeInfo.StrokeStatistics.StartEndProximity > CLOSED_PROXIMITY _
    AndAlso StrokeInfo.IsMatch(Vectors.StartTick & Vectors.Ups & _
        Vectors.Rights & Vectors.Downs & Vectors.Lefts & _
        Vectors.Downs & Vectors.EndTick) Then

    MessageBox.Show("Question Mark")

End If

If StartEndProximity is greater than the constant CLOSED_PROXIMITY, then the shape is considered open. Notice that the code checks for a closed shape before performing the regular expression match. This is intentional, because checking for the closed shape is fast compared to the regular expression match. Also note that "AndAlso" is used to make this a short-circuit Boolean evaluation, whereas a plain "And" would evaluate both operands (in C#, the && operator already short-circuits). In general, do quick comparisons first and use short-circuit logic so that the recognizer spends as little time as possible when rejecting a stroke.
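The open-or-closed test can be pictured as a plain distance comparison, as in the following sketch. The threshold value, the function names, and the use of straight-line distance are assumptions; the article does not give the value of CLOSED_PROXIMITY or say whether Siger scales it to the size of the stroke.

```vbnet
Imports System

Module OpenShapeSketch

    ' Assumed threshold, in the same units as the point coordinates.
    ' The real value of Siger's CLOSED_PROXIMITY constant is not shown in this article.
    Private Const ClosedProximity As Double = 50.0

    ' Straight-line distance between the first and last point of a stroke.
    Public Function StartEndDistance(ByVal startX As Double, ByVal startY As Double, _
                                     ByVal endX As Double, ByVal endY As Double) As Double
        Dim dx As Double = endX - startX
        Dim dy As Double = endY - startY
        Return Math.Sqrt(dx * dx + dy * dy)
    End Function

    ' A stroke is treated as open when its ends are farther apart than the threshold.
    Public Function IsOpenShape(ByVal startX As Double, ByVal startY As Double, _
                                ByVal endX As Double, ByVal endY As Double) As Boolean
        Return StartEndDistance(startX, startY, endX, endY) > ClosedProximity
    End Function

    Sub Main()
        Console.WriteLine(IsOpenShape(0, 0, 3, 4))       ' False: the ends are 5 units apart
        Console.WriteLine(IsOpenShape(0, 0, 300, 400))   ' True: the ends are 500 units apart
    End Sub
End Module
```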

Plugging It In

To determine just which gesture the user made, you will often want to test the stroke with a number of recognizers. Siger contains a number of classes to make this simple. The first step is to create a recognizer class that inherits the CustomGesture base class.

Listing 5. Creating a custom recognizer class

Public Class Question
    Inherits CustomGesture

    Public Sub New()
        MyClass.New(Nothing)
    End Sub

    Public Sub New(ByVal strokeInfo As StrokeInfo)
        MyBase.New(strokeInfo)
        Name = "Question"
    End Sub

    Protected Overrides Function Recognize() As Boolean
        Return _
            StrokeInfo.StrokeStatistics.StartEndProximity > CLOSED_PROXIMITY _
            AndAlso StrokeInfo.IsMatch(Vectors.StartTick & Vectors.Ups & _
                Vectors.Rights & Vectors.Downs & Vectors.Lefts & _
                Vectors.Downs & Vectors.EndTick)
    End Function

End Class

You can see that not much is involved in creating a custom recognizer class. It contains two constructors: one that takes no arguments, and another that takes a StrokeInfo object. You also override the Recognize function, where you put your custom recognition logic. Write this function to return "true" if it recognizes the stroke.

To use multiple recognizers against a stroke, create an instance of the SigerRecognizer and load it up with each recognizer, as shown in the following code example:

Listing 6. Initializing the SigerRecognizer for specific shapes

Dim customReco As New SigerRecognizer
customReco.RecognizerList.Add(New ScratchOut)
customReco.RecognizerList.Add(New Triangle)
customReco.RecognizerList.Add(New Star)
customReco.RecognizerList.Add(New Question)

When the user makes a stroke, simply pass it to the SigerRecognizer and see which custom recognizers indicate a match. For the InkTest application, the matches are shown in a text box.

Listing 7. Looking for a match

Dim stroke As Stroke = inkPict.Ink.Strokes(0)
Dim hits() As CustomGesture = customReco.Recognize(stroke)

For Each gesture As CustomGesture In hits
    TextBox1.Text &= gesture.Name & vbCrLf
Next
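The pattern behind these listings, a list of recognizers that each get a chance to claim the stroke, can be sketched without the Siger library as follows. The PatternGesture class, the MatchingNames function, and the two sample patterns are hypothetical stand-ins, not Siger types.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RecognizerListSketch

    ' Minimal stand-in for a gesture recognizer: a name plus a regular expression.
    Public Class PatternGesture
        Public ReadOnly Name As String
        Private ReadOnly _pattern As String

        Public Sub New(ByVal gestureName As String, ByVal pattern As String)
            Name = gestureName
            _pattern = pattern
        End Sub

        Public Function Recognize(ByVal vectorString As String) As Boolean
            ' Append a comma so every vector is comma-terminated.
            Return Regex.IsMatch(vectorString & ",", _pattern)
        End Function
    End Class

    ' Runs every recognizer and collects the names of those that match.
    Public Function MatchingNames(ByVal recognizers As List(Of PatternGesture), _
                                  ByVal vectorString As String) As List(Of String)
        Dim hits As New List(Of String)
        For Each gesture As PatternGesture In recognizers
            If gesture.Recognize(vectorString) Then hits.Add(gesture.Name)
        Next
        Return hits
    End Function

    Sub Main()
        Dim recognizers As New List(Of PatternGesture)
        recognizers.Add(New PatternGesture("Right Bracket", "^(?:R,)+(?:D,)+(?:L,)+$"))
        recognizers.Add(New PatternGesture("Down Flick", "^(?:D,)+$"))
        For Each hit As String In MatchingNames(recognizers, "R,R,D,D,D,L,L")
            Console.WriteLine(hit)   ' Right Bracket
        Next
    End Sub
End Module
```

Returning every match, rather than stopping at the first one, leaves it to the application to decide what to do when two recognizers claim the same stroke.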

Velocity Matters

Now that you've examined the completed recognizer for the question mark gesture, you'll see that building a recognizer for a bracket is slightly more complex. Consider the gestures in Figure 6. Starting from the top, both gestures move generally to the right, then down, and then left. However, one is clearly a bracket, and the other is an arc.

Figure 6. Bracket and arc

You can use a regular expression to start to recognize the bracket, but you must look at additional information to differentiate it from other shapes. As previously discussed, the StrokeInfo class exposes Straight and Diagonal properties which are beneficial here. Less than ten percent of the vectors in the bracket would be diagonal, but in the arc more than ten percent of the vectors would be diagonal.

There's one other non-visible aspect of the stroke that can aid in recognition. To draw or gesture a corner, the user must slow the pen motion. If you attempt to draw or gesture a corner without letting the pen slow down, you end up with an arc. Figure 7 shows graphs of the pen velocity for the bracket and arc:

Figure 7. Pen velocity graphs for the bracket and arc respectively

When making the bracket gesture, the user moves the pen from the starting point and starts accelerating. However, when it is time to turn the first corner, the pen motion slows dramatically. It speeds away from the first corner, and then screeches almost to a stop as it makes the second corner. It speeds up again, and then slows as the user completes the bracket.

The pen velocity for the arc looks very different. The user continues to accelerate the pen motion all the way around the arc, and then slows just before the end.

Through the StrokeInfo.StrokeStatistics.StopPoints property, you can determine how many times the pen slowed down. Counting the beginning and the end of the stroke and the two corners in the middle, the bracket should have four stop points.
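One simple way to count stop points from a series of pen speed samples is to count the separate runs of samples that fall below a "slow" threshold. The sketch below does that. The function name, the threshold, and the sample speeds are made up for illustration, and Siger's own calculation may differ.

```vbnet
Imports System

Module StopPointSketch

    ' Counts separate runs of consecutive samples whose speed is below slowThreshold.
    Public Function CountStopPoints(ByVal speeds() As Double, _
                                    ByVal slowThreshold As Double) As Integer
        Dim count As Integer = 0
        Dim inSlowRun As Boolean = False
        For Each speed As Double In speeds
            Dim isSlow As Boolean = speed < slowThreshold
            ' A new stop point begins when the pen goes from fast to slow.
            If isSlow AndAlso Not inSlowRun Then count += 1
            inSlowRun = isSlow
        Next
        Return count
    End Function

    Sub Main()
        ' Bracket: slow at the start, at both corners, and at the end.
        Dim bracket() As Double = {1, 6, 9, 2, 7, 9, 1, 8, 9, 3, 1}
        ' Arc: slow only at the start and at the end.
        Dim arc() As Double = {1, 5, 8, 9, 10, 10, 9, 8, 6, 2}
        Console.WriteLine(CountStopPoints(bracket, 4))   ' 4
        Console.WriteLine(CountStopPoints(arc, 4))       ' 2
    End Sub
End Module
```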

Note   The Tablet PC SDK will also let you identify the "cusps" of a stroke. A cusp is the "point on the stroke where the direction of writing changes in a discontinuous fashion." I haven't found cusps to be as predictive as the velocity signature, but it's possible that a combination of the two, or a variation on cusps, would yield even higher accuracy.

Flip-Flop

A user will most likely make the bracket gesture starting from the top left. However, as shown in Figure 8, a bracket gesture can also be made starting at the bottom left, in which case the pen moves right, up, and then left.

Figure 8. Both are valid brackets

You wouldn't want to write custom matching code for every direction a shape could conceivably be drawn, so the base CustomGesture class can automatically rotate and flip the stroke while looking for a match.

At this point, the analysis of the bracket has been done, and the recognizer can be coded. The recognition logic is just one (somewhat long) line of code:

Listing 8. RightBracket recognizer

Protected Overrides Function Recognize() As Boolean
    Return _
        StrokeInfo.StrokeStatistics.StartEndProximity > CLOSED_PROXIMITY _
        AndAlso StrokeInfo.StrokeStatistics.StopPoints = 4 _
        AndAlso StrokeInfo.StrokeStatistics.Square > 0.9 _
        AndAlso StrokeInfo.IsMatch(Vectors.StartTick & Vectors.Rights & _
            Vectors.Downs & Vectors.Lefts & Vectors.EndTick, _
            0, False, True)
End Function

The IsMatch function looks for a stroke that moves right, down, and left. If the stroke doesn't match, the additional arguments to the IsMatch function indicate that the Y-coordinates should be flipped, and the match tried again. This handles the two ways in which the user could make the bracket gesture.
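On a vector string, flipping the Y-coordinates amounts to swapping every "up" with "down". The sketch below shows that idea with a hypothetical FlipVertical helper and an assumed bracket pattern. It does not reproduce how Siger's IsMatch arguments work internally.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FlipSketch

    ' Assumed pattern: rights, then downs, then lefts, spanning the whole stroke.
    Private Const BracketPattern As String = _
        "^(?:(?:RU|R|RD),)+(?:(?:LD|D|RD),)+(?:(?:LU|L|LD),)+$"

    ' Mirrors a vector string top to bottom by swapping every U with D.
    Public Function FlipVertical(ByVal vectorString As String) As String
        Return vectorString.Replace("U", "#").Replace("D", "U").Replace("#", "D")
    End Function

    ' Tries the stroke as drawn, then tries it again flipped.
    Public Function IsRightBracket(ByVal vectorString As String) As Boolean
        Return Regex.IsMatch(vectorString & ",", BracketPattern) _
            OrElse Regex.IsMatch(FlipVertical(vectorString) & ",", BracketPattern)
    End Function

    Sub Main()
        Console.WriteLine(FlipVertical("R,R,RU,U,U,LU,L,L"))     ' R,R,RD,D,D,LD,L,L
        Console.WriteLine(IsRightBracket("R,R,RD,D,D,LD,L,L"))   ' True (drawn from the top)
        Console.WriteLine(IsRightBracket("R,R,RU,U,U,LU,L,L"))   ' True (drawn from the bottom)
        Console.WriteLine(IsRightBracket("U,U,R,R,D,D"))         ' False
    End Sub
End Module
```

Flipping the input rather than writing a second pattern keeps a single definition of the gesture, which is the same reason the article gives for letting the base class do the flipping.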

The bracket is not a closed shape, and the StartEndProximity check confirms this. A bracket should also have exactly four stop points and a low percentage of diagonal lines; the StopPoints and Square checks in the listing cover these two conditions. As you can see in Figure 9, this code correctly recognizes the right bracket.

Figure 9. Right bracket is recognized

Supported Languages

There's one more topic worth discussing. The Siger engine is written in Microsoft Visual Basic .NET, but you are free to write recognizers in any .NET language you want. Simply add a reference to the Siger project or assembly, and create a class that inherits CustomGesture. Here's the right-bracket recognizer implemented in C#.

Listing 9. RightBracket recognizer in C#

public class CSRightBracket : CustomGesture
{
    public CSRightBracket() : this(null) {}

    public CSRightBracket(StrokeInfo si) : base(si)
    {
        Name = "CS Right Bracket";
    }

    protected override bool Recognize()
    {
        return (
            StrokeInfo.StrokeStatistics.StartEndProximity > CLOSED_PROXIMITY
            && StrokeInfo.StrokeStatistics.StopPoints == 4
            && StrokeInfo.StrokeStatistics.Square > 0.9
            && StrokeInfo.IsMatch(Vectors.StartTick + Vectors.Rights +
                Vectors.Downs + Vectors.Lefts + Vectors.EndTick,
                0, false, true));
    }
}

Gesture Usability

Having a library for recognition is but one aspect of integrating gestures into your application. If you are going to implement custom gestures, you must consider how you'll make them discoverable to your users. There's little point in implementing gestures if your users never know they exist.

For the user, it can take a little practice to make accurate gestures. Having the gesture visible may make it easier for the user to realize how to accurately draw your custom shape.

Finally, applications that respond to gestures also likely allow ink input. How will your application differentiate between handwriting (that is, ink input) and gesturing? Some applications require the user to hold down the barrel button while gesturing. Others may provide a specific area for gestures. Others may even require that the user explicitly enter a "gesture mode."

For more thoughts on gesture usability, see Using Gestures in Tablet PC Applications by Mark Hopkins.

Conclusion

In this article, you have seen how you can use Siger for custom gesture recognition. The basis for recognition is a simple regular expression. Siger provides additional statistics about the ink stroke, which can improve the recognition accuracy. In most cases, you need only write a few lines of code to recognize a gesture. Again, recognizing gestures is only part of the problem. You need to ensure that they are incorporated in a usable and discoverable way.

Feel free to download Siger, use it in any way you want, and freely distribute it as part of your Tablet PC applications. You can also post suggestions or bugs to the SourceForge project page.

Additional Resources

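The following resources were mentioned in this article:
   Siger project page on SourceForge.net
   NUnit unit testing framework (www.NUnit.org)
   Using Gestures in Tablet PC Applications by Mark Hopkins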

Biography

Scott Swigart, owner of Swigart Consulting LLC, spends his time helping organizations get the most out of today's technology, while preparing to leverage tomorrow's. In addition to consulting and training, Scott Swigart has authored many articles and books about .NET development. Feel free to contact Scott with any questions or comments at scott@swigartconsulting.com.