Reprinted with permission from Visual Basic Programmer's Journal, July 2001, Volume 11, Issue 7, Copyright 2001, Fawcette Technical Publications, Palo Alto, CA, USA. To subscribe, call 1-800-848-5523, 650-833-7100, visit www.vbpj.com, or visit The Development Exchange.
Getting Started
Manipulate Strings Faster in VB.NET
VB.NET offers two objects to help you with your string-handling needs. Take a look at how they measure up.
by Carl Franklin
I like to do a few benchmark tests when first learning a new development environment to see how the new stuff stacks up against my platform of choice. I found something that might surprise you: VB.NET, although slower than VB6 where Windows Forms are concerned, can do a few things much faster than VB6, such as string manipulation. In this article, I'll introduce and compare two string-related objects in the .NET Framework: String, the standard string object akin to VB6's String object, and .NET's new StringBuilder object.
I'll first refresh your memory on string manipulation in VB6. You assign a string just as you did in previous versions: declare the variable with a Dim statement, assign it the value "Hello," and then append the word "there" to it. VB6 creates a new copy of the string each time you assign it to something. In this three-statement example, VB6 creates the string three times: once on the Dim statement, once on the second (Hello) statement, and again on the third statement, which appends the word "there" and returns a completely new string.

The fact that VB makes multiple copies of your string variables might not seem like a big deal with such a small example, but consider a program that manipulates large e-mail message strings. If the average size of an e-mail message is 1 MB, your program takes up roughly another 1 MB of memory every time it appends to an e-mail message.

You might wonder, "What happens to all the dead strings when they're not in use?" Good question. A process called garbage collection runs periodically and releases the memory taken by strings no longer in use. Garbage collection is built into the .NET Framework and works the same way no matter which .NET language you write in. Even if you use the standard String class in VB.NET, you still face the problem of copying string data constantly, which slows everything down.

The same code written in VB.NET can use the String class's Concat method, which concatenates two strings. It looks a little different from the VB6 version, but the end result is the same: VB.NET copies the original string data into a new string, returns the new string, and leaves the original for garbage collection, sort of like bringing the trashcan to the corner for pickup.

A Whole New String Game

"[StringBuilder] is convenient for situations in which it is desirable to modify a string, perhaps by removing, replacing, or inserting characters, without creating a new string subsequent to each modification. The methods contained within this class do not return a new StringBuilder object unless specified otherwise."
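To make the contrast between copying and in-place modification concrete, here is a minimal sketch. It is not one of the article's listings, and it assumes a current VB.NET compiler rather than the beta the article describes. The module and function names are hypothetical. It checks two things: that String.Concat returns a different object and leaves the original untouched, and that StringBuilder.Append returns the same builder instance it was called on.

```vbnet
Imports System
Imports System.Text

Module CopyVersusInPlace
    ' Returns True when String.Concat hands back a different object
    ' than the string it started from, leaving the original unchanged.
    Function ConcatMakesNewString() As Boolean
        Dim original As String = "Hello"
        Dim combined As String = String.Concat(original, " there")
        Return original = "Hello" AndAlso
               Not Object.ReferenceEquals(original, combined)
    End Function

    ' Returns True when StringBuilder.Append hands back the very same
    ' StringBuilder instance it was called on.
    Function AppendReusesBuilder() As Boolean
        Dim builder As New StringBuilder("Hello")
        Dim returned As StringBuilder = builder.Append(" there")
        Return Object.ReferenceEquals(builder, returned) AndAlso
               builder.ToString() = "Hello there"
    End Function

    Sub Main()
        Console.WriteLine(ConcatMakesNewString())
        Console.WriteLine(AppendReusesBuilder())
    End Sub
End Module
```

Reference equality is used here only as a way to observe whether a new object was created; ordinary code would compare string values instead.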
So you should create and manipulate your strings as StringBuilder objects, then convert them to a string using the built-in ToString method if you need a String object. The StringBuilder resembles a String object, but it isn't one. The same operation with the StringBuilder class creates a StringBuilder object and calls its Append method in place of Concat.

Note that you can't simply assign a string such as "Hello" directly to a StringBuilder variable. That tries to assign a String object to a StringBuilder object, which VB.NET does not allow. This is just an example of how strongly typed VB.NET is. You can relax VB.NET's strict type checking by placing Option Strict Off at the top of your modules (see the help file for details), although that won't make a String assignable to a StringBuilder.

The difference between using the String and StringBuilder classes is that StringBuilder's Append method works on the same data in memory and doesn't make a copy. Again, when you work with large strings (such as XML recordsets, encoded files, or encrypted files), copying strings could present a performance problem. Let's see exactly how these two classes compare with a little benchmark program.
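The StringBuilder usage just described might look like the following sketch. It is an illustration under assumptions, not the article's original snippet: the names BuilderBasics, BuildGreeting, and greeting are hypothetical, and the code targets a current VB.NET compiler. The commented-out line shows the kind of assignment the compiler rejects.

```vbnet
Imports System
Imports System.Text

Module BuilderBasics
    Function BuildGreeting() As String
        ' Dim greeting As StringBuilder = "Hello"
        ' The line above does not compile: a String is not a StringBuilder.

        ' Pass the initial text to the constructor instead.
        Dim greeting As New StringBuilder("Hello")

        ' Append modifies the builder's own buffer in place.
        greeting.Append(" there")

        ' Convert to a real String only when one is needed.
        Return greeting.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BuildGreeting())
    End Sub
End Module
```

Passing the starting text to the constructor is one option; creating an empty builder and calling Append for the first piece works equally well.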
The sample project, StringReplace, creates a long string of numbers separated by tab characters and replaces all the tabs with spaces (see Figure 1). The app times the string creation and replacement processes for both the String class (see Listing 1) and StringBuilder class (see Listing 2).

Re-create the VB.NET project by creating a new Windows Application and adding instructional labels if desired (they're not required). Add a textbox to the form, name it txtCount, and set the Text property to 10000. Add two buttons to the form: Name the first button btnStringBuilder and set its Text property to StringBuilder; name the second button btnString and set its Text property to String. Add a label next to btnStringBuilder and name it lblCreate (Create Time); add a label next to btnString and name it lblModify (Modify Time). Type the code from Listing 1 and Listing 2 into btnString's and btnStringBuilder's Click event handlers, respectively. You might notice that controls placed on the form through cut-and-paste have their Visible property set to False by default, rendering them invisible. This is a bug in beta 1.

Go From Zero to String in Seconds

So, let me tell you what's going on in this code. Note the line in Listing 1 that prefixes DateInterval with Microsoft.VisualBasic. I could've written DateInterval on its own just as easily: Microsoft.VisualBasic is already in scope, and DateInterval belongs to Microsoft.VisualBasic, so you don't have to prefix DateInterval with Microsoft.VisualBasic. But I want you to see that you can't just pull objects out of thin air; they have to be in scope. I'll talk more about this in another article.

The first line of code after the Dim statements sets the Interval variable to DateInterval.Second. DateInterval is an enumeration, and Second is one of its members, which you access with the same dot syntax you'd use for a property. That's a little different from what you're used to, but it'll get easier to understand the more code you write.
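As a rough sketch of the scoping point and the timing technique, the snippet below declares an interval using the fully qualified type name and assigns it using the short name, then passes it to DateDiff. This is not the article's Listing 1. ElapsedSeconds and TimingSketch are hypothetical names, a fixed date literal stands in for a real start time, and the code assumes a current VB.NET compiler with a reference to Microsoft.VisualBasic.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module TimingSketch
    ' Whole seconds elapsed from startTime to endTime.
    Function ElapsedSeconds(startTime As Date, endTime As Date) As Long
        ' The type is written with its full name; the value uses the short
        ' name, which resolves because Microsoft.VisualBasic is imported.
        Dim interval As Microsoft.VisualBasic.DateInterval = DateInterval.Second

        ' Earlier time first, later time last.
        Return DateDiff(interval, startTime, endTime)
    End Function

    Sub Main()
        Dim startTime As Date = #7/1/2001 12:00:00 PM#
        Dim endTime As Date = startTime.AddSeconds(5)

        ' Correct argument order gives a positive count.
        Console.WriteLine(ElapsedSeconds(startTime, endTime))

        ' Swapping the arguments gives a negative count.
        Console.WriteLine(ElapsedSeconds(endTime, startTime))
    End Sub
End Module
```

Measuring in whole seconds matches the article's approach. For short runs, a finer-grained interval would give more useful numbers.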
Next, the code sets the Max variable to the value the user specifies in the txtCount control. Max is an Integer, and an Integer in VB.NET is equivalent to a Long in VB6: 32 bits, or 4 bytes. In VB.NET, a Long is actually a 64-bit integer value. Again, notice that the text must be converted explicitly to an Integer.

The next line of code declares the StartTime variable and sets its value to the current time by reading the Date type's Now property. That's a great feature of VB.NET: You can assign a value to a variable at the same time you create it.

The next block of code goes through a loop, adding the number (i) to the string each time as it increments i. Each time through the loop, the variable AddThis contains the number in string form followed by a tab character (ASCII value of 9). Then the Concat method adds AddThis to MyString, a variable made from the String class, and returns a new string, which is assigned back to the same variable. In VB6, you'd write the same loop with the & concatenation operator.

Next, the code calculates the difference between the current time and start time using the DateDiff function and displays it in the label named lblCreate. DateDiff takes an interval, which you've initialized to "seconds," and two dates. It returns the difference. Make sure you pass the later time last, or it will return a negative number. Use the same timing technique for the next section, which replaces each tab in the string with a space. The resulting time is displayed in the lblModify label.

Now look at the same process, but use the StringBuilder class instead of the String class (see Listing 2). For time's sake (yours as well as mine), I'll tell you only what's different. First note the creation of the StringBuilder object. The StringBuilder class belongs to the System.Text namespace, so you must spell out the full class name, System.Text.StringBuilder, unless you import the namespace. The loop that appends each number uses the StringBuilder class's Append method, which adds a string to the existing string.
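The two loops described above can be sketched side by side as follows. This is an illustrative console version, not the article's Listing 1 or Listing 2: it has no form, labels, or txtCount control, and the literal "10000" stands in for the user's input. BuildWithString and BuildWithStringBuilder are hypothetical names, and the code assumes a current VB.NET compiler.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module StringReplaceSketch
    ' String version: each Concat and Replace call returns a new String.
    Function BuildWithString(max As Integer) As String
        Dim myString As String = ""
        For i As Integer = 1 To max
            ' The number in string form followed by a tab (ASCII 9).
            Dim addThis As String = i.ToString() & vbTab
            myString = String.Concat(myString, addThis)
        Next
        Return myString.Replace(vbTab, " ")
    End Function

    ' StringBuilder version: Append and Replace work on one buffer.
    Function BuildWithStringBuilder(max As Integer) As String
        Dim myString As New System.Text.StringBuilder()
        For i As Integer = 1 To max
            myString.Append(i.ToString() & vbTab)
        Next
        myString.Replace(vbTab, " ")
        ' ToString yields a real String from the builder's contents.
        Return myString.ToString()
    End Function

    Sub Main()
        ' Stand-in for the value typed into txtCount.
        Dim max As Integer = CInt("10000")

        Dim startTime As Date = Date.Now
        Dim viaString As String = BuildWithString(max)
        Dim stringSeconds As Long = DateDiff(DateInterval.Second, startTime, Date.Now)

        startTime = Date.Now
        Dim viaBuilder As String = BuildWithStringBuilder(max)
        Dim builderSeconds As Long = DateDiff(DateInterval.Second, startTime, Date.Now)

        Console.WriteLine("String: " & stringSeconds.ToString() & " s")
        Console.WriteLine("StringBuilder: " & builderSeconds.ToString() & " s")
        ' Both approaches should produce identical text.
        Console.WriteLine(viaString = viaBuilder)
    End Sub
End Module
```

The sketch times creation and replacement together for brevity, whereas the article's project reports them separately. Actual timings depend on the machine and runtime, so none are shown here.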
Note that you don't have to assign the result of Append back to the variable. You could, but because Append doesn't return a new string, this would be pointless. It's the same with the replace code: You don't have to return a new string because StringBuilder works on a single instance of the string data. Cool! Finally, to return a real String object, call the ToString method. You must do this if you want to access the string that StringBuilder creates.

Carl Franklin is a VB.NET specialist, developer, and trainer. He is the MSDN regional director for Hartford, Conn., a trainer with Deep Training (www.deeptraining.com), and an author for John Wiley & Sons. His company, franklins.NET (www.franklins.net), provides .NET Web hosting and co-location, training, and mentoring. E-mail Carl at carl@franklins.net.
