
Reprinted with permission from Visual Basic Programmer's Journal, July 2001, Volume 11, Issue 7, Copyright 2001, Fawcette Technical Publications, Palo Alto, CA, USA. To subscribe, call 1-800-848-5523, 650-833-7100, visit www.vbpj.com, or visit The Development Exchange.

Visual Studio 6.0
This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

July 2001

Getting Started

Manipulate Strings Faster in VB.NET

VB.NET offers two objects to help you with your string-handling needs. Take a look at how they measure up.

by Carl Franklin


I like to run a few benchmark tests when first learning a new development environment to see how the new stuff stacks up against my platform of choice. I found something that might surprise you: VB.NET, although slower than VB6 where Windows Forms are concerned, can do a few things much faster than VB6, such as string manipulation. In this article, I'll introduce and compare two string-related classes in the .NET Framework: String, the standard string class akin to VB6's String data type, and .NET's new StringBuilder class.

What you need:

VB.NET

I'll first refresh your memory on string manipulation in VB6. You assign a string like this, as you did in previous versions:

Dim MyString As String
MyString = "Hello"
MyString = MyString & " there"

VB6 creates a new copy of the string each time you assign it to something. In this example, VB6 creates the string three times: once on the Dim statement, once on the second (Hello) statement, and again on the third statement, which appends the word "there" and returns a completely new string.

The fact that VB makes multiple copies of your string variables might not seem like a big deal with such a small example, but consider a program, for example, that manipulates large e-mail message strings. If the average size of an e-mail message is 1 MB, your program takes up another 1 MB of memory every time it appends to an e-mail message.

You might wonder, "What happens to all the dead strings when they're not in use?" Good question. In .NET, a process called garbage collection runs periodically: the runtime goes through managed memory and releases the memory taken by objects, strings included, that are no longer in use. Garbage collection is built into the .NET Framework and applies no matter which .NET language you write in.

If you use the standard String class in VB.NET, you face the same problem: string data is copied constantly, which slows everything down. Here's the same code written in VB.NET:

Dim MyString As String
MyString = "Hello"
MyString = String.Concat(MyString, " there")

The String class's Concat method concatenates two strings. It looks a little different from the previous VB6 code snippet, but the end result is the same. Concat allocates a new string that holds both pieces and returns it; the original string, once nothing refers to it anymore, becomes eligible for garbage collection, sort of like bringing the trashcan to the corner for pickup.
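To see that Concat leaves the original string untouched and hands back a different object, a minimal console sketch might look like this. The module name ConcatDemo and the helper AppendThere are made up for illustration; they are not part of the article's sample project.

```vbnet
Imports System

Module ConcatDemo
    ' Hypothetical helper: returns a new string; the argument is not modified.
    Function AppendThere(ByVal original As String) As String
        Return String.Concat(original, " there")
    End Function

    Sub Main()
        Dim original As String = "Hello"
        Dim combined As String = AppendThere(original)

        Console.WriteLine(original)              ' still "Hello"
        Console.WriteLine(combined)              ' "Hello there"
        ' Is compares references: the two variables point at different objects.
        Console.WriteLine(combined Is original)  ' False
    End Sub
End Module
```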

A Whole New String Game
Fortunately, the .NET Framework includes a class called StringBuilder, which resides in the System.Text namespace. Objects created from the StringBuilder class offer many of the string-manipulation methods a normal String object has, but with a big difference: These methods (for the most part) don't return a new string, but instead modify the underlying data. Here's what the .NET Framework help (installed with VS.NET beta 1) says about StringBuilder:

[StringBuilder] is convenient for situations in which it is desirable to modify a string, perhaps by removing, replacing, or inserting characters, without creating a new string subsequent to each modification. The methods contained within this class do not return a new StringBuilder object unless specified otherwise.

So you should create and manipulate your strings as StringBuilder objects, then convert them to a string using the built-in ToString method if you need a String object.

The StringBuilder resembles a String object, but it's not. Here's the same syntax using the StringBuilder class:

Dim MyString As New System.Text.StringBuilder()
MyString.Append("Hello")
MyString.Append(" there")

Note that you can't simply do this:

MyString = "Hello"

This code tries to assign a String object to a StringBuilder variable, which VB.NET does not allow because no conversion exists between the two types. This is just one example of how strongly typed VB.NET is. Placing Option Strict Off at the top of your modules relaxes some implicit conversions (see the help file for details), but it doesn't make this assignment legal.
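If you want a StringBuilder that starts out with some text, one option is to pass the text to the constructor and convert back with ToString when you're done. The following is only a sketch: the module name BuilderInitDemo and the helper BuildGreeting are invented for this example, and the Insert and Remove calls are there just to show a few of the in-place methods.

```vbnet
Imports System
Imports System.Text

Module BuilderInitDemo
    ' Hypothetical helper: edits text in a StringBuilder, then returns a String.
    Function BuildGreeting() As String
        ' Seed the builder through its constructor instead of assigning a String to it.
        Dim sb As New StringBuilder("Hello")
        sb.Append(" there")            ' "Hello there"
        sb.Replace("there", "world")   ' "Hello world"
        sb.Insert(0, ">> ")            ' ">> Hello world"
        sb.Remove(0, 3)                ' back to "Hello world"
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BuildGreeting())   ' Hello world
    End Sub
End Module
```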

The difference between using the String and StringBuilder classes: StringBuilder's Append method works on the same data in memory and doesn't make a copy. Again, when you access large strings—such as XML recordsets, encoded files, or encrypted files—copying strings could present a performance problem. Let's see exactly how these two classes compare with a little benchmark program.

 
Figure 1 | How Fast is Fast?

The sample project, StringReplace, creates a long string of numbers separated by tab characters and replaces all the tabs with spaces (see Figure 1). The app times the string creation and replacement processes for both the String class (see Listing 1) and StringBuilder class (see Listing 2). Re-create the VB.NET project by creating a new Windows Application and adding instructional labels if desired (they're not required). Add a textbox to the form, name it txtCount, and set the Text property to 10000. Add two buttons to the form: Name the first button btnStringBuilder and set its Text property to StringBuilder; name the second button btnString and set its Text property to String. Add a label next to btnStringBuilder and name it lblCreate (Create Time); add a label next to btnString and name it lblModify (Modify Time). Type the code from Listing 1 and Listing 2 into btnString's and btnStringBuilder's Click event handlers, respectively.

You might notice that controls placed on the form through cut-and-paste have their Visible property set to False by default, rendering them invisible. This is a bug in beta 1.

Go From Zero to String in Seconds
Before I describe the code, just run it by pressing the String button. On my machine, it takes 5.087 seconds to create the string, and 0 seconds to replace the tabs with spaces. Now press the StringBuilder button. On my machine, it takes 0.170 seconds to create, and 0 seconds to replace. That's almost 30 times faster! Now that's a performance gain.

So, let me tell you what's going on in this code. Note this line (see Listing 1):

Dim Interval As Microsoft.VisualBasic.DateInterval

I could've said this just as easily:

Dim Interval As DateInterval 

Microsoft.VisualBasic is already in scope, and DateInterval belongs to Microsoft.VisualBasic, so you don't have to prefix DateInterval with Microsoft.VisualBasic. But I want you to see that you can't just pull objects out of thin air—they have to be in scope. I'll talk more about this in another article.

The first line of code after the Dim statements sets the value of the Interval variable:

Interval = DateInterval.Second

DateInterval.Second is a member of the DateInterval enumeration, which you reach by qualifying it with the enumeration's name. That's a little different from what you're used to, but it'll get easier to understand the more code you write.

Next, set the Max variable to the value the user specifies in the txtCount control:

   Max = CInt(txtCount.Text)

Max is an Integer, and an Integer in VB.NET is equivalent to a Long in VB6: 32 bits, or 4 bytes. In VB.NET, a Long is a 64-bit integer value. Again, notice that the code converts the text to an Integer explicitly with CInt.
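The ranges make the size difference concrete, and they also show why a conversion from text can fail. Here's a small sketch; the module IntegerDemo and the helper ParseCount (with its fallback parameter) are assumptions for illustration and don't appear in the listings.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module IntegerDemo
    ' Hypothetical helper: converts text to an Integer, returning a fallback
    ' when the text is not numeric or does not fit in 32 bits.
    Function ParseCount(ByVal rawText As String, ByVal fallback As Integer) As Integer
        Try
            Return CInt(rawText)
        Catch ex As InvalidCastException   ' for example "abc"
            Return fallback
        Catch ex As OverflowException      ' for example "99999999999"
            Return fallback
        End Try
    End Function

    Sub Main()
        Console.WriteLine(Integer.MaxValue)        ' 2147483647 (32 bits)
        Console.WriteLine(Long.MaxValue)           ' 9223372036854775807 (64 bits)
        Console.WriteLine(ParseCount("10000", 0))  ' 10000
        Console.WriteLine(ParseCount("abc", -1))   ' -1
    End Sub
End Module
```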

The next line of code sets StartTime to the current time by reading the Date type's shared Now property. You don't need to create a Date instance with New to read it:

   StartTime = Date.Now

VB.NET has a related feature that's great: You can assign a value to a variable in the same statement that declares it.

The next block of code goes through a loop, adding the number (i) to the string each time as it increments i:

   For i = 1 To Max
      AddThis = CStr(i) & Chr(9)
      MyString = String.Concat(MyString, AddThis)
   Next

Each time through the loop, the variable AddThis contains the number in string form followed by a tab character (ASCII value of 9). Then the Concat method joins MyString, a String variable, and AddThis into a new string, which is assigned back to MyString. In VB6, the loop would look like this:

   For i = 1 To Max
      AddThis = CStr(i) & Chr(9)
      MyString = MyString & AddThis
   Next

Next, the code calculates the difference between the current time and start time using the DateDiff function and displays it in the label named lblCreate:

   lblCreate.Text = CStr(DateDiff(Interval, _
      StartTime, Date.Now))

DateDiff takes an interval, which you've initialized to "seconds," and two dates. It returns the difference. Make sure you pass the later time last, or it will return a negative number.
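The argument order is easy to check with two fixed times. The sketch below assumes the released Microsoft.VisualBasic runtime, where, as far as I know, DateDiff returns a whole number (a Long) and drops any fraction of the interval; beta 1 may behave differently. The module TimingDemo and the helper WholeSeconds are made up for this example, and the TimeSpan line is just one alternative for keeping fractions.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module TimingDemo
    ' Hypothetical helper: whole seconds from earlier to later.
    Function WholeSeconds(ByVal earlier As Date, ByVal later As Date) As Long
        Return DateDiff(DateInterval.Second, earlier, later)
    End Function

    Sub Main()
        ' Fixed times so the result does not depend on the machine.
        Dim startTime As Date = #7/1/2001 12:00:00 PM#
        Dim endTime As Date = startTime.AddMilliseconds(5087)

        ' Earlier time first, later time last: a positive number.
        Console.WriteLine(WholeSeconds(startTime, endTime))   ' 5
        ' Arguments swapped: the sign flips.
        Console.WriteLine(WholeSeconds(endTime, startTime))   ' -5

        ' A TimeSpan keeps the fraction of a second (roughly 5087 ms here).
        Dim elapsed As TimeSpan = endTime.Subtract(startTime)
        Console.WriteLine(elapsed.TotalMilliseconds)
    End Sub
End Module
```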

Use the same timing technique for the next section, which replaces each tab in the string with a space:

MyString = MyString.Replace(Chr(9), _ 
   Chr(32))

The resulting time is displayed in the lblModify label.

Now look at the same process, but use the StringBuilder class instead of the String class (see Listing 2). For time's sake (yours as well as mine), I'll tell you only what's different. First note the creation of the StringBuilder object:

   Dim SB As New _
      System.Text.StringBuilder()

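The StringBuilder class belongs to the System.Text namespace, so you must either fully qualify the class name, as shown here, or add Imports System.Text at the top of the file and refer to it simply as StringBuilder.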

Here's the code that appends the string with each number in a loop:

   For i = 1 To Max
      AddThis = CStr(i) & Chr(9)
      SB.Append(AddThis)
   Next

The StringBuilder class has an Append method that adds a string to the end of the existing string. Note that you don't have to say this:

   SB = SB.Append(AddThis)

You could, but because Append modifies the existing StringBuilder and returns a reference to that same object rather than a new one, the assignment would be pointless. It's the same with the replace code:

SB.Replace(Chr(9), Chr(32))

You don't have to assign the result to anything because StringBuilder works on a single instance of the string data. Cool!
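Because Append and Replace hand back the builder they were called on, you can also chain the calls if you like that style. This is a sketch rather than code from the listings; ChainDemo and BuildAndReplace are hypothetical names.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module ChainDemo
    ' Hypothetical helper: builds "1<tab>2<tab>3<tab>" and swaps tabs for spaces.
    Function BuildAndReplace() As String
        Dim sb As New StringBuilder()
        ' Each call returns the same builder, so the calls can be chained.
        sb.Append("1").Append(Chr(9)).Append("2").Append(Chr(9)).Append("3").Append(Chr(9))

        Dim returned As StringBuilder = sb.Replace(Chr(9), Chr(32))
        Console.WriteLine(returned Is sb)   ' True: no new object was created
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine("[" & BuildAndReplace() & "]")   ' [1 2 3 ]
    End Sub
End Module
```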

Finally, to return a real String object, call the ToString method:

MyString = SB.ToString

You must do this if you want to access the string that StringBuilder creates.
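If you'd rather try the comparison without building the form, a console version of the same idea could look like the sketch below. It is not Listing 1 or Listing 2: the module and function names are assumptions, and it times each run by subtracting two Date values to get a TimeSpan instead of calling DateDiff. Expect the numbers to vary from machine to machine.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module StringReplaceConsole
    ' Builds "1<tab>2<tab>...max<tab>" with String.Concat, then replaces the tabs.
    Function BuildWithString(ByVal max As Integer) As String
        Dim result As String = ""
        Dim i As Integer
        For i = 1 To max
            ' Every pass allocates a brand-new, longer string.
            result = String.Concat(result, CStr(i) & Chr(9))
        Next
        Return result.Replace(Chr(9), Chr(32))
    End Function

    ' Same output, but the text is built and edited in one StringBuilder.
    Function BuildWithStringBuilder(ByVal max As Integer) As String
        Dim sb As New StringBuilder()
        Dim i As Integer
        For i = 1 To max
            sb.Append(CStr(i) & Chr(9))
        Next
        sb.Replace(Chr(9), Chr(32))
        Return sb.ToString()
    End Function

    Sub Main()
        Dim max As Integer = 10000

        Dim startTime As Date = Date.Now
        Dim viaString As String = BuildWithString(max)
        Dim stringTime As TimeSpan = Date.Now.Subtract(startTime)

        startTime = Date.Now
        Dim viaBuilder As String = BuildWithStringBuilder(max)
        Dim builderTime As TimeSpan = Date.Now.Subtract(startTime)

        Console.WriteLine("String:        " & CStr(stringTime.TotalSeconds) & " s")
        Console.WriteLine("StringBuilder: " & CStr(builderTime.TotalSeconds) & " s")
        Console.WriteLine("Same result:   " & CStr(viaString = viaBuilder))
    End Sub
End Module
```

Both functions should produce identical text, so the last line is a quick sanity check that the two approaches differ only in how they get there.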

Carl Franklin is a VB.NET specialist, developer, and trainer. He is the MSDN regional director for Hartford, Conn., a trainer with Deep Training (www.deeptraining.com), and an author for John Wiley & Sons. His company, franklins.NET (www.franklins.net), provides .NET Web hosting and co-location, training, and mentoring. E-mail Carl at carl@franklins.net.
