Chapter 4 Arrays, Strings, and Pointers, Section 2

06/07/2011

Applies to: Visual Studio 2010

Provided by: Ivor Horton

This topic contains the following sections.

Dynamic Memory Allocation
Using References
Native C++ Library Functions for Strings
Copyright

Dynamic Memory Allocation

Working with a fixed set of variables in a program can be very restrictive. You will often want to decide the amount of space to be allocated for storing different types of variables at execution time, depending on the input data for the program. Any program that involves reading and processing a number of data items that is not known in advance can take advantage of the ability to allocate memory to store the data at run time. For example, if you need to implement a program to store information about the students in a class, the number of students is not fixed, and their names will vary in length, so to deal with the data most efficiently, you’ll want to allocate space dynamically at execution time.

Obviously, because dynamically allocated variables can’t have been defined at compile time, they can’t be named in your source program. When they are created, they are identified by their address in memory, which is contained within a pointer. With the power of pointers, and the dynamic memory management tools in Visual C++ 2010, writing your programs to have this kind of flexibility is quick and easy.

The Free Store, Alias the Heap

In most instances, when your program is executed, there is unused memory in your computer. This unused memory is called the heap in C++, or sometimes the free store. You can allocate space within the free store for a new variable of a given type using a special operator in C++ that returns the address of the space allocated. This operator is new, and it’s complemented by the operator delete, which de-allocates memory previously allocated by new.

You can allocate space in the free store for some variables in one part of a program, and then release the allocated space and return it to the free store after you have finished with it. This makes the memory available for reuse by other dynamically allocated variables, later in the same program. This can be a powerful technique; it enables you to use memory very efficiently, and in many cases, it results in programs that can handle much larger problems, involving considerably more data than otherwise might be possible.

The new and delete Operators

Suppose that you need space for a double variable. You can define a pointer to type double and then request that the memory be allocated at execution time. You can do this using the operator new with the following statements:

double* pvalue(nullptr);
pvalue = new double;      // Request memory for a double variable

This is a good moment to recall that all pointers should be initialized. Using memory dynamically typically involves a number of pointers floating around, so it’s important that they should not contain spurious values. You should try to arrange for a pointer not containing a legal address value to be set to nullptr.

The new operator in the second line of code above should return the address of the memory in the free store allocated to a double variable, and this address is stored in the pointer pvalue. You can then use this pointer to reference the variable using the indirection operator, as you have seen. For example:

*pvalue = 9999.0;

Of course, the memory may not have been allocated because the free store had been used up, or because the free store is fragmented by previous usage, meaning that there isn’t a sufficient number of contiguous bytes to accommodate the variable for which you want to obtain space. You don’t have to worry too much about this, however. The new operator will throw an exception if the memory cannot be allocated for any reason, and unless your code catches the exception, this terminates your program. Exceptions are a mechanism for signaling errors in C++; you learn about these in Chapter 6.
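If you would rather test for failure than deal with an exception, standard C++ also offers a non-throwing form of new. The following is only a minimal sketch: the function name store_and_release is hypothetical and not part of the book's examples, and it assumes a compiler that supports nullptr.

```cpp
#include <new>        // For std::nothrow

// Hypothetical helper, not from the text: stores 'value' in a double
// allocated in the free store, reads it back, and releases the memory.
// Returns false if the allocation failed.
bool store_and_release(double value)
{
   double* pvalue(nullptr);
   pvalue = new (std::nothrow) double;   // Yields nullptr on failure, no exception
   if(pvalue == nullptr)
      return false;                      // Nothing was allocated, nothing to release

   *pvalue = value;                      // Use the memory through the pointer
   bool stored = (*pvalue == value);
   delete pvalue;                        // Return the memory to the free store
   pvalue = nullptr;
   return stored;
}
```

Calling store_and_release(9999.0) should return true whenever the free store can supply space for one double.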

You can also initialize a variable created by new. Taking the example of the double variable that was allocated by new and the address stored in pvalue, you could have set the value to 999.0 as it was created, with this statement:

pvalue = new double(999.0);   // Allocate a double and initialize it

Of course, you could create the pointer and initialize it in a single statement, like this:

double* pvalue(new double(999.0));

When you no longer need a variable that has been dynamically allocated, you can free up the memory that it occupies in the free store with the delete operator:

delete pvalue;                // Release memory pointed to by pvalue

This ensures that the memory can be used subsequently by another variable. If you don’t use delete, and subsequently store a different address value in the pointer pvalue, it will be impossible to free up the memory, or to use the variable that it contains, because access to the address is lost. This situation is referred to as a memory leak, and it is particularly damaging when it recurs while your program is running.

Allocating Memory Dynamically for Arrays

Allocating memory for an array dynamically is very straightforward. If you wanted to allocate an array of type char, assuming pstr is a pointer to char, you could write the following statement:

pstr = new char[20];     // Allocate a string of twenty characters

This allocates space for a char array of 20 characters and stores its address in pstr.

To remove the array that you have just created in the free store, you must use the delete operator. The statement would look like this:

delete [] pstr;          // Delete array pointed to by pstr

Note the use of square brackets to indicate that what you are deleting is an array. When removing arrays from the free store, you should always include the square brackets, or the results will be unpredictable. Note also that you do not specify any dimensions here, simply [].

Of course, the pstr pointer now contains the address of memory that may already have been allocated for some other purpose, so it certainly should not be used. When you use the delete operator to discard some memory that you previously allocated, you should always reset the pointer, like this:

pstr = nullptr;

This ensures that you do not attempt to access the memory that has been deleted.
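The complete allocate, use, delete, and reset sequence can be sketched in one place. The function sum_of_squares below is a hypothetical illustration rather than one of the book's examples; it uses a temporary array of long values in the free store.

```cpp
// Hypothetical helper, not from the text: computes 1*1 + 2*2 + ... + n*n
// using a temporary array allocated in the free store.
long sum_of_squares(int n)
{
   if(n <= 0)
      return 0L;                         // Nothing to allocate

   long* pvalues(nullptr);
   pvalues = new long[n];                // Array size decided at run time

   for(int i = 0; i < n; i++)
      *(pvalues + i) = static_cast<long>(i + 1) * (i + 1);

   long total(0L);
   for(int i = 0; i < n; i++)
      total += *(pvalues + i);

   delete [] pvalues;                    // Square brackets because it is an array
   pvalues = nullptr;                    // The pointer no longer holds a valid address
   return total;
}
```

For example, sum_of_squares(4) adds 1, 4, 9, and 16.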

Try it Out: Using Free Store

You can see how dynamic memory allocation works in practice by rewriting the program that calculates an arbitrary number of primes, this time using memory in the free store to store the primes.

// Ex4_11.cpp
// Calculating primes using dynamic memory allocation
#include <iostream>
#include <iomanip>
using std::cin;
using std::cout;
using std::endl;
using std::setw;
        
int main()
{
   long* pprime(nullptr);         // Pointer to prime array

   long trial(5);                 // Candidate prime
   int count(3);                  // Count of primes found
   int found(0);                  // Indicates when a prime is found
   int max(0);                    // Number of primes required
        
   cout << endl
        << "Enter the number of primes you would like (at least 4): ";
   cin >> max;                    // Number of primes required
        
   if(max < 4)                    // Test the user input, if less than 4
      max = 4;                    // ensure it is at least 4
        
   pprime = new long[max];
        
   *pprime = 2;                   // Insert three
   *(pprime + 1) = 3;             // seed primes
   *(pprime + 2) = 5;
        
   do
   {
      trial += 2;                            // Next value for checking
      found = 0;                             // Set found indicator
        
      for(int i = 0; i < count; i++)         // Division by existing primes
      {
         found =(trial % *(pprime + i)) == 0;// True for exact division
         if(found)                           // If division is exact
            break;                           // it's not a prime
      }
        
      if (found == 0)                  // We got one...
         *(pprime + count++) = trial;  // ...so save it in primes array
   } while(count < max);
        
   // Output primes 5 to a line
   for(int i = 0; i < max; i++)
   {
      if(i % 5 == 0)                   // New line before the 1st, and every 5th, value
         cout << endl;
      cout << setw(10) << *(pprime + i);
   }
        
   delete [] pprime;                         // Free up memory
   pprime = nullptr;                         // and reset the pointer
   cout << endl;
   return 0;
}

Here’s an example of the output from this program:

Enter the number of primes you would like (at least 4): 20
         2         3         5         7        11
        13        17        19        23        29
        31        37        41        43        47
        53        59        61        67        71

Tip

In fact, the program is similar to the previous version. After receiving the number of primes required in the int variable max, you allocate an array of that size in the free store using the operator new. Note that you have made sure that max can be no less than 4. This is because the program requires space to be allocated in the free store for at least the three seed primes, plus one new one. You specify the size of the array that is required by putting the variable max between the square brackets following the array type specification:

   pprime = new long[max];

You store the address of the memory area that is allocated by new in the pointer pprime. The program would terminate at this point if the memory could not be allocated.

After the memory that stores the prime values has been successfully allocated, the first three array elements are set to the values of the first three primes:

   *pprime = 2;                   // Insert three
   *(pprime + 1) = 3;             // seed primes
   *(pprime + 2) = 5;

You are using the dereference operator to access the first three elements of the array. As you saw earlier, the parentheses in the second and third statements are there because the precedence of the * operator is higher than that of the + operator.

In Visual C++ 2010, you can’t specify a list of initial values for the elements of an array that you allocate dynamically. You have to use explicit assignment statements if you want to set particular initial values for elements of the array.
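One related facility that standard C++ does provide is value-initialization: an empty pair of parentheses after the array size sets every element of a fundamental type, such as long, to zero. The sketch below assumes nothing beyond that rule; the function name make_zeroed_array is hypothetical.

```cpp
#include <cstddef>    // For std::size_t

// Hypothetical helper, not from the text: allocates 'count' long values in
// the free store, all set to zero. The caller must release the array with
// delete [].
long* make_zeroed_array(std::size_t count)
{
   return new long[count]();   // The empty parentheses request zero initialization
}
```

You would still assign the seed primes explicitly, but any elements you have not yet filled would then hold zero rather than an arbitrary value.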

The calculation of the prime numbers is exactly as before; the only change is that the name of the pointer you have here, pprime, is substituted for the array name, primes, that you used in the previous version. Equally, the output process is the same. Acquiring space dynamically is really not a problem at all. After it has been allocated, it in no way affects how the computation is written.

After you finish with the array, you remove it from the free store using the delete operator, remembering to include the square brackets to indicate that it is an array you are deleting.

   delete [] pprime;             // Free up memory

Although it’s not essential here, you also reset the pointer:

   pprime = nullptr;            // and reset the pointer

All memory allocated in the free store is released when your program ends, but it is good to get into the habit of resetting pointers to nullptr when they no longer point to valid memory areas.

Dynamic Allocation of Multidimensional Arrays

Allocating memory in the free store for a multidimensional array involves using the new operator in a slightly more complicated form than is used for a one-dimensional array. Assuming that you have already declared the pointer pbeans appropriately, to obtain the space for the array beans[3][4] that you used earlier in this chapter, you could write this:

pbeans = new double [3][4];         // Allocate memory for a 3x4 array

You just specify both array dimensions between square brackets after the type name for the array elements.

Allocating space for a three-dimensional array simply requires that you specify the extra dimension with new, as in this example:

pBigArray = new double [5][10][10]; // Allocate memory for a 5x10x10 array

However many dimensions there are in the array that has been created, to destroy it and release the memory back to the free store, you write the following:

delete [] pBigArray;                // Release memory for array
pBigArray = nullptr;

You always use just one pair of square brackets following the delete operator, regardless of the dimensionality of the array with which you are working.

You have already seen that you can use a variable as the specification of the dimension of a one-dimensional array to be allocated by new. This extends to two or more dimensions, but with the restriction that only the leftmost dimension may be specified by a variable. All the other dimensions must be constants or constant expressions. So, you could write this:

pBigArray = new double[max][10][10];

where max is a variable; however, specifying a variable for any dimension other than the leftmost causes an error message to be generated by the compiler.
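The text assumes that the pointer has been declared appropriately. For an array whose rows each contain four double values, an appropriate pointer type is "pointer to an array of four double". The sketch below shows one way this might look; the function name total_of_grid and the values stored are hypothetical.

```cpp
// Hypothetical helper, not from the text: allocates a rows x 4 array of
// double in the free store, fills it with 0, 1, 2, ... and returns the sum
// of the elements.
double total_of_grid(int rows)
{
   if(rows <= 0)
      return 0.0;

   double (*pgrid)[4] = nullptr;         // Pointer to an array of four double
   pgrid = new double[rows][4];          // Only the leftmost dimension is a variable

   for(int i = 0; i < rows; i++)
      for(int j = 0; j < 4; j++)
         pgrid[i][j] = i*4 + j;

   double total(0.0);
   for(int i = 0; i < rows; i++)
      for(int j = 0; j < 4; j++)
         total += pgrid[i][j];

   delete [] pgrid;                      // One pair of brackets, whatever the dimensions
   pgrid = nullptr;
   return total;
}
```

With rows set to 3, the array holds the values 0 through 11.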

Using References

A reference appears to be similar to a pointer in many respects, which is why I’m introducing it here, but it really isn’t the same thing at all. The real importance of references becomes apparent only when you get to explore their use with functions, particularly in the context of object-oriented programming. Don’t be misled by their simplicity and what might seem to be a trivial concept. As you will see later, references provide some extraordinarily powerful facilities, and in some contexts enable you to achieve results that would be impossible without them.

What Is a Reference?

There are two kinds of references: lvalue references and rvalue references. Essentially, a reference is a name that can be used as an alias for something else.

An lvalue reference is an alias for another variable; it is called an lvalue reference because it refers to a persistent storage location that can appear on the left of an assignment operation. Because an lvalue reference is an alias and not a pointer, the variable for which it is an alias has to be specified when the reference is declared; unlike a pointer, a reference cannot be altered to represent another variable.

An rvalue reference is also an alias, but it differs from an lvalue reference in that it references an rvalue, which is a temporary value that is essentially transient.

Declaring and Initializing Lvalue References

Suppose that you have declared a variable as follows:

long number(0L);

You can declare an lvalue reference for this variable using the following declaration statement:

long& rnumber(number);      // Declare a reference to variable number

The ampersand following the type name long and preceding the variable name rnumber indicates that an lvalue reference is being declared, and the variable it represents, number, is specified as the initializing value between the parentheses; therefore, the variable rnumber is of type ‘reference to long’. You can now use the reference in place of the original variable name. For example, this statement,

rnumber += 10L;

has the effect of incrementing the variable number by 10.

Note that you cannot write:

int& refData = 5;            // Will not compile!

The literal 5 is constant and cannot be changed. To protect the integrity of constant values, you must use a const reference:

const int & refData = 5;     // OK

Now you can access the literal 5 through the refData reference. Because you declare refData as const, it cannot be used to change the value it references.

Let’s contrast the lvalue reference rnumber defined above with the pointer pnumber, declared in this statement:

long* pnumber(&number);       // Initialize a pointer with an address

This declares the pointer pnumber, and initializes it with the address of the variable number. This then allows the variable number to be incremented with a statement such as:

*pnumber += 10L;               // Increment number through a pointer

There is a significant distinction between using a pointer and using a reference. The pointer needs to be dereferenced, and whatever address it contains is used to access the variable to participate in the expression. With a reference, there is no need for dereferencing. In some ways, a reference is like a pointer that has already been dereferenced, although it can’t be changed to reference something else. An lvalue reference is the complete equivalent of the variable for which it is a reference.
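The contrast can be seen by using both mechanisms on the same variable. This is a minimal sketch built from the declarations above; the enclosing function increment_both_ways is hypothetical.

```cpp
// Hypothetical helper, not from the text: increments the same variable once
// through a reference and once through a pointer, and returns the result.
long increment_both_ways()
{
   long number(0L);
   long& rnumber(number);        // Alias for number
   long* pnumber(&number);       // Holds the address of number

   rnumber += 10L;               // No dereferencing needed
   *pnumber += 10L;              // Explicit dereference required

   if(&rnumber != pnumber)       // The address of the reference is the address of number
      return -1L;
   return number;                // Both increments were applied to number
}
```

Because rnumber is just another name for number, taking its address yields the same address that pnumber contains.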

Defining and Initializing Rvalue References

You specify an rvalue reference type using two ampersands following the type name. Here’s an example:

int x(5);
int&& rExpr = 2*x + 3;

The first statement defines the variable x with the initial value 5. The second defines an rvalue reference, rExpr, that is initialized to reference the result of evaluating the expression 2*x+3, which is a temporary value, that is, an rvalue. You cannot do this with a non-const lvalue reference. The reverse restriction also applies: a compiler that conforms to the C++11 standard rejects the statement int&& rx = x; because an rvalue reference cannot be bound directly to an lvalue such as x. Is an rvalue reference to a temporary useful? In this case, no; but in a different context, it is very useful.
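To make the behavior of rExpr concrete, here is a small sketch. It relies on the standard rule that binding a temporary to a reference keeps the temporary alive for as long as the reference exists; the function name use_rvalue_reference and the variable rcExpr are hypothetical.

```cpp
// Hypothetical helper, not from the text: binds references to the temporary
// result of an expression and uses them like ordinary variables.
int use_rvalue_reference()
{
   int x(5);
   int&& rExpr = 2*x + 3;         // Refers to a temporary holding 13
   rExpr += 1;                    // The temporary can be modified through rExpr

   const int& rcExpr = 2*x + 3;   // A const lvalue reference can also bind to
                                  // a temporary, but cannot modify it
   return rExpr + rcExpr;         // 14 + 13
}
```

The difference between the two references is that rExpr allows the temporary to be changed, whereas rcExpr is read-only.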

While the code fragments relating to references illustrate how lvalue and rvalue reference variables can be defined and initialized, this is not how they are typically used. The primary application for both types of references is in defining functions where they can be of immense value; you’ll learn more about this later in the book, starting in Chapter 5.

Native C++ Library Functions for Strings

The standard library provides the cstring header that contains functions that operate on null-terminated strings. These are a set of functions that are specified by the C++ standard. There are also alternatives to some of these functions that are not standard, but which provide a more secure implementation of the function than the original versions. In general, I’ll mention both where they exist in the cstring header, but I’ll use the more secure versions in examples. Let’s explore some of the most useful functions provided by the cstring header.

Note

The string standard header for native C++ defines the string and wstring classes that represent character strings. The string class represents strings of characters of type char and the wstring class represents strings of characters of type wchar_t. Both are defined in the string header as template classes that are instances of the basic_string<T> class template. A class template is a parameterized class (with parameter T in this case) that you can use to create new classes to handle different types of data. I won’t be discussing templates and the string and wstring classes until Chapter 8, but I thought I’d mention them here because they have some features in common with the functions provided by the String type that you’ll be using in C++/CLI programs later in this chapter. If you are really interested to see how they compare, you could always have a quick look at the section in Chapter 8 that has the same title as this section. It should be reasonably easy to follow at this point, even without knowledge of templates and classes.

Finding the Length of a Null-Terminated String

The strlen() function returns the length of the argument string of type char* as a value of type size_t. The type size_t is an implementation-defined type that corresponds to an unsigned integer type that is used generally to represent the lengths of sequences of various kinds. The wcslen() function does the same thing for strings of type wchar_t*.

Here’s how you use the strlen() function:

const char* str("A miss is as good as a mile.");
cout << "The string contains " <<  strlen(str) << " characters." << endl;

The output produced when this fragment executes is:

The string contains 28 characters.

As you can see from the output, the length value that is returned does not include the terminating null. It is important to keep this in mind, especially when you are using the length of one string to create another string of the same length.
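The point about the terminating null matters as soon as you allocate space for a copy of a string: you need one more element than strlen() reports. The following sketch illustrates this; the function name duplicate_string is hypothetical.

```cpp
#include <cstring>    // For std::strlen, std::memcpy

// Hypothetical helper, not from the text: returns a copy of 'source' in the
// free store. The caller must release the copy with delete [].
char* duplicate_string(const char* source)
{
   std::size_t length = std::strlen(source);   // Does not count the null
   char* copy = new char[length + 1];          // One extra element for the null
   std::memcpy(copy, source, length + 1);      // Copy the characters and the null
   return copy;
}
```

Allocating only length elements would leave no room for the terminating null, so any function that later searched for the null would run past the end of the array.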

Both strlen() and wcslen() find the length by looking for the null at the end. If there isn’t one, the functions will happily continue beyond the end of the string, checking throughout memory in the hope of finding a null. For this reason, these functions represent a security risk when you are working with data from an untrusted external source. In this situation you can use the strnlen() and wcsnlen() functions, both of which require a second argument that specifies the length of the buffer in which the string specified by the first argument is stored.
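The strnlen() function is not part of the C++ standard, so it may not be available with every compiler. As an illustration of what a bounded length check does, here is a sketch that uses only the standard memchr() function; the name bounded_length is hypothetical, and the sketch is not the library's implementation.

```cpp
#include <cstring>    // For std::memchr

// Hypothetical helper, not from the text: returns the length of the string
// in 'buffer', or 'bufferSize' if no terminating null is found within the
// first 'bufferSize' characters.
std::size_t bounded_length(const char* buffer, std::size_t bufferSize)
{
   const void* pnull = std::memchr(buffer, '\0', bufferSize);
   if(!pnull)
      return bufferSize;                 // No null inside the buffer
   return static_cast<std::size_t>(static_cast<const char*>(pnull) - buffer);
}
```

A return value equal to the buffer size signals that the data is not a properly terminated string and should not be passed to functions such as strlen().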

Joining Null-Terminated Strings

The strcat() function concatenates two null-terminated strings. The string specified by the second argument is appended to the string specified by the first argument. Here’s an example of how you might use it:

char str1[30]= "Many hands";
const char* str2(" make light work.");
strcat(str1, str2);
cout << str1 << endl;

Note that the first string is stored in the array str1 of 30 characters, which is far more than the length of the initializing string, “Many hands”. The string specified by the first argument must have sufficient space to accommodate the two strings when they are joined. If it doesn’t, disaster will surely result because the function will then try to overwrite the area beyond the end of the first string.

Figure 4-9

As Figure 4-9 shows, the first character of the string specified by the second argument overwrites the terminating null of the first argument, and all the remaining characters of the second string are copied across, including the terminating null. Thus, the output from the fragment will be:

Many hands make light work.

The strcat() function returns the pointer that is the first argument, so you could combine the last two statements in the fragment above into one:

cout << strcat(str1, str2) << endl;

The wcscat() function concatenates wide-character strings, but otherwise works exactly the same as the strcat() function.

With the strncat() function you can append part of one null-terminated string to another. The first two arguments are the destination and source strings respectively, and the third argument is a count of the number of characters from the source string that are to be appended. With the strings as defined in Figure 4-9, here’s an example of using strncat():

cout << strncat(str1, str2, 11) << endl;

After executing this statement, str1 contains the string “Many hands make light”. The operation appends 11 characters from str2 to str1, overwriting the terminating ‘\0’ in str1, and then appends a final ‘\0’ character. The wcsncat() function provides the same capability as strncat() but for wide-character strings.

All the functions for concatenating strings that I have introduced up to now rely on finding the terminating nulls in the strings to work properly, so they are also insecure when it comes to dealing with untrusted data. The strcat_s(), wcscat_s(), strncat_s(), and wcsncat_s() functions in <cstring> provide secure alternatives. Just to take one example, here’s how you could use strcat_s() to carry out the operation shown in Figure 4-9:

const size_t count = 30;
char str1[count]= "Many hands";
const char* str2(" make light work.");

errno_t error = strcat_s(str1, count, str2);

if(error == 0)
  cout << "Strings joined successfully." << endl;
else if(error == EINVAL)
  cout << "Error! Source or destination string is NULL." << endl;
else if(error == ERANGE)
  cout << "Error! Destination string too small." << endl;

For convenience, I defined the array size as the constant count. The first argument to strcat_s() is the destination string to which the source string specified by the third argument is to be appended. The second argument is the total number of bytes available at the destination. The function returns an integer value of type errno_t to indicate how things went. The error return value will be zero if the operation is successful, EINVAL if the source or destination is nullptr, or ERANGE if the destination length is too small. In the event of an error, you should not rely on the contents of the destination. The error code values EINVAL and ERANGE are defined in the cerrno header, so you need an #include directive for this, as well as for cstring, to compile the fragment above correctly. Of course, you are not obliged to test for the error codes that the function might return, and if you don’t, you won’t need the #include directive for cerrno.
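The secure functions are not available with every compiler, so it can help to see the kind of checking they perform. The following is a rough sketch using only standard functions; the name append_checked is hypothetical, it reports failure with a bool rather than an errno_t code, and it makes no claim to match the behavior of strcat_s() in every detail.

```cpp
#include <cstring>    // For std::memchr, std::strlen, std::memcpy

// Hypothetical helper, not from the text: appends 'source' to 'destination'
// only if the result, including the terminating null, fits in
// 'destinationSize' bytes. On failure the destination is not modified.
bool append_checked(char* destination, std::size_t destinationSize,
                    const char* source)
{
   if(!destination || !source)
      return false;                             // Null pointer argument

   // Find the end of the destination string without leaving the buffer
   const void* pnull = std::memchr(destination, '\0', destinationSize);
   if(!pnull)
      return false;                             // Destination is not terminated

   std::size_t used = static_cast<const char*>(pnull) - destination;
   std::size_t needed = std::strlen(source);
   if(needed >= destinationSize - used)
      return false;                             // No room for source plus null

   std::memcpy(destination + used, source, needed + 1);  // Includes the null
   return true;
}
```

With str1 defined as an array of 30 characters containing "Many hands", appending " make light work." succeeds because the 27 characters plus the null occupy 28 bytes.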

Copying Null-Terminated Strings

The standard library function strcpy() copies a string from a source location to a destination. The first argument is a pointer to the destination location, and the second argument is a pointer to the source string; both arguments are of type char*. The function returns a pointer to the destination string. Here’s an example of how you use it:

const size_t LENGTH = 22;
const char source[LENGTH] ="The more the merrier!";
char destination[LENGTH];
cout << "The destination string is: " << strcpy(destination, source)
     << endl;

The source string and the destination buffer can each accommodate a string containing 21 characters plus the terminating null. You copy the source string to destination in the last statement. The output statement makes use of the fact that the strcpy() function returns a pointer to the destination string, so the output is:

The destination string is: The more the merrier!

You must ensure that the destination string has sufficient space to accommodate the source string. If you don’t, something will get overwritten in memory, and disaster is the likely result.

The strcpy_s() function is a more secure version of strcpy(). It requires an extra argument between the destination and source arguments that specifies the size of the destination string buffer. The strcpy_s() function returns an integer value of type errno_t that indicates whether an error occurred. Here’s how you might use this function:

const size_t LENGTH(22);
const char source[LENGTH] ="The more the merrier!";
char destination[LENGTH];

errno_t error = strcpy_s(destination, LENGTH, source);

if(error == EINVAL)
  cout << "Error. The source or the destination is nullptr." << endl;
else if(error == ERANGE)
  cout << "Error. The destination is too small." << endl;
else
  cout << "The destination string is: " << destination << endl;

You need to include the cstring and cerrno headers for this to compile. The strcpy_s() function verifies that the source and destination are not nullptr and that the destination buffer has sufficient space to accommodate the source string. When either or both the source and destination are nullptr, the function returns the value EINVAL. If the destination buffer is too small, the function returns ERANGE. If the copy is successful, the return value is 0.

You have analogous wide-character versions of these copy functions; these are wcscpy() and wcscpy_s().

Comparing Null-Terminated Strings

The strcmp() function compares two null-terminated strings that you specify by arguments that are pointers of type char*. The function returns a value of type int that is less than zero, zero, or greater than zero, depending on whether the string pointed to by the first argument is less than, equal to, or greater than the string pointed to by the second argument. Here’s an example:

const char* str1("Jill");
const char* str2("Jacko");
int result = strcmp(str1, str2);
if(result < 0)
  cout << str1 << " is less than " << str2 << '.' << endl;
else if(0 == result)
  cout << str1 << " is equal to " << str2 << '.' << endl;
else
  cout << str1 << " is greater than " << str2 << '.' << endl;

This fragment compares the strings str1 and str2, and uses the value returned by strcmp() to execute one of three possible output statements.

Comparing the strings works by comparing the character codes of successive pairs of corresponding characters. The first pair of characters that are different determines whether the first string is less than or greater than the second string. Two strings are equal if they contain the same number of characters, and the corresponding characters are identical. Of course, the output is:

Jill is greater than Jacko.

The wcscmp() function is the wide-character string equivalent of strcmp().
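The comparison rule described above can be written out as a short loop. This is an illustrative sketch only, not the library's implementation; the name compare_strings is hypothetical, and it always returns -1, 0, or 1, whereas strcmp() guarantees only the sign of its result.

```cpp
// Hypothetical helper, not from the text: compares two null-terminated
// strings by the character codes of successive pairs of characters.
// Returns -1, 0, or 1.
int compare_strings(const char* first, const char* second)
{
   // Step past the characters that match, stopping at the end of first
   while(*first != '\0' && *first == *second)
   {
      ++first;
      ++second;
   }

   // The first differing pair decides the result; if one string ended,
   // its terminating null takes part in the comparison
   unsigned char a = static_cast<unsigned char>(*first);
   unsigned char b = static_cast<unsigned char>(*second);
   return (a > b) - (a < b);
}
```

For "Jill" and "Jacko", the second pair of characters, 'i' and 'a', is the first that differs, so the first string is the greater one even though it is shorter.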

Searching Null-Terminated Strings

The strspn() function searches a string for the first character that is not contained in a given set and returns the index of the character found. The first argument is a pointer to the string to be searched, and the second argument is a pointer to a string containing the set of characters. You could search for the first character that is not a vowel like this:

const char* str = "I agree with everything.";
const char* vowels = "aeiouAEIOU ";
size_t index = strspn(str, vowels);
cout << "The first character that is not a vowel is '" << str[index]
     << "' at position " << index << endl;

This searches str for the first character that is not contained in vowels. Note that I included a space in the vowels set, so a space will be ignored so far as the search is concerned. The output from this fragment is:

The first character that is not a vowel is 'g' at position 3

Another way of looking at the value the strspn() function returns is that it is the length of the initial substring of the first argument string that consists entirely of characters in the second argument string. In the example, this substring is the first three characters, “I a”.

The wcsspn() function is the wide-character string equivalent of strspn().
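Because strspn() returns the length of the leading run of characters from the set, adding its result to the string pointer skips that run. A minimal sketch follows; the function name skip_characters is hypothetical.

```cpp
#include <cstring>    // For std::strspn

// Hypothetical helper, not from the text: returns a pointer to the first
// character in 'text' that is not in 'set', or to the terminating null if
// every character is in the set.
const char* skip_characters(const char* text, const char* set)
{
   return text + std::strspn(text, set);
}
```

Applied to the string and vowel set used above, the pointer returned addresses the 'g' in "agree".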

The strstr() function returns a pointer to the position in the first argument of a substring specified by the second argument. Here’s a fragment that shows this in action:

const char* str = "I agree with everything.";
const char* substring = "ever";
const char* psubstr = strstr(str, substring);
        
if(!psubstr)
  cout << "\"" << substring << "\" not found in \"" << str << "\"" << endl;
else
  cout << "The first occurrence of \"" << substring
       << "\" in \"" << str << "\" is at position "
       << psubstr-str << endl;

The third statement calls the strstr() function to search str for the first occurrence of the substring. The function returns a pointer to the position of the substring if it is found, or nullptr when it is not found. The if statement outputs a message, depending on whether or not substring was found in str. The expression psubstr-str gives the index position of the first character in the substring. The output produced by this fragment is:

The first occurrence of "ever" in "I agree with everything." is at position 13

Try it Out: Searching Null-Terminated Strings

This example searches a given string to determine the number of occurrences of a given substring.

// Ex4_12.cpp
// Searching a string
#include <iostream>
#include <cstring>
using std::cout;
using std::endl;
using std::strlen;
using std::strstr;
        
int main()
{
  const char* str("Smith, where Jones had had \"had had\" had had \"had\"."
                         "\n\"Had had\" had had the examiners' approval.");
  const char* word("had");
  cout << "The string to be searched is: "
       << endl << str << endl;

  int count(0);                   // Number of occurrences of word in str
  const char* pstr(str);          // Pointer to search start position
  const char* found(nullptr);     // Pointer to occurrence of word in str
        
  while(true)
  {
    found = strstr(pstr, word);
    if(!found)
      break;
    ++count;
    pstr = found+strlen(word);   // Set next search start as 1 past the word found
  }
        
  cout << "\"" << word << "\" was found "
       << count << " times in the string." << endl;
  return 0;
}

The output from this example is:

The string to be searched is:
Smith, where Jones had had "had had" had had "had".
"Had had" had had the examiners' approval.
"had" was found 10 times in the string.

Tip

All the action takes place in the indefinite while loop:

  while(true)
  {
    found = strstr(pstr, word);
    if(!found)
      break;
    ++count;
    pstr = found+strlen(word);    // Set next search start as 1 past the word found
  }

The first step is to search the string for word starting at position pstr, which initially is the beginning of the string. You store the address that strstr() returns in found; this will be nullptr if word was not found in pstr, so the if statement ends the loop in that case.

If found is not nullptr, you increment the count of the number of occurrences of word, and update the pstr pointer so that it points to one character past the word instance that was found in pstr. This will be the starting point for the search on the next loop iteration.

From the output, you can see that word was found ten times in str. Of course, “Had” doesn’t count because it starts with an uppercase letter.
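The search loop can also be packaged as a function so that it works for any string and any word. The version below is a sketch, and the name count_occurrences is hypothetical. It adds one check that the example does not need: with an empty word, strstr() always succeeds without advancing, so the loop would never end.

```cpp
#include <cstring>    // For std::strlen, std::strstr

// Hypothetical helper, not from the text: counts the non-overlapping
// occurrences of 'word' in 'text'.
int count_occurrences(const char* text, const char* word)
{
   std::size_t wordLength = std::strlen(word);
   if(wordLength == 0)
      return 0;                              // Avoid an endless loop

   int count(0);                             // Number of occurrences found
   const char* pstr(text);                   // Search start position
   const char* found(nullptr);               // Position of an occurrence

   while(true)
   {
      found = std::strstr(pstr, word);
      if(!found)
         break;
      ++count;
      pstr = found + wordLength;             // Continue just past the word found
   }
   return count;
}
```

Because the search resumes after the end of each match, overlapping matches are not counted: searching "aaaa" for "aa" finds two occurrences, not three.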

Copyright

Ivor Horton’s Beginning Visual C++® 2010, Copyright © 2010 by Ivor Horton, ISBN: 978-0-470-50088-0, Published by Wiley Publishing, Inc., All Rights Reserved. Wrox, the Wrox logo, Wrox Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc and/or its affiliates.