First of all, don't review for code style; there are much better things on which to spend your time, than arguing over the placement of white spaces. Looking for bugs and performance issues is the primary goal and is where the majority of your time should be spent (during both the review and the walkthrough). Suggestions on maintenance best practices (covered in the "Maintainability and Robustness" section) can be brought up after the bugs have been discussed.
Memory Leaks
Memory leaks are the obvious scenario to which to pay close attention. Be sure to match up every allocation with a deallocation. Also, be sure to match calls to bracketed new[] with bracketed delete[]. It is also very important to do this at every point in the function in which it returns or can possibly generate an exception; memory leaks are very common during error handling.
VirtualFree Leaks
Watch out for VirtualFree calls that do not specify 0 for the size when also passing in MEM_RELEASE. The size must be 0 when using this form; otherwise, the memory is not freed and you will end up with a leak, even though it looks as if the memory is being deallocated.
NULL vs. throw
char *foo = new char[3];
if(!foo) return;
This code looks good, except for when new doesn't return NULL on error, but instead throws a C++ exception. It is recommended always to use the placement new syntax to enforce a behavior:
char *foo = new(std::nothrow) char[3];
This code tells the compiler to ensure that the new call will return a NULL and not throw an exception. The other solution is to control which version of libc is used at link time, which will define the behavior globally. But this can be very bad if you are using libraries (such as the STL) that require the throwing version of new.
If this code is in a library, it must use the placement syntax explicitly to enforce a behavior, as there is no guarantee which version of libc will be linked in with the final application.
Is alloca NULL?
char *foo = (char*)_alloca(3);
if(!foo) return;
As with the "NULL vs. throw" pattern, this code is wrong, because _alloca does not return NULL on failure. Instead, it throws an OS exception (not a C++ exception). If you must handle out of stack conditions, allocations should be wrapped in an appropriate exception handler. Also, note that (depending on the platform) _resetstkoflw() must be called if an exception is thrown; otherwise, your stack's guard page will be left in a bad state. Depending on your application, it may be better to architect it such that you can guarantee that enough space is available.
Integer Overflow to Buffer Overflow
if(a+b < 10) return;
char *buffer = new char[a+b];
if(buffer)
{
strncpy(buffer, input, a);
buffer[a-1] = '\0';
}
Here, it looks as if the code is doing the right thing. We check to make sure that our sizes are within expected limits. Then, after the allocation, we copy some data into the buffer. Unfortunately, if 'a' is significantly large, we can overflow the addition, so we allocate a buffer fewer than 10 characters, but then we overflow that buffer during the string copy.
When checking sizes for validity, they should be done separately. Also, using the "safe" version of strncpy is a better way to perform the copy (strncpy_s or StringCbCopyN).
During reviews, every time that you see an allocation in which the size goes through a series of mathematical operations, be sure to check for possible integer overflows.
Integer Overflow Inside new
Be aware that there is a multiply operation that happens inside the call to the new operator. Make sure to look for code checks, to ensure that sizeof(type) * number-of-objects does not overflow. The following code produces an overflow to zero (on a 32-bit platform), which is a valid allocation amount, but one that will always lead to a buffer overflow:
int size = 1073741824;
int *buffer = new int[size];
At this point, "buffer" is a valid pointer, but one that points to a zero-sized chunk of memory. Any write attempts to buffer[n] will result in memory corruption.
Note that some compilers (such as Microsoft Visual Studio 2005) will cap the buffer size, preventing an internal overflow of the value.
Array Indexing and Pointer Arithmetic
When you are working with pointers and arrays, it is almost too easy to walk off the end, or even the beginning, of a buffer. When a pointer is being indexed, look at the ranges that are being used, and verify (if possible) against the size of the buffer. You should be wary also of open-ended loops:
for(;;)
for(i=0; ; i++)
while(1)
while(*ptr)
With the last sample, we terminate based on the data itself, and not a buffer size. If the input is untrusted, this can cause a problem if the program is being handed corrupt or malicious data.
For and while loops that do have bounds are often sources of "off-by-one" errors. Check to make sure that the correct usage of "less than/greater than" versus "less than or equal/greater than or equal" is used. Be sure to watch also for integer overflows, as these will lead to reading or writing to the wrong memory address.
IsBadPtr
The general consensus is that the IsBad family of functions (IsBadReadPtr, IsBadWritePtr, and so forth) is broken and should not be used to validate pointers. For one thing, they are not thread-safe, which means that the APIs themselves can cause bugs if the address is in use by another thread at the same time. They only try to find one class of "bad" pointers; these APIs can tell you that everything is fine, when in fact it isn't.
It is not recommended to use this type of parameter validation, as it will often hide bugs. Any code that relies on these APIs for validation probably must be rewritten to be safer in the first place. However, if you must probe for access using a __try / __except pattern, it is better to do it yourself. In this way, you can catch guard-page exceptions and handle them correctly. The IsBad APIs do not do this, and they can leave your stack's guard page in a bad state.
Initializing sizeof(pointer)
When zeroing out a structure via a pointer, it is all too easy to specify the wrong size.
PBITMAPFILEHEADER bmp;
...
ZeroMemory(bmp, sizeof(bmp));
By passing sizeof(bmp) instead of sizeof(*bmp), we zero out only the size of a pointer instead of the struct. This simple bug can be hard to track down, as stale data on the stack can look valid. See also the "Initialized by Disjunct Size" section.
Close
As with memory allocations, handles must be freed when done. Every time that a handle is created, be sure to verify that there is a matching close. Also, be sure to check every place that the code returns or can possibly throw an exception.
Handles that are created by CreateThread must be closed. Note, however, that they can be closed while the thread is still running.
Related to this, you should watch also for double closes. Closing a handle twice can cause bad side effects, if the second close actually freed a valid handle that was created by somebody else.
INVALID_HANDLE_VALUE vs. NULL
Unfortunately, some Win32 APIs return INVALID_HANDLE_VALUE on error, and some return NULL. Be sure to read the documentation for each API that returns a handle, to see which value is used to represent an error. A common mistake is to check for the wrong identifier when looking for failures.
The following table shows a list of typical Win32 functions and what they return on error.
| CreateEvent | NULL |
| CreateFile | INVALID_HANDLE_VALUE |
| CreateMutex | NULL |
| CreateThread | NULL |
| FindFirstFile | INVALID_HANDLE_VALUE |
Critical Sections
With critical sections, you want to pay attention to the scope and the order in which resources are protected. Scope is important, because the resource must be protected for a sufficient amount of time to prevent concurrent accesses, but you also must keep the time that is spent in a lock to a minimum. The order in which locks are obtained and released is very important in preventing dead locks. Be sure to review critical sections carefully, and look for ways that multiple threads can end up blocking each other.
It is also a good idea to verify that a critical section is needed. Other mechanisms, such as Mutexes or the Interlocked APIs, may be better suited.
Unprotected Resources
Be sure to verify that all accesses to shared resources are synchronized. Memory is the most common, but read and writes to files, the network, and so forth may need to be controlled, too.
Freed Resources
A common problem with multithreading comes from using a resource on one thread that has been released by another. During reviews, look for places in which a thread can keep using memory, handles, and so forth, after they have been released.
Orphaned Threads
Another issue for which to watch out is when a program exits (or a DLL gets unloaded) without waiting for all of its worker threads to finish first. Be sure to review the thread-management scheme, and look for places in which the main application does not wait on worker threads to finish.
Single-Threaded CRT
If the application is using threads, be sure to check the project settings, to ensure that it is not using the single-threaded CRT library. Using the non-thread-safe CRT functions in a multithreaded environment can lead to some very difficult-to-diagnose bugs. It is much easier to find and fix this through a review.
Note that this is not an issue on Visual C++ 2005 and Microsoft Windows projects, as the single-threaded CRT no longer exists.
AND OR
When you see an expression that is using logical or bitwise operators, be sure to investigate the intent. It is common to use & accidentally, when && was meant, or to interchange && and ||.
Overloaded AND OR
If a class overloads the && or || operator, be aware that expression short-circuiting rules will be broken. In the following example, if 'string' is a character pointer, strlen will not be called if the pointer is equivalent to NULL:
if(string && strlen(string))
Now, if 'string' is actually a class that happens to overload the && operator, the resulting code will actually look to the compiler as the following:
if(string.operator&&( strlen(string) ))
In this case, strlen(string) is always evaluated, so the semantics of this code have completely changed.
Precedence
People often mistakenly interpret a given operator as having a lesser or greater importance, depending on the context. Unfortunately, this can lead to precedence errors when expressions are evaluated. Most of the time, it is better to have too many parentheses than too few. For example, the following two lines are equivalent:
(x * 2) + 1;
(x << 1) + 1;
However, the following two are not:
This problem is often more frequent in C++, when a class overloads the operators as expectations of operator importance may change. For example, the "streaming" operators that are used with the I/O classes often give users a different impression of the precedence:
The preceding line is incorrect, because the shift operator will be evaluated before the bitwise AND operator.
Sequence Points
In the following code, there is no guarantee which function will be called first:
DoSomething() + DoSomethingElse()
The following statement, however, guarantees that DoSomething will be called first:
DoSomething() , DoSomethingElse()
This is so, because the standard does not define (outside of "sequence points") which substatements/expressions are evaluated first. The following operators and conditions create sequence points:
- && – The left-hand side is fully evaluated before the right-hand side.
- || – The left hand side is fully evaluated before the right-hand side.
- , – The comma operator when used in an expression. It is not a sequence point when used to separate function parameters, which means that function inputs are not evaluated in any specific order.
-
Function call – All inputs are evaluated before entering the function.
- ? : – The expression before the question mark is fully evaluated before any of the conditionals.
- ; – The statement end is a sequence point.
- if, switch, for, while, and do while – The "conditional" portion is fully evaluated before the body.
Note that operator overloading will convert these operators into the "Function call" semantics. Note also that parentheses do not introduce sequence points.
What this boils down to is: Watch out for expressions with side effects (function calls and assignments) that happen between sequence points. If you see these, the code is buggy, and the order of evaluation may change with compiler updates. The following are common examples of code to watch for:
Function inputs are not evaluated in any specific order, so the first parameter in this sample may be zero, or it may be something else.
In the following expression, there is no guarantee that the first value that is read from the stream will be multiplied by 256:
x = ReadFromStream() * 256 + ReadFromStream();
Multiple assignments that use the same variable between sequence points are bad; these should always be avoided:
Resource Use After Release
Watch out for cases in which resources are used after they have been released. This can happen with pretty much any type of resource: memory, handles, critical sections, and so forth. It is good defensive practice to invalidate the association to the resource after it has been released. Doing so will help catch these instances without trying to debug something that fails only a percentage of the time (because the data that is being reused "looks" correct).
Calling Convention
As new public APIs are added to a library, it is not uncommon to forget to state explicitly which calling convention is used. When you review the header file, ensure that all the public functions have an explicit declaration of the calling convention.
Formatting
When an application uses sprintf style formatting, be sure to validate the number of arguments and the types. The following are some of the common issues:
64-Bit Values
When you write out a 64-bit value, you must use the proper type modifier:
This should be the following:
printf("%I64u", val64bits);
Untrusted Inputs
If the input buffer comes from an untrusted source, this can result in a security attack, as the input can contain formatting specifiers that will pull data off the stack. Additionally, it can cause buffer overflows by expanding the resulting string beyond expected limits:
These should always be converted to the following:
NULL Termination
Also, be sure to verify the size that is passed to snprintf. This function does not terminate the output buffer with a NULL character, if the resulting formatted text maxes out the buffer. It should be a habit always to add in a NULL terminator after calling this family of functions:
char output[SPECIAL_NUMBER];
snprintf(output, ARRAYSIZE(output), "%s", input);
output[ARRAYSIZE(output)-1] = '\0';
strncmp
When you compare the beginning of a buffer with a static string, be sure to validate that the passed-in size is correct. Also, watch for hard-coded values or improper uses of sizeof:
char prefix[] = "HELLO";
strncmp(string, prefix, 5);
strncmp(string, prefix, sizeof(prefix));
Using the hard-coded value of 5 is dangerous, as the prefix may change later. The sizeof statement is incorrect, as it will return 6.
Using strlen is often the safest approach, although it may incur a performance loss:
strncmp(string, prefix, strlen(prefix));
Safe String Functions
Moving away from the standard CRT string-handling functions to the safer StringCb functions is a good practice. However, be sure to review the sizes that are passed to these functions. They are no safer than the original versions, if they are lied to.
Switch Fall-Through
Review all switch cases for accidental fall-through conditions. Every place that is explicitly designed to fall through should be commented as such.
Equals TRUE
When dealing with pseudo Booleans (ones represented as integers under the covers), be aware of code that checks equivalence to TRUE:
result = Funct();
if(result == TRUE)
The problem with this is that any nonzero value is "true," so that comparing to a single value is probably not correct. Instead, the code should be written as follows:
if(result != FALSE)
if(result)
Code that uses the C++ bool built-in type does not suffer from this, as the value can be only true or false. When reviewing code, be sure that the source does not try to mix and match between representations (BOOL vs. bool), as they are different types and shouldn't be cross-pollinating.
Also be aware that sizeof(BOOL) is 4 (on 32-bit platforms), whereas sizeof(bool) is always 1.
Equals S_OK
HRESULT result = Funct();
if(result == S_OK)
The preceding code suffers the same problems that were outlined in the previous pattern. Successful HRESULTs can be represented with more than one numerical value. Use the SUCCEEDED or FAILED macro to check for result status on returned HRESULT types.
Return S_FALSE
If a function returns S_FALSE, be sure to trace through the call paths and review how the return values are interpreted. It may not have the same meaning as "the function completed through to success," and so checking only for SUCCEEDED or FAILED may not be the correct thing to do. It may be better to check for S_FALSE explicitly.
Also of note is that S_FALSE does not equal zero and, thus, does not equal the definition for FALSE (or, for that matter, lowercase "false").
__assume and __restrict
These keywords allow the compiler to make some very powerful optimizations. Unfortunately, if you lie to the compiler, it can go and generate some very wrong code. These bugs can be hard to find, as a cursory look through the C source code will not indicate a bug; the algorithm can be coded correctly, but the output is incorrect.
When you see the __restrict keyword being used on input parameters, make sure to review everywhere that the function is called, to ensure that any buffers can never overlap. If they do overlap, any operations that are performed on the memory may be performed with stale and incorrect data.
The __assume keyword can be just as dangerous. When you see the code author making an assumption about the value of a variable, be sure that you can validate and confirm the theory.
If 'counter' can ever be greater than 10, the generated code may or may not do the expected thing. Trying to debug when the "wrong" thing happens can be very difficult. It is also good practice to use compile-time asserts to help validate the assumptions: Everywhere that you use an __assume, you should use an assert.
Signed vs. Unsigned
When creating local variables, especially ones that are used to index through a loop, look for instances in which the developer used a signed integer instead of an unsigned type. Most often than not, the value has no reason to go negative.
Rethrowing Objects by Name
If you catch and then rethrow the caught object, it is improper form to use the name of the object in the throw statement:
catch(MyClass &b)
{
throw b;
}
Doing so will create a temporary of the caught type. If the thrown object is actually that of a derived class, but caught by the type of the base class, you will lose any information that is associated with the derived object.
By using the "throw" keyword by itself, you will no longer rethrow a temporary and, thus, retain the full identity and data of the original object:
catch(MyClass &b)
{
throw;
}