_ReadWriteBarrier

Microsoft Specific

Forces reads and writes to memory to complete at the point of the call.

Caution:

The _ReadBarrier, _WriteBarrier, and _ReadWriteBarrier compiler intrinsics prevent only compiler re-ordering. To prevent the CPU from re-ordering read and write operations, use the MemoryBarrier macro.

void _ReadWriteBarrier(void);

Intrinsic            Architecture
_ReadWriteBarrier    x86, IPF, x64

Header file <intrin.h>

The _ReadBarrier, _WriteBarrier, and _ReadWriteBarrier functions help ensure the correct operation of multithreaded programs that are optimized by the Visual C++ compiler. A correctly optimized program yields the same results when it executes on multiple threads as when it executes on a single thread.

The point in an application where a _ReadBarrier, _WriteBarrier, or _ReadWriteBarrier function executes is called a memory barrier. A memory barrier can be for reads, writes, or both.

Without a memory barrier, an instruction that accesses a variable in memory might be deleted or moved as part of an optimization. Consequently, a thread might read an old value from a global variable before another thread completes writing a new value to the variable, or write a new value before another thread completes reading an old value from the variable.

To help ensure that the optimized program operates correctly, the _ReadWriteBarrier function forces the compiler to complete reads and writes to memory at the point of the call: accesses that precede the call in the source are emitted before it, and accesses that follow it are emitted after it. After the call, other threads can access the memory without fear that the compiler has left a read or write by the calling thread pending. The CPU can still reorder the accesses, so use the MemoryBarrier macro where hardware ordering matters. A memory barrier prevents the compiler from optimizing memory accesses across the barrier, but still allows the compiler to optimize instructions between barriers.
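As a hedged illustration of the placement described above, the following sketch orders a payload write before a flag write. The names COMPILER_BARRIER, g_payload, g_ready, publish, and try_consume are hypothetical. The fallback to the C11 atomic_signal_fence function on non-Microsoft compilers is an assumption made so that the sketch builds elsewhere; like the intrinsic, it constrains only the compiler.

```c
#include <assert.h>

#if defined(_MSC_VER)
#include <intrin.h>
#pragma intrinsic(_ReadWriteBarrier)
#define COMPILER_BARRIER() _ReadWriteBarrier()
#else
/* Assumption: outside Visual C++, a C11 signal fence stands in for the
   intrinsic. It also restricts only compiler reordering. */
#include <stdatomic.h>
#define COMPILER_BARRIER() atomic_signal_fence(memory_order_seq_cst)
#endif

/* Hypothetical shared state: a payload and a flag that announces it. */
int g_payload = 0;
int g_ready = 0;

/* Writer: the barrier keeps the compiler from moving the payload store
   below the flag store. No CPU fence instruction is emitted. */
void publish(int value)
{
    g_payload = value;
    COMPILER_BARRIER();
    g_ready = 1;
}

/* Reader: returns 1 and copies the payload to *out if the flag is set,
   otherwise returns 0. The barrier keeps the compiler from hoisting the
   payload load above the flag check. */
int try_consume(int *out)
{
    assert(out != 0);
    if (!g_ready)
        return 0;
    COMPILER_BARRIER();
    *out = g_payload;
    return 1;
}
```

Calling publish(42) and then try_consume(&v) on the same thread leaves 42 in v. Across cores, this ordering holds only in the generated code, because the CPU may still reorder the accesses.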

Marking memory with a memory barrier is similar to marking memory with the volatile keyword. However, a memory barrier is more efficient because reads and writes are forced to complete only at specific points in the program rather than at every access. The optimizations that can occur between barriers cannot occur if the variable is declared volatile.
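The difference between the two approaches can be sketched as follows. This is an illustration under assumptions, not a measured comparison: the names COMPILER_BARRIER, g_sum_barrier, g_sum_volatile, sum_with_barrier, and sum_with_volatile are hypothetical, and the C11 atomic_signal_fence fallback is an assumption for non-Microsoft compilers. Whether the compiler actually keeps the running total in a register depends on the optimization level.

```c
#include <assert.h>

#if defined(_MSC_VER)
#include <intrin.h>
#pragma intrinsic(_ReadWriteBarrier)
#define COMPILER_BARRIER() _ReadWriteBarrier()
#else
/* Assumption: a C11 signal fence stands in for the intrinsic elsewhere. */
#include <stdatomic.h>
#define COMPILER_BARRIER() atomic_signal_fence(memory_order_seq_cst)
#endif

int g_sum_barrier = 0;           /* ordinary global, fenced once */
volatile int g_sum_volatile = 0; /* every access is a real memory access */

/* Sums 1..n. Between the two ends of the loop the compiler is free to
   keep the running total in a register; the barrier only requires the
   final total to be stored by the time the barrier is reached. */
void sum_with_barrier(int n)
{
    int i;
    assert(n >= 0);
    g_sum_barrier = 0;
    for (i = 1; i <= n; ++i)
        g_sum_barrier += i;
    COMPILER_BARRIER();
}

/* Sums 1..n. Because the total is volatile, each iteration performs a
   load and a store that the compiler may not combine or remove. */
void sum_with_volatile(int n)
{
    int i;
    assert(n >= 0);
    g_sum_volatile = 0;
    for (i = 1; i <= n; ++i)
        g_sum_volatile += i;
}
```

Both functions produce the same total; the difference is only in how many memory accesses the compiler is obliged to emit along the way.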

Note:

In past versions of the Visual C++ compiler, the _ReadWriteBarrier and _WriteBarrier functions were enforced only locally and did not affect functions up the call tree. In Visual C++ 2005 and later, these functions are enforced all the way up the call tree.

Affected Memory

Global variables are affected by memory barriers, but typically local variables are not. Most local variables are not accessible to other threads and therefore do not need protection. A variable is affected by a memory barrier if it satisfies one of the following conditions:

  • The variable is a global variable.

  • The variable is a local variable used in a __try, __except, or __finally block if structured exception handling is used, or a catch block if C++ exception handling is used. For more information, see the /EHa compiler option.

  • The variable is a local variable that is declared volatile.

  • The variable is a local variable whose address escapes the current function in some way. For example, the variable is passed by reference to another function or its address is assigned to a global variable.

  • The variable is accessed indirectly through a pointer and the dereferenced pointer satisfies one of the previous conditions. The most typical case is *p where p is a global variable or parameter.
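The conditions above can be sketched in a single function. This is a minimal illustration, assuming hypothetical names (COMPILER_BARRIER, g_total, g_escaped, touch_memory) and, on non-Microsoft compilers, a C11 atomic_signal_fence fallback that is not part of the documented API. The comments mark which variables the barrier is expected to affect according to the list.

```c
#include <assert.h>
#include <stddef.h>

#if defined(_MSC_VER)
#include <intrin.h>
#pragma intrinsic(_ReadWriteBarrier)
#define COMPILER_BARRIER() _ReadWriteBarrier()
#else
/* Assumption: a C11 signal fence stands in for the intrinsic elsewhere. */
#include <stdatomic.h>
#define COMPILER_BARRIER() atomic_signal_fence(memory_order_seq_cst)
#endif

int g_total = 0;        /* global variable: affected */
int *g_escaped = NULL;  /* global pointer that lets a local's address escape */

int touch_memory(int *p)
{
    int scratch = 1;    /* plain local, address never taken: not affected */
    int shared = 2;     /* local whose address escapes below: affected */

    assert(p != NULL);
    g_escaped = &shared;

    g_total = scratch;  /* store to a global */
    *p = shared;        /* store through a parameter pointer: affected */
    COMPILER_BARRIER(); /* the stores above must be emitted before this point */

    scratch += *g_escaped;  /* shared is read again after the barrier */
    g_total += scratch;
    g_escaped = NULL;       /* do not leave a dangling pointer behind */
    return g_total;
}
```

The local scratch can stay in a register for the whole function, while the accesses to g_total, *p, and shared cannot be moved across the barrier.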

// intrinsics_readwritebarrier.c
// compile with: /O2 -DNO_BARRIER
// This code contains an error (dereferencing a null pointer) in an
// assignment that the optimizer removes as useless.
// Omit -DNO_BARRIER from the command line to activate the barrier.
// With the barrier activated, the assignment is not optimized away
// and causes an access violation.

#include <windows.h> // for EXCEPTION_ACCESS_VIOLATION
#include <excpt.h>
#include <stdio.h>
#include <intrin.h>

#pragma intrinsic(_ReadWriteBarrier)

int x = 0;

__declspec(noinline) int f(int* p)
{
    x = *p;
#ifndef NO_BARRIER
    _ReadWriteBarrier();
#endif
    x = 7;
    return x;
}


// If code is EXCEPTION_ACCESS_VIOLATION it means an
// attempt to read from the NULL pointer we passed in, so
// we handle the exception.
int filter(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
    if (code == EXCEPTION_ACCESS_VIOLATION)
    {
        printf_s("AV\n");
        return EXCEPTION_EXECUTE_HANDLER;
    }

    // If not what we were looking for, we don't want to handle it.
    return EXCEPTION_CONTINUE_SEARCH;
}

int main()
{
    int nRet = 0;

    __try
    {
        // Should return only if the first assignment is
        // optimized away.
        nRet = f(NULL);
        printf_s("Assignment was optimized away!\n");
    }
    __except(filter(GetExceptionCode(), GetExceptionInformation()))
    {
        // We get here if an Access violation occurred.
        printf_s("Access Violation: assignment was not optimized away.\n");
    }
}

Output (compiled with /O2 -DNO_BARRIER):

Assignment was optimized away!

Community Content

Not much use on multi-core machines

_ReadWriteBarrier (and _ReadBarrier and _WriteBarrier) only stop the compiler from rearranging reads and writes. They do not stop the CPU from rearranging them. Therefore on multi-core machines they are not sufficient. On x86 and x64 processors you may be able to rely on the CPU's guarantees about reordering, but even x86/x64 CPUs do some rearranging of reads and writes, and Itanium and other processors do significant reordering.


It should be clearly documented that these are compiler barriers and that they do not prevent CPU reordering of reads and writes.

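volatile can actually be preferable because, in Visual C++ 2005 and later, volatile reads and writes have acquire and release semantics, and the compiler inserts hardware memory barriers where the architecture needs them.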

You can confirm this by compiling code that uses _ReadWriteBarrier and inspecting the generated assembly: it contains no memory barrier instructions. Even x86/x64 CPUs need a memory barrier instruction to implement a full read/write barrier, because they can reorder a store with a later load.

MemoryBarrier is the appropriate intrinsic to prevent CPU reordering of memory accesses.
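
A sketch of the store-then-load case where a compiler-only barrier is not enough follows. The names HARDWARE_BARRIER, g_flag_a, g_flag_b, enter_a, enter_b, leave_a, and leave_b are hypothetical. On Visual C++ the sketch uses the MemoryBarrier macro from windows.h; the fallback to the C11 atomic_thread_fence function is an assumption for other compilers. It is an illustration in the style of this page, not a complete locking protocol.

```c
#include <assert.h>

#if defined(_MSC_VER)
#include <windows.h>    /* MemoryBarrier */
#define HARDWARE_BARRIER() MemoryBarrier()
#else
/* Assumption: outside Visual C++, a sequentially consistent C11 thread
   fence stands in for MemoryBarrier. */
#include <stdatomic.h>
#define HARDWARE_BARRIER() atomic_thread_fence(memory_order_seq_cst)
#endif

/* Hypothetical flags, one per thread. */
volatile int g_flag_a = 0;
volatile int g_flag_b = 0;

/* Thread A: raise its own flag, then check thread B's flag. Returns 1
   if B's flag was still clear. The barrier keeps the store ahead of the
   load in both the compiler and the CPU. */
int enter_a(void)
{
    g_flag_a = 1;
    HARDWARE_BARRIER();
    return g_flag_b == 0;
}

/* Thread B: the mirror image of enter_a. */
int enter_b(void)
{
    g_flag_b = 1;
    HARDWARE_BARRIER();
    return g_flag_a == 0;
}

/* Lower the flags again; each must have been raised first. */
void leave_a(void) { assert(g_flag_a == 1); g_flag_a = 0; }
void leave_b(void) { assert(g_flag_b == 1); g_flag_b = 0; }
```

If HARDWARE_BARRIER were replaced with _ReadWriteBarrier, the compiler would keep the store before the load, but the CPU could still let the load complete first, so both enter_a and enter_b could return 1 when run on different cores.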