The SSE and SSE2 instructions exist on various Intel and AMD processors. The AVX instructions exist on Intel Sandy Bridge processors and AMD Bulldozer processors.
The _M_IX86_FP and __AVX__ macros indicate which, if any, /arch compiler option was used. For more information, see Predefined Macros.
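As a rough sketch of how these macros might be tested at compile time, the following C fragment maps them to a descriptive label. The helper names arch_level and print_arch_level are hypothetical and used only for illustration. The fallback branch assumes that a compiler other than the x86 Microsoft compiler leaves _M_IX86_FP undefined.

```c
#include <stdio.h>

/* Hypothetical helper: returns a label for the instruction set that the
   compiler was told to target. __AVX__ is checked first because
   _M_IX86_FP is also 2 when /arch:AVX is in effect. */
const char *arch_level(void)
{
#if defined(__AVX__)
    return "AVX or later";
#elif defined(_M_IX86_FP) && _M_IX86_FP == 2
    return "SSE2";
#elif defined(_M_IX86_FP) && _M_IX86_FP == 1
    return "SSE";
#elif defined(_M_IX86_FP) && _M_IX86_FP == 0
    return "IA32 (x87 only)";
#else
    /* Assumption: not an x86 target of the Microsoft compiler. */
    return "unknown";
#endif
}

/* Hypothetical helper: prints the label chosen at compile time. */
void print_arch_level(void)
{
    printf("Compiled for: %s\n", arch_level());
}
```

Calling print_arch_level() from main reports the level that was selected when the file was compiled, not what the processor running the program supports.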
The optimizer chooses when and how to use the SSE and SSE2 instructions when /arch is specified. It uses SSE and SSE2 instructions for some scalar floating-point computations when it determines that it is faster to use the SSE/SSE2 instructions and registers instead of the x87 floating-point register stack. As a result, your code will actually use a mixture of both x87 and SSE/SSE2 for floating-point computations. Also, with /arch:SSE2, SSE2 instructions can be used for some 64-bit integer operations.
In addition to using the SSE and SSE2 instructions, the compiler also uses other instructions that are present on the processor revisions that support SSE and SSE2. An example is the CMOV instruction that first appeared on the Pentium Pro revision of the Intel processors.
Because the x86 compiler generates code that uses SSE2 instructions by default, you must specify /arch:IA32 to disable generation of SSE and SSE2 instructions for x86 processors.
When you use /clr to compile, /arch has no effect on code generation for managed functions. /arch only affects code generation for native functions.
/arch and /QIfist cannot be used on the same compiland. Floating-point results can also differ between x87 code and SSE/SSE2 code. If you do not use _controlfp to modify the FP control word, the run-time startup code sets the precision-control field of the x87 FPU control word to 53 bits. Therefore, every float and double operation in an expression uses a 53-bit significand and a 15-bit exponent. For more information, see _control87, _controlfp, __control87_2. By contrast, every SSE single-precision operation uses a 24-bit significand and an 8-bit exponent, and SSE2 double-precision operations use a 53-bit significand and an 11-bit exponent. These differences can show up within a single expression tree, but not when a user assignment follows each subexpression. Consider the following:
r = f1 * f2 + d;  // Different results are possible on SSE/SSE2.
Compare:
t = f1 * f2;      // Do f1 * f2, round to the type of t.
r = t + d;        // This should produce the same overall result
                  // whether x87 stack is used or SSE/SSE2 is used.
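The two forms can be wrapped in small functions to compare them on a given build. This is a minimal sketch: the function names are hypothetical, f1 and f2 are assumed to be float, and d is assumed to be double. Whether the two results differ depends on the compiler, the /arch setting, and the inputs.

```c
#include <stdio.h>

/* Single expression: the product f1 * f2 may be kept at a wider
   intermediate precision before d is added, depending on whether
   x87 or SSE/SSE2 instructions are generated. */
double single_expression(float f1, float f2, double d)
{
    return f1 * f2 + d;
}

/* Explicit assignment: storing the product in t is meant to round it
   to float before the addition. */
double with_assignment(float f1, float f2, double d)
{
    float t = f1 * f2;
    return t + d;
}

/* Prints both results with enough digits to expose a rounding difference. */
void print_comparison(float f1, float f2, double d)
{
    printf("single expression: %.17g\n", single_expression(f1, f2, d));
    printf("with assignment:   %.17g\n", with_assignment(f1, f2, d));
}
```

To look for a difference, call print_comparison with inputs whose product is not exactly representable as a float, for example 1.0f + FLT_EPSILON for both f1 and f2 (FLT_EPSILON comes from <float.h>) and -1.0 for d. When the product is exactly representable, both forms return the same value.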
To set this compiler option for AVX, IA32, SSE or SSE2 in Visual Studio
1. Open the Property Pages dialog box for the project. For more information, see How to: Open Project Property Pages.
2. Select the C/C++ folder.
3. Select the Code Generation property page.
4. Modify the Enable Enhanced Instruction Set property.