/arch (Minimum CPU Architecture)
The compiler supports generation of code using the Streaming SIMD Extensions (SSE) and Streaming SIMD Extensions 2 (SSE2) instructions. The SSE instructions exist in various Pentium processors as well as in AMD Athlon processors. The SSE2 instructions currently exist only on the Pentium 4 processor.
For example, /arch:SSE allows the compiler to use the SSE instructions, and /arch:SSE2 allows the compiler to use the SSE2 instructions.
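Source code can check at compile time which instruction set level it was built for. The following is a minimal sketch, assuming a compiler that defines the `_M_IX86_FP` predefined macro (later Visual C++ releases document it for x86 targets). The function name `arch_level` is hypothetical, and the fallback value -1 is a convention of this sketch only.

```c
/* Hypothetical helper: reports the enhanced instruction set selected at
   compile time. Returns -1 when the compiler does not define _M_IX86_FP
   (for example, a non-Microsoft compiler or a non-x86 target). */
int arch_level(void)
{
#if defined(_M_IX86_FP)
    return _M_IX86_FP;   /* 0 = x87 only, 1 = SSE, 2 = SSE2 */
#else
    return -1;           /* macro not available: level unknown */
#endif
}
```

A caller could use the result to log the build configuration or to select a code path that was written with a particular /arch setting in mind.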
The optimizer will choose when and how to make use of the SSE and SSE2 instructions when /arch is specified. Currently, SSE and SSE2 instructions will be used for some scalar floating-point computations when it is determined that it is faster to use the SSE/SSE2 instructions and registers rather than the x87 floating-point register stack. As a result, your code will actually use a mixture of both x87 and SSE/SSE2 for floating-point computations. Additionally, with /arch:SSE2, SSE2 instructions may be used for some 64-bit integer operations.
In addition to making use of the SSE and SSE2 instructions, the compiler will also make use of other instructions that are present on the processor revisions that support SSE and SSE2. An example of this is the CMOV instruction that first appeared in the PentiumPro revision of the Intel processors.
Specifying /arch with one of the /G options that specifies an older processor will be accepted without warning, but the /G option will be silently ignored in favor of optimizing for the chip revision that corresponds to /arch. So, if /arch:SSE2 is specified with /G6, the compiler will optimize as if /G7 was specified. Similarly, if /arch:SSE is specified with /G5, the compiler will optimize as if /G6 was specified.
When compiling with /clr, /arch will have no effect on code generation for managed functions; /arch only affects code generation for native functions.
/arch and /QIfist cannot be used on the same compiland.
/Op in combination with /arch may in some cases provide different results than /Op without /arch. This is because, with /Op alone, individual expressions are evaluated on the x87 stack, which can mean that a larger significand and exponent are used than what is available in the SSE/SSE2 registers.
In particular, if the user doesn't use _controlfp to modify the FP control word, the runtime startup code will set the x87 FPU control word precision-control field to 53 bits, so all float and double operations within an expression will occur with a 53-bit significand and a 15-bit exponent. All SSE single-precision operations will, however, use a 24-bit significand and an 8-bit exponent, and SSE2 double-precision operations will use a 53-bit significand and an 11-bit exponent.
To illustrate, these differences are possible only within a single expression tree, not in cases where the result of each sub-expression is assigned to a user variable:
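The significand widths quoted above can be observed from C. The following is a minimal sketch, assuming IEEE 754 float and double types. All function names are hypothetical. The volatile stores force each result to be rounded to its declared type, so the outcome does not depend on whether the x87 stack or SSE/SSE2 registers hold the intermediate value.

```c
#include <float.h>

/* Significand widths, in bits, of the two types discussed above. */
int float_significand_bits(void)  { return FLT_MANT_DIG; }
int double_significand_bits(void) { return DBL_MANT_DIG; }

/* Returns 1 if adding 2^-24 to 1.0 is lost after rounding to float.
   A 24-bit significand cannot represent 1 + 2^-24. */
int lost_in_float(void)
{
    volatile float one  = 1.0f;
    volatile float tiny = 1.0f / 16777216.0f;   /* 2^-24 */
    volatile float sum  = one + tiny;           /* rounded to float here */
    return sum == 1.0f;
}

/* Returns 1 if the same term survives in a double, whose 53-bit
   significand represents 1 + 2^-24 exactly. */
int kept_in_double(void)
{
    volatile double one  = 1.0;
    volatile double tiny = 1.0 / 16777216.0;    /* 2^-24 */
    volatile double sum  = one + tiny;          /* rounded to double here */
    return sum != 1.0;
}
```

Under the stated assumption, both `lost_in_float()` and `kept_in_double()` return 1, which is the gap between a 24-bit and a 53-bit significand.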
r = f1 * f2 + d; // different results possible on SSE/SSE2 even with /Op
t = f1 * f2;   // do f1 * f2, round to the type of t
r = t + d;     // this should produce the same overall result
               // regardless of /Op and whether x87 stack or SSE/SSE2
               // is used
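The two forms can be compared with concrete values. The following sketch does not depend on compiler switches: it models the wider intermediate explicitly by multiplying in double, and models the narrower one by storing the product to a float. The function names `wide_result` and `narrow_result` are hypothetical, and IEEE 754 float and double types are assumed.

```c
/* Models evaluation with a wide intermediate: the product of two floats
   is formed with a 53-bit significand, so it is exact before d is added. */
double wide_result(float f1, float f2, double d)
{
    return (double)f1 * (double)f2 + d;
}

/* Models evaluation with an explicit assignment: the product is rounded
   to a 24-bit significand when stored in t, and only then is d added. */
double narrow_result(float f1, float f2, double d)
{
    volatile float t = f1 * f2;   /* round to the type of t */
    return t + d;
}
```

With f1 = f2 = 1 + 2^-12 and d = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. `wide_result` keeps the last term and returns 2^-24, whereas `narrow_result` rounds it away when storing to t and returns 0.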
_controlfp does not change the MXCSR control bits, so with /arch:SSE2 any functionality that depends on using _controlfp will be broken.
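Because the x87 control word and MXCSR are separate registers, code that needs a non-default rounding mode for SSE/SSE2 computations may have to set MXCSR itself. The following is a hedged sketch, not a recommended replacement for existing control-word code. It assumes an x86 target whose compiler provides the `<xmmintrin.h>` header with the `_MM_GET_ROUNDING_MODE` and `_MM_SET_ROUNDING_MODE` macros. The `HAVE_MXCSR` macro and the function name are hypothetical.

```c
/* Assumption: <xmmintrin.h> is usable when the compiler targets SSE
   (GCC/Clang define __SSE__) or is a Microsoft x86/x64 compiler. */
#if defined(__SSE__) || defined(_M_IX86) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0
#endif

/* Switches the MXCSR rounding mode to round-toward-zero, reads it back,
   then restores the saved mode. Returns 1 if both the change and the
   restore were observed, 0 if not, and -1 if MXCSR is not available. */
int mxcsr_rounding_roundtrip(void)
{
#if HAVE_MXCSR
    unsigned int saved = _MM_GET_ROUNDING_MODE();
    int changed;

    _MM_SET_ROUNDING_MODE(_MM_ROUND_TOWARD_ZERO);
    changed = (_MM_GET_ROUNDING_MODE() == _MM_ROUND_TOWARD_ZERO);

    _MM_SET_ROUNDING_MODE(saved);   /* always restore the caller's mode */
    return changed && (_MM_GET_ROUNDING_MODE() == saved);
#else
    return -1;
#endif
}
```

Saving and restoring the previous mode keeps the change local, so surrounding code that expects the default rounding behavior is not affected.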
To set this compiler option in the Visual Studio development environment
- Open the project's Property Pages dialog box. For details, see Setting Visual C++ Project Properties.
- Click the C/C++ folder.
- Click the Code Generation property page.
- Modify the Enable Enhanced Instruction Set property.
To set this compiler option programmatically