This documentation is archived and is not being maintained.

3DNow! Intrinsics

Microsoft Specific

This topic describes the 3DNow! intrinsics. Each intrinsic requires the header file mm3dnow.h.

Additional information about 3DNow! intrinsics can be found at http://www.amd.com/devconn/3dsdk/msvc.html.

The following table lists the 3DNow! intrinsics alphabetically.

Intrinsic Use
_m_femms Clears the architectural state when switching between MMX and floating-point instructions.
_m_from_float Returns a 64-bit MMX value where the lower half is set to the floating-point, single-precision value from the source operand and the upper half is zero. There is no error return.
_m_pavgusb Calculates the rounded averages of eight unsigned 8-bit integer values.
_m_pf2id Converts packed floating-point, single-precision values to packed 32-bit integer values.
_m_pf2iw Converts packed floating-point, single-precision values to packed 16-bit signed integer values using truncation.
_m_pfacc Performs packed floating-point, single-precision accumulation.
_m_pfadd Performs packed floating-point, single-precision addition.
_m_pfcmpeq Compares packed floating-point, single-precision values for equality and sets each corresponding element of the return value to all ones or all zeros, depending on the result of the comparison.
_m_pfcmpge Tests whether each packed floating-point, single-precision value in the first operand is greater than or equal to the corresponding value in the second operand, and sets each corresponding element of the return value to all ones or all zeros, depending on the result of the comparison.
_m_pfcmpgt Tests whether each packed floating-point, single-precision value in the first operand is greater than the corresponding value in the second operand, and sets each corresponding element of the return value to all ones or all zeros, depending on the result of the comparison.
_m_pfmax Returns the larger of the two packed floating-point, single-precision values.
_m_pfmin Returns the smaller of the two packed floating-point, single-precision values.
_m_pfmul Performs packed floating-point, single-precision multiplication.
_m_pfnacc Performs packed floating-point, single-precision negative accumulation.
_m_pfpnacc Performs packed floating-point, single-precision positive-negative accumulation.
_m_pfrcp Performs scalar floating-point, low-precision reciprocal approximation.
_m_pfrcpit1 Performs the first intermediate step in the Newton-Raphson iteration to refine the reciprocal approximation produced by the _m_pfrcp intrinsic function.
_m_pfrcpit2 Performs the second and final step in the Newton-Raphson iteration to refine the reciprocal or reciprocal square root approximation produced by the _m_pfrcp or _m_pfrsqrt intrinsic functions, respectively.
_m_pfrsqit1 Performs the first intermediate step in the Newton-Raphson iteration to refine the reciprocal square root approximation produced by the _m_pfrsqrt intrinsic function.
_m_pfrsqrt Performs scalar floating-point, low-precision reciprocal square root approximation.
_m_pfsub Performs packed floating-point, single-precision subtraction.
_m_pfsubr Performs packed floating-point, single-precision reverse subtraction.
_m_pi2fd Converts packed 32-bit integer values to packed floating-point, single-precision values.
_m_pi2fw Converts packed 16-bit signed integer values to packed floating-point, single-precision values.
_m_pmulhrw Multiplies four signed 16-bit integer values in the source operand by four signed 16-bit integer values in the destination operand.
_m_prefetch Loads a 32-byte cache line into L1 data cache and sets the cache line state to exclusive.
_m_prefetchw Loads a 32-byte cache line into L1 data cache and sets the cache line state to modified.
_m_pswapd Swaps upper and lower halves of the source operand.
_m_to_float Returns the floating-point, single-precision value from the lower half of the 64-bit MMX value in the source operand. There is no error return.

The compiler ensures that an implicit FEMMS is issued before any attempt to use the result of an _m_to_float( ) operation.

void _m_femms( void );

Clears the architectural state when switching between MMX and floating-point instructions. The contents of the MMX registers and the floating-point stack are undefined after the function executes.

__m64 _m_from_float(float f);

Returns a 64-bit MMX value where the lower half is set to the floating-point, single-precision value from the source operand and the upper half is zero. There is no error return.

__m64 _m_pavgusb( __m64 m1, __m64 m2 );

Calculates the rounded averages of eight unsigned 8-bit integer values. The _m_pavgusb function takes two 64-bit MMX values and returns a 64-bit MMX value. There is no error return.
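The rounding behavior can be illustrated with a portable scalar model. The sketch below is not the intrinsic itself: model_pavgusb is a hypothetical helper that needs neither mm3dnow.h nor 3DNow! hardware, and it assumes that "rounded" means adding 1 before halving, so that a fractional half rounds up.

```c
#include <stdint.h>

/* Hypothetical scalar model of a rounded unsigned byte average.
   Assumption: rounding is done by adding 1 before the shift. */
void model_pavgusb(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    for (int i = 0; i < 8; ++i) {
        /* Widen first so that 255 + 255 + 1 cannot overflow a byte. */
        unsigned sum = (unsigned)a[i] + (unsigned)b[i] + 1u;
        out[i] = (uint8_t)(sum >> 1);
    }
}
```

Under this model, averaging 7 and 8 gives 8 rather than 7, which is the difference between a rounded and a truncated average.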

__m64 _m_pf2id( __m64 m );

Converts packed floating-point, single-precision values to packed 32-bit integer values. The _m_pf2id function returns a 64-bit MMX value. There is no error return.

__m64 _m_pf2iw( __m64 m );

Converts packed floating-point, single-precision values to packed 16-bit signed integer values using truncation. The _m_pf2iw function returns a 64-bit MMX value. There is no error return.
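As a rough illustration of the two float-to-integer conversions, the following scalar sketch models one element at a time. The helper names are hypothetical. It assumes truncation toward zero for both conversions, saturation of out-of-range inputs to the limits of the target integer type, and a 16-bit result that is sign-extended into a 32-bit element; none of these details is stated in this topic, and the treatment of NaN is purely a placeholder.

```c
#include <stdint.h>

/* Hypothetical model of one element of a float -> int32 conversion.
   Assumptions: truncation toward zero, saturation at the int32 limits. */
int32_t model_f2id(float x)
{
    if (x != x) return 0;                       /* NaN: placeholder only */
    if (x >= 2147483648.0f) return INT32_MAX;   /* clamp large values */
    if (x <= -2147483648.0f) return INT32_MIN;  /* clamp small values */
    return (int32_t)x;                          /* C casts truncate toward zero */
}

/* Hypothetical model of one element of a float -> int16 conversion.
   Assumption: the 16-bit result is returned sign-extended to 32 bits. */
int32_t model_f2iw(float x)
{
    if (x != x) return 0;                       /* NaN: placeholder only */
    if (x >= 32767.0f) return 32767;
    if (x <= -32768.0f) return -32768;
    return (int32_t)x;
}
```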

__m64 _m_pfacc( __m64 m1, __m64 m2 );

Performs packed floating-point, single-precision accumulation. The _m_pfacc function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfadd( __m64 m1, __m64 m2 );

Performs packed floating-point, single-precision addition. The _m_pfadd function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfcmpeq( __m64 m1, __m64 m2 );

Compares packed floating-point, single-precision values for equality and sets each corresponding element of the return value to all ones or all zeros, depending on the result of the comparison. The _m_pfcmpeq function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfcmpge( __m64 m1, __m64 m2 );

Tests whether each packed floating-point, single-precision value in the first operand is greater than or equal to the corresponding value in the second operand, and sets each corresponding element of the return value to all ones or all zeros, depending on the result of the comparison. The _m_pfcmpge function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfcmpgt( __m64 m1, __m64 m2 );

Tests whether each packed floating-point, single-precision value in the first operand is greater than the corresponding value in the second operand, and sets each corresponding element of the return value to all ones or all zeros, depending on the result of the comparison. The _m_pfcmpgt function returns a 64-bit MMX value. There is no error return.
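Comparison results of this kind are typically used as bit masks rather than as Boolean values. The sketch below models a single 32-bit element in portable C. The helper names are hypothetical, and the code assumes that a true comparison yields 0xFFFFFFFF and a false one yields 0.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical per-element models: all ones when true, all zeros when false. */
uint32_t model_cmp_eq(float a, float b) { return (a == b) ? 0xFFFFFFFFu : 0u; }
uint32_t model_cmp_ge(float a, float b) { return (a >= b) ? 0xFFFFFFFFu : 0u; }
uint32_t model_cmp_gt(float a, float b) { return (a >  b) ? 0xFFFFFFFFu : 0u; }

/* Branch-free select: returns a where the mask is ones, b where it is zeros. */
float model_select(uint32_t mask, float a, float b)
{
    uint32_t ua, ub, ur;
    float r;
    memcpy(&ua, &a, sizeof ua);        /* reinterpret the float bits safely */
    memcpy(&ub, &b, sizeof ub);
    ur = (ua & mask) | (ub & ~mask);   /* keep a's bits or b's bits */
    memcpy(&r, &ur, sizeof r);
    return r;
}
```

For example, model_select(model_cmp_ge(x, y), x, y) picks the larger of x and y without a branch, which is the usual reason a mask is preferred over a flag.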

__m64 _m_pfmax( __m64 m1, __m64 m2 );

Returns the larger of the two packed floating-point, single-precision values. The _m_pfmax function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfmin( __m64 m1, __m64 m2 );

Returns the smaller of the two packed floating-point, single-precision values. The _m_pfmin function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfmul( __m64 m1, __m64 m2 );

Performs packed floating-point, single-precision multiplication. The _m_pfmul function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfnacc( __m64 m1, __m64 m2 );

Performs packed floating-point, single-precision negative accumulation. The _m_pfnacc function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfpnacc( __m64 m1, __m64 m2 );

Performs packed floating-point, single-precision positive-negative accumulation. The _m_pfpnacc function returns a 64-bit MMX value. There is no error return.
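The three accumulation variants differ only in how the two halves of each operand are combined. The sketch below is a scalar model, not the intrinsics themselves. pair2f is a hypothetical stand-in for an __m64 holding two floats, and the lane assignments (low result from the first operand, high result from the second) are an assumption based on the behavior of the underlying PFACC, PFNACC, and PFPNACC instructions.

```c
/* Hypothetical stand-in for a 64-bit MMX value that holds two floats. */
typedef struct { float lo, hi; } pair2f;

/* Accumulate: each result lane is the sum of one operand's two halves. */
pair2f model_pfacc(pair2f m1, pair2f m2)
{
    pair2f r = { m1.lo + m1.hi, m2.lo + m2.hi };
    return r;
}

/* Negative accumulate: each result lane is low half minus high half. */
pair2f model_pfnacc(pair2f m1, pair2f m2)
{
    pair2f r = { m1.lo - m1.hi, m2.lo - m2.hi };
    return r;
}

/* Positive-negative accumulate: difference in the low lane, sum in the high lane. */
pair2f model_pfpnacc(pair2f m1, pair2f m2)
{
    pair2f r = { m1.lo - m1.hi, m2.lo + m2.hi };
    return r;
}
```

With m1 = {1, 2} and m2 = {10, 20}, the model gives {3, 30} for accumulation, {-1, -10} for negative accumulation, and {-1, 30} for positive-negative accumulation.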

__m64 _m_pfrcp( __m64 m );

Performs scalar floating-point, low-precision reciprocal approximation. The 14-bit accurate result is duplicated in both high and low halves of the return value. The _m_pfrcp function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfrcpit1( __m64 m1, __m64 m2 );

Performs the first intermediate step in the Newton-Raphson iteration to refine the reciprocal approximation produced by the _m_pfrcp intrinsic function. The _m_pfrcpit1 function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfrcpit2( __m64 m1, __m64 m2 );

Performs the second and final step in the Newton-Raphson iteration to refine the reciprocal or reciprocal square root approximation produced by the _m_pfrcp or _m_pfrsqrt intrinsic functions, respectively. The _m_pfrcpit2 function returns a 64-bit MMX value. There is no error return.
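The arithmetic behind the reciprocal refinement can be sketched with the textbook Newton-Raphson update for 1/b, which is x1 = x0 * (2 - b * x0). The function below is a hypothetical scalar illustration of that formula only. The hardware splits the work across two iteration steps, and this sketch makes no claim to reproduce their intermediate values or final bits.

```c
/* One textbook Newton-Raphson step toward 1/b, starting from estimate x0.
   Hypothetical helper: illustrates the math, not the intrinsic sequence. */
float nr_recip_step(float b, float x0)
{
    /* The error roughly squares with each step, so a coarse estimate
       becomes a much finer one after a single update. */
    return x0 * (2.0f - b * x0);
}
```

Starting from the rough estimate 0.33 for 1/3, one step yields a value within 1e-4 of the true reciprocal, which is why a low-precision estimate followed by a single refinement can be cheaper than a full division.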

__m64 _m_pfrsqit1( __m64 m1, __m64 m2 );

Performs the first intermediate step in the Newton-Raphson iteration to refine the reciprocal square root approximation produced by the _m_pfrsqrt intrinsic function. The _m_pfrsqit1 function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfrsqrt( __m64 m );

Performs scalar floating-point, low-precision reciprocal square root approximation. The 15-bit accurate result is duplicated in both high and low halves of the return value. The _m_pfrsqrt function returns a 64-bit MMX value. There is no error return.
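A low-precision reciprocal square root estimate is normally sharpened with a Newton-Raphson update of the form x1 = 0.5 * x0 * (3 - b * x0 * x0). The hypothetical helper below shows that formula in plain C as a sketch of the underlying math. It is not a bit-exact model of the refinement intrinsics.

```c
/* One textbook Newton-Raphson step toward 1/sqrt(b), starting from x0.
   Hypothetical helper: illustrates the math, not the intrinsic sequence. */
float nr_rsqrt_step(float b, float x0)
{
    return 0.5f * x0 * (3.0f - b * x0 * x0);
}
```

For b = 4 and a starting estimate of 0.49, one step moves the estimate to within 1e-3 of the exact value 0.5.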

__m64 _m_pfsub( __m64 m1, __m64 m2 );

Performs packed floating-point, single-precision subtraction. The _m_pfsub function returns a 64-bit MMX value. There is no error return.

__m64 _m_pfsubr( __m64 m1, __m64 m2 );

Performs packed floating-point, single-precision reverse subtraction. The _m_pfsubr function returns a 64-bit MMX value. There is no error return.
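The difference between ordinary and reverse subtraction is only the operand order. The scalar sketch below uses a hypothetical pair2f type in place of __m64. It assumes that the intrinsics follow the underlying instructions, so that subtraction computes m1 - m2 and reverse subtraction computes m2 - m1. Verify this order against the AMD documentation before relying on it.

```c
/* Hypothetical stand-in for a 64-bit MMX value that holds two floats. */
typedef struct { float lo, hi; } pair2f;

/* Assumed order for ordinary subtraction: first operand minus second. */
pair2f model_pfsub(pair2f m1, pair2f m2)
{
    pair2f r = { m1.lo - m2.lo, m1.hi - m2.hi };
    return r;
}

/* Assumed order for reverse subtraction: second operand minus first. */
pair2f model_pfsubr(pair2f m1, pair2f m2)
{
    pair2f r = { m2.lo - m1.lo, m2.hi - m1.hi };
    return r;
}
```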

__m64 _m_pi2fd( __m64 m );

Converts packed 32-bit integer values to packed floating-point, single-precision values. The _m_pi2fd function returns a 64-bit MMX value. There is no error return.

__m64 _m_pi2fw( __m64 m );

Converts packed 16-bit signed integer values to packed floating-point, single-precision values. The _m_pi2fw function returns a 64-bit MMX value. There is no error return.

__m64 _m_pmulhrw( __m64 m1, __m64 m2 );

Multiplies four signed 16-bit integer values in the source operand by four signed 16-bit integer values in the destination operand. Each high-order 16-bit result is rounded. The _m_pmulhrw function returns a 64-bit MMX value. There is no error return.
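The rounding of the high-order result can be modeled for a single 16-bit element as follows. model_pmulhrw_lane is a hypothetical helper. It assumes that rounding is done by adding 0x8000 to the 32-bit product before the upper 16 bits are taken, and that right-shifting a negative value is an arithmetic shift, which holds on mainstream compilers but is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical model of one 16-bit lane of a rounded high-order multiply.
   Assumptions: rounding adds 0x8000; >> on negative values is arithmetic. */
int16_t model_pmulhrw_lane(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;   /* full 32-bit product */
    return (int16_t)((prod + 0x8000) >> 16);  /* round, then keep the high half */
}
```

For example, 3 * 16384 = 49152, whose unrounded high half is 0. With rounding, the model returns 1, because 49152 is more than half of 65536.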

void _m_prefetch( void* p );

Loads a 32-byte cache line into L1 data cache and sets the cache line state to exclusive. If the line is already in the cache or if a memory fault is detected, then the intrinsic function has no effect. The variable p specifies the address of the cache line to be loaded.

void _m_prefetchw( void* p );

Loads a 32-byte cache line into L1 data cache and sets the cache line state to modified. If the line is already in the cache or if a memory fault is detected, then the intrinsic function has no effect. The variable p specifies the address of the cache line to be loaded.

__m64 _m_pswapd( __m64 m );

Swaps upper and lower halves of the source operand and returns a 64-bit MMX value. There is no error return.
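As a minimal illustration, the swap can be modeled on a plain 64-bit integer. model_pswapd is a hypothetical helper that treats the operand as two 32-bit halves. It is a portable sketch, not the intrinsic.

```c
#include <stdint.h>

/* Hypothetical model: exchange the upper and lower 32-bit halves. */
uint64_t model_pswapd(uint64_t m)
{
    return (m << 32) | (m >> 32);
}
```

Applying the swap twice returns the original value.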

float _m_to_float( __m64 m );

Returns the floating-point, single-precision value from the lower half of the 64-bit MMX value in the source operand. There is no error return.

The compiler ensures that an implicit FEMMS is issued before any attempt to use the result of an _m_to_float( ) operation.
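The bit-level relationship between a float and the 64-bit value described for _m_from_float and _m_to_float can be sketched in portable C. The helpers below are hypothetical models that use a uint64_t in place of __m64 and assume IEEE 754 single-precision floats. They do not involve MMX state, so the FEMMS consideration above does not apply to them.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: float bits in the lower half, zero in the upper half. */
uint64_t model_from_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the raw float bits */
    return (uint64_t)bits;            /* zero-extends into the upper half */
}

/* Hypothetical model: reinterpret the lower half as a float, ignore the rest. */
float model_to_float(uint64_t m)
{
    uint32_t bits = (uint32_t)(m & 0xFFFFFFFFu);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```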

END Microsoft Specific

See Also

AMD 3DNow! Technology Overview and Intrinsics
