_mm_dp_pd
Microsoft Specific
Emits the Streaming SIMD Extensions 4.1 (SSE4.1) instruction dppd, which computes the dot product of double-precision floating-point values.
__m128d _mm_dp_pd( __m128d a, __m128d b, const int mask );
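mask is an immediate value, so it must be a compile-time constant. Bits 4 and 5 of mask select which pairs of source elements are multiplied: bit 4 controls the product of the low elements and bit 5 controls the product of the high elements. Bits 0 and 1 select which elements of the result receive the dot product: bit 0 controls the low element and bit 1 controls the high element. If a mask bit is 0, the corresponding product or result element is +0.0.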
r0, a0, and b0 are the lowest 64 bits of return value r and parameters a and b, respectively. r1, a1, and b1 are the highest 64 bits of return value r and parameters a and b, respectively.
maski is bit i of parameter mask, where bit 0 is the least significant bit.
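The mask rules above can be modeled in plain scalar C. The following is a minimal sketch for illustration only, not the compiler's implementation; the type `pair_f64` and the function `dp_pd_reference` are hypothetical names that do not appear in any header.

```c
/* Hypothetical stand-in for the two 64-bit lanes of an __m128d value. */
typedef struct { double lo; double hi; } pair_f64;

/* Scalar model of the dppd mask rules (illustrative, not the real intrinsic). */
static pair_f64 dp_pd_reference(pair_f64 a, pair_f64 b, int mask)
{
    /* Bits 4 and 5 choose which products take part in the sum. */
    double tmp0 = (mask & 0x10) ? a.lo * b.lo : 0.0;
    double tmp1 = (mask & 0x20) ? a.hi * b.hi : 0.0;
    double sum  = tmp0 + tmp1;

    /* Bits 0 and 1 choose which result elements receive the sum. */
    pair_f64 r;
    r.lo = (mask & 0x01) ? sum : 0.0;
    r.hi = (mask & 0x02) ? sum : 0.0;
    return r;
}
```

Under this model, a mask of 0x31 multiplies both pairs and writes the sum to the low element only, while 0x33 writes the same sum to both elements.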
Before you use this intrinsic, verify that the target processor supports the dppd instruction.
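One way to perform such a check at run time is to query CPUID leaf 1, where bit 19 of ECX reports SSE4.1 support. The following is a minimal sketch for x86 and x64 targets only; the helper name `has_sse41` is hypothetical, and the code assumes either the MSVC `__cpuid` intrinsic from `<intrin.h>` or the GCC/Clang `__get_cpuid` helper from `<cpuid.h>`.

```c
#ifdef _MSC_VER
#include <intrin.h>   /* __cpuid (MSVC) */
#else
#include <cpuid.h>    /* __get_cpuid (GCC and Clang, assumed here) */
#endif

/* Hypothetical helper: returns 1 if the processor reports SSE4.1, else 0. */
static int has_sse41(void)
{
#ifdef _MSC_VER
    int regs[4];                  /* EAX, EBX, ECX, EDX */
    __cpuid(regs, 1);
    return (regs[2] >> 19) & 1;   /* ECX bit 19 = SSE4.1 */
#else
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                 /* leaf 1 not available */
    return (int)((ecx >> 19) & 1u);
#endif
}
```

A caller could test `has_sse41()` once at startup and fall back to a scalar dot product when it returns 0.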
#include <stdio.h>
#include <smmintrin.h>

int main ()
{
    __m128d a, b;
    // Bits 4-5 set: multiply both pairs. Bit 0 set: write the sum to the low element only.
    const int mask = 0x31;

    a.m128d_f64[0] = 1.5;
    a.m128d_f64[1] = 10.25;
    b.m128d_f64[0] = -1.5;
    b.m128d_f64[1] = 3.125;

    // Low element: (1.5 * -1.5) + (10.25 * 3.125) = 29.78125; high element: +0.0
    __m128d res = _mm_dp_pd(a, b, mask);

    // The values are doubles, so %f is the matching format specifier.
    printf_s("Original a: %f\t%f\nOriginal b: %f\t%f\n",
                a.m128d_f64[0], a.m128d_f64[1], b.m128d_f64[0], b.m128d_f64[1]);
    printf_s("Result res: %f\t%f\n",
                res.m128d_f64[0], res.m128d_f64[1]);

    return 0;
}