

SetProcessAffinityMask Function

Sets a processor affinity mask for the threads of the specified process.

Syntax

C++
BOOL WINAPI SetProcessAffinityMask(
  __in  HANDLE hProcess,
  __in  DWORD_PTR dwProcessAffinityMask
);

Parameters

hProcess [in]

A handle to the process whose affinity mask is to be set. This handle must have the PROCESS_SET_INFORMATION access right. For more information, see Process Security and Access Rights.

dwProcessAffinityMask [in]

The affinity mask for the threads of the process.

On a system with more than 64 processors, the affinity mask must specify processors in a single processor group.

Return Value

If the function succeeds, the return value is nonzero.

If the function fails, the return value is zero. To get extended error information, call GetLastError.

If the process affinity mask requests a processor that is not configured in the system, the last error code is ERROR_INVALID_PARAMETER.

On a system with more than 64 processors, if the calling process contains threads in more than one processor group, the last error code is ERROR_INVALID_PARAMETER.

Remarks

A process affinity mask is a bit vector in which each bit represents a logical processor on which the threads of the process are allowed to run. The value of the process affinity mask must be a subset of the system affinity mask values obtained by the GetProcessAffinityMask function. A process is only allowed to run on the processors configured into a system. Therefore, the process affinity mask cannot specify a 1 bit for a processor when the system affinity mask specifies a 0 bit for that processor.
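The subset rule described above can be checked before the call is made. The following is a minimal sketch rather than part of the reference itself: the helper names IsSubsetOfSystemMask and RestrictCurrentProcess are hypothetical, treating an empty mask as invalid is an assumption of the sketch, and the Windows-specific part is compiled only when _WIN32 is defined.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: true when every bit set in `requested` is also set in
// `systemMask`. An empty request is rejected (assumption of this sketch).
inline bool IsSubsetOfSystemMask(std::uint64_t requested, std::uint64_t systemMask)
{
    return requested != 0 && (requested & ~systemMask) == 0;
}

#ifdef _WIN32
// Hypothetical helper: restricts the current process to `requested`.
// Returns ERROR_SUCCESS on success, otherwise a Win32 error code.
inline DWORD RestrictCurrentProcess(DWORD_PTR requested)
{
    DWORD_PTR processMask = 0, systemMask = 0;
    HANDLE self = GetCurrentProcess(); // pseudo-handle, no CloseHandle needed

    if (!GetProcessAffinityMask(self, &processMask, &systemMask))
        return GetLastError();
    if (!IsSubsetOfSystemMask(requested, systemMask))
        return ERROR_INVALID_PARAMETER; // reject locally before calling the API
    if (!SetProcessAffinityMask(self, requested))
        return GetLastError();
    return ERROR_SUCCESS;
}
#endif
```

For example, with a system affinity mask of 0xF, a request of 0x3 passes the check, while a request of 0x10 is rejected because it names a processor that is not configured.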

Process affinity is inherited by any child process or newly instantiated local process.

Do not call SetProcessAffinityMask in a DLL that may be called by processes other than your own.

On a system with more than 64 processors, the SetProcessAffinityMask function can be used to set the process affinity mask only for processes with threads in a single processor group. Use the SetThreadAffinityMask function to set the affinity mask for individual threads in multiple groups. This effectively changes the group assignment of the process.

Requirements

Minimum supported client

Windows 2000 Professional

Minimum supported server

Windows 2000 Server

Header

Winbase.h (include Windows.h)

Library

Kernel32.lib

DLL

Kernel32.dll

See Also

CreateProcess
GetProcessAffinityMask
Multiple Processors
Process and Thread Functions
Processes
Processor Groups
SetThreadAffinityMask

 

 


Build date: 2/4/2010



Community Content

jkriegshauser
How to set affinity only to your real processors when Hyper-Thread is on

Note from jkriegshauser: This code sample does not work as expected on multi-core processors (e.g. Core2 Duo/Quad). The reason is that newer multi-core processors define the HTT flag as "hardware multithreading", and the logical processor count includes the cores even though Core2 Duo processors do not have HTT. See the following Intel publications:

Intel Processor Identification and the CPUID Instruction: http://www.intel.com/assets/pdf/appnote/241618.pdf

Intel 64 and IA-32 Architectures Software Developer's Manual Vol 3A (section 7.10.2): http://www.intel.com/design/processor/manuals/253668.pdf


Also, the original MSDN article mentioned in this comment can be found here: http://msdn.microsoft.com/en-us/magazine/cc300701.aspx


The following code sample comes from "Juice Up Your App with the Power of Hyper-Threading", an MSDN article:

public void SetProcessAffinityToPhysicalCPUForHyperthreadOnly(int processid)
{
    int res;
    int hProcess;
    int ProcAffinityMask = 0, SysAffinityMask = 0;
    hProcess = OpenProcess(PROCESS_ALL_ACCESS, 0, processid);
    res = GetProcessAffinityMask(
        hProcess, ref ProcAffinityMask, ref SysAffinityMask);
    if (SysAffinityMask == 3) // 1 proc, 2 logical CPUs
        res = SetProcessAffinityMask(hProcess, 1);
    else if (SysAffinityMask == 15) // dual proc, 4 virtual CPUs
        res = SetProcessAffinityMask(hProcess, 3);
    res = CloseHandle(hProcess);
}

From the sample above, we see that the affinity mask is such that all physical processors come first in the mask, then the logical (Hyper-threaded) processors.

If your process is a heavy user of floating point instructions, setting the affinity mask to (number of processors/2) - 1 will make sure your threads will give preference for the physical processors which have FPU.

For a sample on how to detect if Hyper-Thread is on in C/C++, you could use this:

__inline BOOL hyperThreadingOn()
{
    DWORD rEbx, rEdx;
    __asm {
        push eax            // save registers used
        push ebx
        push ecx
        push edx
        xor eax,eax         // cpuid(1)
        add al, 0x01
        _emit 0x0F
        _emit 0xA2
        mov rEdx, edx       // Features Flags, bit 28 indicates if HTT (Hyper-Thread Technology) is
                            // available, but not if it is on; if on, Count of logical processors > 1.
        mov rEbx, ebx       // Bits 23-16: Count of logical processors.
                            // Valid only if Hyper-Threading Technology flag is set.
        pop edx             // restore registers used
        pop ecx
        pop ebx
        pop eax
    }
    return (rEdx & (1<<28)) && (((rEbx & 0x00FF0000) >> 16) > 1);
}


GreenCat
The above information should not be used under any circumstances.

The above information should not be used under any circumstances. There are 5 problems with the implementation itself, as well as a few problems with the concept as a whole.

Implementation problems:

1. With Hyper-Threading, each physical processor contains two logical processors. There is no distinction between a "physical" and "logical" processor. There are no special bits that correspond to a "physical" processor. There are only logical processors. It is not possible to "prefer a physical processor over a logical one" because that simply makes no sense. Each logical processor on the same physical chip shares the same FPU and provides access to the same resources. Even the title makes no sense: every logical processor is part of a real processor, so you can't "prefer your real processors" because they are all your real processors. The example above fundamentally does not make sense. It will gain you nothing more than limiting a process to run on two completely arbitrary logical CPUs (which may or may not even be on the same physical CPU). It just doesn't work. Don't use it. This is the biggest problem.

2. Which CPUs correspond to which bits in the affinity mask is not guaranteed by the Win32 API at all, and should not be relied on (e.g. it is never correct to assume that a certain processor type comes first in the bit mask).

3. The example logic fails on machines without 2 or 4 logical CPUs (e.g., an 8-core Xeon server).

4. The Hyper-Threading detection "algorithm" is not "C or C++". It's MSVC-specific inline assembler. Another compiler, for example GCC, would not recognize it as-is. And the _emit directives are highly eyebrow-raising.

5. All multi-core CPUs will set bit 28 of the feature information bits.
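Points 4 and 5 can be illustrated without inline assembler. The sketch below is an illustration only, not a Hyper-Threading detector: it reads CPUID leaf 1 through the compiler intrinsics (__cpuid from intrin.h on MSVC, __get_cpuid from cpuid.h on GCC/Clang) and decodes the same two fields as the sample above. The names HttFlagSet, LogicalCountField, ReadCpuidLeaf1 and the macro SKETCH_HAVE_CPUID are hypothetical. As noted in this thread, a set flag and a count greater than 1 are also reported by multi-core processors without Hyper-Threading, so neither value proves that Hyper-Threading is enabled.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define SKETCH_HAVE_CPUID 1
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#endif

// CPUID leaf 1, EDX bit 28 (the "HTT" feature flag).
inline bool HttFlagSet(std::uint32_t edx)
{
    return ((edx >> 28) & 1u) != 0;
}

// CPUID leaf 1, EBX bits 23-16 (logical processor count field).
// Meaningful only when the HTT flag is set.
inline unsigned LogicalCountField(std::uint32_t ebx)
{
    return (ebx >> 16) & 0xFFu;
}

#ifdef SKETCH_HAVE_CPUID
// Reads EBX and EDX of CPUID leaf 1. Returns false if the leaf is unavailable.
inline bool ReadCpuidLeaf1(std::uint32_t& ebx, std::uint32_t& edx)
{
#if defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};   // EAX, EBX, ECX, EDX
    __cpuid(regs, 1);
    ebx = static_cast<std::uint32_t>(regs[1]);
    edx = static_cast<std::uint32_t>(regs[3]);
    return true;
#else
    unsigned int a = 0, b = 0, c = 0, d = 0;
    if (!__get_cpuid(1, &a, &b, &c, &d))
        return false;
    ebx = b;
    edx = d;
    return true;
#endif
}
#endif
```

Keeping the decoding in plain functions means the bit arithmetic can be checked on any platform, while the register read stays behind the compiler and architecture guards.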

Conceptual problems:

1. Unless you have complete control over the system, over the other software on the system, and over how the user is running your particular piece of software, you are far more likely to hurt performance by setting the affinity mask yourself. The above poster is incorrect in stating that a thread "prefers" a certain logical processor. In fact, this function forces the threads to run on those processors, no matter what. While the threads will be able to run on any of the CPUs in the process affinity mask, this is still inviting problems. For example, if another piece of software has also unwisely set its affinity mask to the same logical CPU(s) as your software does, you're out of luck: you've bypassed the kernel's ability to make wise judgments about which CPU to run your application on, and forced your application to share busy CPUs with another process (e.g. a quad core system where two applications are forced to run on the same two cores... not too great). Also, what if you set the affinity mask, and the user runs your application twice? Again, bypassing the kernel's CPU scheduler forces your application to share busy CPUs with other processes while other CPUs sit idle.

2. Again, unless you have complete control over the system, assuming that the system you are on has a certain number of cores is never a good idea. At best, if you must set affinity masks, you can find out which logical processors reside on the same physical processor using GetLogicalProcessorInformation, or which belong to the same NUMA node using NUMA functions like GetNumaNodeProcessorMask, and make judgments from there (rather than using unreliable hacks like assuming certain bits correspond to certain processors). At the very least, you can get the system affinity mask and count the bits if you want to count the number of logical processors and make decisions based on that. Ignoring the fact that the above technique fundamentally makes no sense (as mentioned above), even if it did make sense, it only "works" (I use the term loosely) in 2 very specific situations: HT CPU with 2 logical processors, or HT CPU with 4 logical processors. In all other cases, if HT is not present, or if there are not 2 or 4 logical processors, the affinity mask is not set. Inconsistent logic like that is never a recipe for optimization.

3. For the vast majority of applications, the kernel will always do a better job than you at making judgments about what logical CPUs to schedule a process and its threads on. By not setting the affinity masks at all, you immediately make the best use of all available CPUs on any system, regardless of system configuration or other running applications. On the other hand, by setting the affinity masks, you lock yourself into very specific CPU configurations, and risk forcing yourself into far-less-than-optimal situations on machines with other processes running on them or with moderately complex layouts (e.g. Windows automatically schedules to logical CPUs on separate physical CPUs first in layouts where that makes sense, and in general can be assumed to schedule threads as optimally as possible given the system layout).
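The suggestion in point 2 of counting the bits in the system affinity mask could look roughly like this. It is a minimal sketch with hypothetical helper names (CountSetBits, CountLogicalProcessorsInSystemMask), and the Windows call is compiled only when _WIN32 is defined. On a system with more than 64 processors the mask describes a single processor group, so the count would not cover the whole machine.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Counts the 1 bits in an affinity mask, i.e. the number of logical
// processors the mask names.
inline unsigned CountSetBits(std::uint64_t mask)
{
    unsigned count = 0;
    while (mask != 0) {
        mask &= mask - 1;   // clear the lowest set bit
        ++count;
    }
    return count;
}

#ifdef _WIN32
// Returns the number of logical processors in the system affinity mask as
// seen by the current process, or 0 if the query fails.
inline unsigned CountLogicalProcessorsInSystemMask()
{
    DWORD_PTR processMask = 0, systemMask = 0;
    if (!GetProcessAffinityMask(GetCurrentProcess(), &processMask, &systemMask))
        return 0;
    return CountSetBits(static_cast<std::uint64_t>(systemMask));
}
#endif
```

Unlike the comparisons against 3 and 15 in the sample being criticized, this works for any processor count the mask can represent and makes no assumption about which bit belongs to which processor.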

While there are certainly some valid reasons for setting the process affinity mask, the "technique" given above is fundamentally flawed in both concept and implementation, and should be disregarded.

For those of you interested in actually learning more about the concept of physical and logical processors, multi-threading in general, and a bit about Hyper-threading before attempting such "optimizations", here is a decent article about what goes on under the hood: http://arstechnica.com/articles/paedia/cpu/hyperthreading.ars

I also suggest actually reading the MSDN article linked to in the above post. It's a great article. Note how it correctly describes the difference between physical and logical processors. Also note that the kernel takes the system configuration into account and makes optimal decisions for you (e.g. by attempting to schedule processes on logical CPUs that reside on different physical CPUs first -- and doing that only in HT environments where it is appropriate).

JC


dmex
vb.net syntax
<DllImport("kernel32.dll", CharSet:=CharSet.Auto, SetLastError:=True)> Public Shared Function SetProcessAffinityMask(ByVal handle As SafeProcessHandle, ByVal mask As IntPtr) As Boolean
End Function

dmex
C# syntax
[DllImport("kernel32.dll", CharSet=CharSet.Auto, SetLastError=true)]
public static extern bool SetProcessAffinityMask(SafeProcessHandle handle, IntPtr mask);

DednDave
Mask Bit Order and ASM Syntax

GreenCat states:
2. Which CPUs correspond to which bits in the affinity mask is not guaranteed by the Win32 API at all, and should not be relied on (e.g. it is never correct to assume that a certain processor type comes first in the bit mask).

This is not strictly correct. While MS does not provide documentation explaining how the masks are
created, there is a logical order to the mask core bits. The masks are created based on the tables
generated by BIOS at boot-time. What you may safely assume is that bits that correspond to the
cores of a specific physical processor are grouped together. For example, if bit 0 corresponds to
physical CPU package #0, any additional logical processors contained in that CPU package will be
represented in bit(s) 1, 2, 3, etc, until that package has been completely enumerated. This seems
to hold true for systems with 32 cores or less, as I have not had an opportunity to test code on a
system with more than 32 cores.

Altering affinity is necessary when using RDTSC to read time-stamp counter values or using CPUID
to enumerate and identify processor cores on multi-core machines. It may also be necessary when
reading or writing other MSRs that contain core-specific information. Upon completion of the desired
task, it is best to immediately restore affinity to its initial state.
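One way to make the "restore affinity to its initial state" step hard to forget is a small scope guard. The following C++ sketch is only an illustration under stated assumptions: the class ScopedAffinity and the function RunPinned are hypothetical names, the setter is injected as a callback so the restore logic can be exercised without Windows, and the wiring to GetProcessAffinityMask and SetProcessAffinityMask is compiled only when _WIN32 is defined.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical guard: applies `temporary` on construction and puts
// `original` back on destruction, but only if the first call succeeded.
class ScopedAffinity {
public:
    using Setter = std::function<bool(std::uint64_t)>;

    ScopedAffinity(std::uint64_t original, std::uint64_t temporary, Setter set)
        : original_(original), set_(std::move(set)), applied_(set_(temporary)) {}

    ~ScopedAffinity()
    {
        if (applied_)
            set_(original_);   // restore the initial mask
    }

    ScopedAffinity(const ScopedAffinity&) = delete;
    ScopedAffinity& operator=(const ScopedAffinity&) = delete;

    bool applied() const { return applied_; }

private:
    std::uint64_t original_;   // declared in initialization order
    Setter set_;
    bool applied_;
};

#ifdef _WIN32
// Hypothetical wiring: runs `work` with the current process limited to
// `temporary`, then restores the mask that was in effect before the call.
inline bool RunPinned(DWORD_PTR temporary, const std::function<void()>& work)
{
    HANDLE self = GetCurrentProcess();
    DWORD_PTR processMask = 0, systemMask = 0;
    if (!GetProcessAffinityMask(self, &processMask, &systemMask))
        return false;

    ScopedAffinity guard(processMask, temporary, [self](std::uint64_t m) {
        return SetProcessAffinityMask(self, static_cast<DWORD_PTR>(m)) != 0;
    });
    if (!guard.applied())
        return false;

    work();
    return true;
}
#endif
```

Because the restore happens in the destructor, the original mask is also put back when the work exits early or throws.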

ASM Syntax:
INVOKE SetProcessAffinityMask, hProcess, dwProcessAffinityMask



