I know that devices before the Fermi architecture had 8 SPs in a single multiprocessor. Is the count the same in the Fermi architecture?
Update of @AshwinNanjappa's answer for CUDA 7.5:
The answer depends on the Compute Capability property of the CUDA device. The numbers are:
Compute Capability | # Cores |
---|---|
1.x: | 8 |
2.0: | 32 |
2.1: | 48 |
3.x: | 192 |
5.x: | 128 |
Notes:
See appendix G of the CUDA C Programming Guide.
See also $CUDA_SAMPLES_DIR/common/inc/helper_cuda.h.
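As a rough sketch, the table above can be written as a small lookup function. The name `cores_per_multiprocessor` is hypothetical (it is not a CUDA API), and returning -1 for capabilities the table does not list is an assumption made for this illustration only.

```cpp
// Cores per multiprocessor for a compute capability major.minor,
// using only the values from the table above (CUDA 7.5 era).
// Hypothetical helper, not part of CUDA; returns -1 for unlisted capabilities.
int cores_per_multiprocessor(int major, int minor) {
    switch (major) {
        case 1:
            return 8;                    // 1.x
        case 2:
            if (minor == 0) return 32;   // 2.0
            if (minor == 1) return 48;   // 2.1
            return -1;                   // other 2.y values are not in the table
        case 3:
            return 192;                  // 3.x
        case 5:
            return 128;                  // 5.x
        default:
            return -1;                   // not covered by the table
    }
}
```

For example, `cores_per_multiprocessor(2, 1)` gives 48, which is the Fermi 2.1 row.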
The number of multiprocessors (MP) and the number of cores per MP can be found by running DeviceQuery.exe. It is found in the %NVSDKCOMPUTE_ROOT%/C/bin directory of the GPU Computing SDK installation.
A look at the code of DeviceQuery (found in %NVSDKCOMPUTE_ROOT%/C/src/DeviceQuery) reveals that the number of cores is calculated by passing the x.y CUDA Capability numbers to the ConvertSMVer2Cores utility function.
The code of ConvertSMVer2Cores shows this relationship between the capability and the core count per MP:
Capability | Cores |
---|---|
10 | 8 |
11 | 8 |
12 | 8 |
13 | 8 |
20 | 32 |
21 | 48 |
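To make the arithmetic concrete, here is a minimal table-driven sketch that multiplies the per-MP core count by the number of multiprocessors. It is an illustration, not the SDK's source: the names `CapabilityCores`, `lookup_cores`, and `total_cores` are hypothetical, and reading the two-digit keys as `major * 10 + minor` is an assumption about how the table above is written.

```cpp
#include <cstddef>

// One row of the capability table above.
struct CapabilityCores {
    int capability;  // two-digit key, assumed here to be major * 10 + minor
    int cores;       // cores per multiprocessor
};

// Values copied from the table above; nothing beyond it is filled in.
static const CapabilityCores kCapabilityTable[] = {
    {10, 8}, {11, 8}, {12, 8}, {13, 8}, {20, 32}, {21, 48},
};

// Hypothetical helper: cores per multiprocessor, or -1 if the capability
// is not listed in the table.
int lookup_cores(int major, int minor) {
    const int code = major * 10 + minor;
    const std::size_t rows = sizeof(kCapabilityTable) / sizeof(kCapabilityTable[0]);
    for (std::size_t i = 0; i < rows; ++i) {
        if (kCapabilityTable[i].capability == code) {
            return kCapabilityTable[i].cores;
        }
    }
    return -1;
}

// Total cores on a device = multiprocessor count * cores per multiprocessor.
// Returns -1 when the capability is not in the table.
int total_cores(int multiprocessor_count, int major, int minor) {
    const int per_mp = lookup_cores(major, minor);
    return per_mp < 0 ? -1 : multiprocessor_count * per_mp;
}
```

For a hypothetical device with 14 multiprocessors of capability 2.0, `total_cores(14, 2, 0)` gives 14 × 32 = 448.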