I know that devices before the Fermi architecture had 8 SPs (streaming processors) per multiprocessor. Is the count the same in the Fermi architecture?
Update of @AshwinNanjappa's answer for CUDA 7.5:
Compute Capability | Cores per multiprocessor |
---|---|
1.x | 8 |
2.0 | 32 |
2.1 | 48 |
3.x | 192 |
5.x | 128 |
Notes: this mapping can be found in `$CUDA_SAMPLES_DIR/common/inc/helper_cuda.h`.
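To use the table above in code, here is a minimal host-side sketch of a lookup. The names `SmCores` and `cores_per_sm` are hypothetical and not part of the CUDA toolkit. It covers only the compute capabilities listed in this answer and returns -1 for anything else. In a real program the major and minor versions would typically come from the `major` and `minor` fields of `cudaDeviceProp`, filled in by `cudaGetDeviceProperties`.

```cpp
// Hypothetical helper (not a CUDA API): maps a compute capability to the
// number of cores per multiprocessor, using only the values in the table above.
struct SmCores {
    int major;  // compute capability major version
    int minor;  // minor version, or -1 to match any minor ("x" in the table)
    int cores;  // cores per multiprocessor
};

// Returns the cores per multiprocessor, or -1 if the version is not listed.
inline int cores_per_sm(int major, int minor) {
    static const SmCores table[] = {
        {1, -1, 8},    // 1.x
        {2, 0, 32},    // 2.0 (Fermi)
        {2, 1, 48},    // 2.1 (Fermi)
        {3, -1, 192},  // 3.x
        {5, -1, 128},  // 5.x
    };
    for (const SmCores& entry : table) {
        const bool minor_matches = (entry.minor == -1) || (entry.minor == minor);
        if (entry.major == major && minor_matches) {
            return entry.cores;
        }
    }
    return -1;  // unknown compute capability
}
```

For example, `cores_per_sm(2, 0)` returns 32. Multiplying the result by the device's multiprocessor count (`multiProcessorCount` in `cudaDeviceProp`) gives the total number of cores.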