CUDA Blocks & Warps

孤独总比滥情好 2021-02-01 19:36

Ok, I know that related questions have been asked over and over again, and I read pretty much everything I found about this, but things are still unclear. Probably also because I

3 Answers
  •  广开言路
    2021-02-01 20:08

    One of the concepts that took a while to sink in, for me, is the efficiency of the hardware support for context-switching on the CUDA chip.

    Consequently, a context-switch occurs on every memory access, allowing calculations to proceed for many contexts alternately while the others wait on their memory accesses. One of the ways that GPGPU architectures achieve performance is the ability to parallelize this way, in addition to parallelizing on the multiple cores.

    Best performance is achieved when no core is ever waiting on a memory access, and that happens when there are just enough resident contexts to keep every core busy.
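
    To make the "enough contexts" idea concrete, here is a minimal sketch (not part of the original answer) that uses the CUDA runtime's occupancy API to report how many blocks, and therefore warps, of a given kernel can be resident on one multiprocessor. The `saxpy` kernel, the block size of 256, and device 0 are all assumptions chosen for illustration.

    ```cuda
    #include <cstdio>
    #include <cuda_runtime.h>

    // Hypothetical memory-bound kernel: each thread issues a load/store pair,
    // so the SM relies on switching to other resident warps while these
    // memory accesses are in flight.
    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    int main()
    {
        const int blockSize = 256;  // assumed block size

        // Ask the runtime how many blocks of this kernel fit per SM.
        // More resident blocks means more warps the scheduler can switch
        // between while others wait on their memory accesses.
        int blocksPerSM = 0;
        cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, saxpy,
                                                      blockSize, 0);

        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);

        int residentWarps = blocksPerSM * blockSize / prop.warpSize;
        printf("Resident blocks per SM: %d (%d warps available to hide latency, "
               "hardware maximum %d)\n",
               blocksPerSM, residentWarps,
               prop.maxThreadsPerMultiProcessor / prop.warpSize);
        return 0;
    }
    ```

    If the reported warp count is well below the hardware maximum, the SM may stall on memory accesses; pushing occupancy higher (smaller per-thread register/shared-memory footprint, or a different block size) gives the scheduler more contexts to switch to.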
