Speed of Pascal CUDA8 1080Ti unified memory
Question: Thanks to the answers here yesterday, I think I now have a correct basic test of unified memory on a Pascal 1080Ti. It allocates a 50GB one-dimensional array and sums its elements. If I understand correctly, the test should be memory bound because the work is so simple (adding integers). However, it takes 24 seconds, which works out to about 2GB/s. When I run the CUDA8 bandwidthTest I see higher rates: 11.7GB/s pinned and 8.5GB/s pageable. Is there any way to get the test to run faster than 24 seconds? Here's