Best strategy for profiling memory usage of my code (open source) and 3rd party code (closed source)

悲&欢浪女 2020-12-15 12:07

I am soon going to be tasked with doing a proper memory profile of a code that is written in C/C++ and uses CUDA to take advantage of GPU processing.

My initial thou

7 Answers
  • 2020-12-15 12:41

    I believe this question has two very separate answers: one for C/C++ land, and a second for CUDA land.

    On the CPU:

    I've written my own replacements for new and delete; they were horribly slow and didn't help much. I've used TotalView, which I like for OpenMP debugging, but I agree it's very slow for memory debugging. I've never tried Valgrind, though I've heard similar things about it.
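
    For reference, here is a minimal sketch of that "replace new and delete" approach, assuming all you want is a running count of live heap bytes. The names (kHeader, g_live_bytes, report_live_bytes) are mine, not from any tool; the array forms operator new[]/delete[] fall back to these by default, but production-grade replacements need more care (alignment overloads, sized deletes, per-call-site bookkeeping):

        #include <atomic>
        #include <cstddef>
        #include <cstdio>
        #include <cstdlib>
        #include <new>

        // Header large enough to keep the returned pointer aligned for any type.
        static constexpr std::size_t kHeader = alignof(std::max_align_t);
        static std::atomic<std::size_t> g_live_bytes{0};

        void* operator new(std::size_t size) {
            void* raw = std::malloc(size + kHeader);
            if (!raw) throw std::bad_alloc();
            *static_cast<std::size_t*>(raw) = size;   // stash the size in front of the block
            g_live_bytes.fetch_add(size, std::memory_order_relaxed);
            return static_cast<char*>(raw) + kHeader;
        }

        void operator delete(void* ptr) noexcept {
            if (!ptr) return;
            void* raw = static_cast<char*>(ptr) - kHeader;
            g_live_bytes.fetch_sub(*static_cast<std::size_t*>(raw), std::memory_order_relaxed);
            std::free(raw);
        }

        // Call this wherever you want a snapshot, e.g. right before program exit.
        void report_live_bytes() {
            std::printf("live heap bytes: %zu\n", g_live_bytes.load());
        }

    Even this bare-bones version adds a header and an atomic update to every single allocation, which is a large part of why home-grown replacements get slow once you also start recording call stacks.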

    The only memory debugging tool I've encountered that is worth its salt is Intel Parallel Inspector's memory checker. Note: as a student, I was able to get an educational license on the cheap. That said, it's amazing. It took me twelve minutes to find a memory leak buried in half a million lines of code: I wasn't releasing a thrown error object which I caught and ignored. I like this piece of software so much that when my RAID failed and Win 7 ate my computer (think auto-update and RAID rebuild running simultaneously), I stopped everything and rebuilt the machine, because I knew it would take me less time to rebuild the dual boot (48 hours) than it would've taken to find the memory leak any other way. If you don't believe my outlandish claims, download an evaluation version.
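
    For what it's worth, that leak boils down to something like the pattern below (one possible reading; Error and mayFail are made-up names for illustration): an error object allocated with new, thrown by pointer, then caught and ignored without ever being deleted.

        struct Error { const char* msg; };

        void mayFail() {
            throw new Error{"something went wrong"};   // heap-allocated exception object
        }

        void caller() {
            try {
                mayFail();
            } catch (Error* e) {
                // Caught and ignored: without `delete e;` every failure leaks one Error.
                (void)e;
            }
        }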

    On the GPU:

    I think you're out of luck. For all memory issues in CUDA, I've essentially had to home-grow my own tools and wrappers around cudaMalloc etc. It isn't pretty. Nsight does buy you something, but at this point not much beyond a "here's how much you've allocated riiiight now." And on that sad note, almost every performance issue I've had with CUDA has been directly dependent on my memory access patterns (that, or my thread block size).
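
    Here is a minimal sketch of the kind of cudaMalloc/cudaFree wrapper I mean, assuming all you want is a running total plus a leak report at shutdown (trackedMalloc, trackedFree, and reportDeviceBytes are my own names, not part of the CUDA API):

        #include <cstdio>
        #include <map>
        #include <mutex>
        #include <cuda_runtime.h>

        static std::map<void*, std::size_t> g_device_allocs;  // device ptr -> bytes
        static std::size_t g_device_bytes = 0;
        static std::mutex g_device_mutex;

        // Drop-in wrapper: allocate as usual, but remember who owns how much.
        cudaError_t trackedMalloc(void** devPtr, std::size_t size) {
            cudaError_t err = cudaMalloc(devPtr, size);
            if (err == cudaSuccess) {
                std::lock_guard<std::mutex> lock(g_device_mutex);
                g_device_allocs[*devPtr] = size;
                g_device_bytes += size;
            }
            return err;
        }

        cudaError_t trackedFree(void* devPtr) {
            {
                std::lock_guard<std::mutex> lock(g_device_mutex);
                auto it = g_device_allocs.find(devPtr);
                if (it != g_device_allocs.end()) {
                    g_device_bytes -= it->second;
                    g_device_allocs.erase(it);
                }
            }
            return cudaFree(devPtr);
        }

        // Anything still recorded at shutdown is a leaked device buffer.
        void reportDeviceBytes() {
            std::lock_guard<std::mutex> lock(g_device_mutex);
            std::printf("live device bytes: %zu across %zu allocations\n",
                        g_device_bytes, g_device_allocs.size());
        }

    The same trick extends to other allocation entry points (cudaMallocPitch, cudaHostAlloc) if your code uses them, but at that point you are rebuilding bookkeeping that a real profiler ought to provide.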
