Question
I am trying to find the bottleneck in my OpenCL kernel. Is it possible to profile OpenCL programs on Mac OS X? I found gDebugger on http://www.gremedy.com/, but it requires 10.5 or 10.6 to run. The AMD SDK supports only Linux and Windows.
Is there a profiler for Mountain Lion?
Answer 1:
How detailed must your profiling information be?
Is it okay to use the built-in internal profiler?
OpenCL queues can be created with the CL_QUEUE_PROFILING_ENABLE flag.
This way you can see, for each kernel you executed, when it was:
- Enqueued
- Submitted to your OpenCL device
- Started
- Ended
With the C++ bindings, creating the queue can look like this:
_queue = new cl::CommandQueue(_context, _device, CL_QUEUE_PROFILING_ENABLE );
Extracting the profiling information looks like this:
1) Save the event object (in an array) delivered by the enqueued kernel you want to profile.
cl::Event evt;
_queue->enqueueNDRangeKernel( _kernel, cl::NullRange, _range, cl::NullRange, NULL, &evt);
2) After the queue has finished executing, extract the profiling information:
std::vector<cl::Event> evts;
//add all events to this vector here
//cl::Event evt;
//_queue->enqueueNDRangeKernel( _kernel, cl::NullRange, _range, cl::NullRange, NULL, &evt);
//evts.push_back(evt);
uint64_t param;
for (unsigned int i=0; i<evts.size(); i++)
{
    evts[i].getProfilingInfo(CL_PROFILING_COMMAND_QUEUED, &param);
    printf("%u: %llu", i, param);
    evts[i].getProfilingInfo(CL_PROFILING_COMMAND_SUBMIT, &param);
    printf(" %llu", param);
    evts[i].getProfilingInfo(CL_PROFILING_COMMAND_START, &param);
    printf(" %llu\n", param);
    evts[i].getProfilingInfo(CL_PROFILING_COMMAND_END, &param);
    printf(" %llu\n", param);
}
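The four values printed per event are device time counters in nanoseconds, so the raw numbers are hard to read; the differences between them are usually what you want. Below is a minimal sketch of how they could be turned into durations. The `EventTimes` struct and the helper function names are hypothetical (they are not part of OpenCL or its C++ bindings); you would fill the struct with the values returned by getProfilingInfo.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical helper type (not part of OpenCL): the four profiling
// timestamps of one event, in nanoseconds of device time.
struct EventTimes {
    std::uint64_t queued;     // CL_PROFILING_COMMAND_QUEUED
    std::uint64_t submitted;  // CL_PROFILING_COMMAND_SUBMIT
    std::uint64_t started;    // CL_PROFILING_COMMAND_START
    std::uint64_t ended;      // CL_PROFILING_COMMAND_END
};

// Convert the interval [begin, end] from nanoseconds to milliseconds.
inline double nsToMs(std::uint64_t begin, std::uint64_t end) {
    assert(end >= begin);  // timestamps of one event should not go backwards
    return static_cast<double>(end - begin) / 1e6;
}

// Time the command waited in the host-side queue before submission.
inline double queueDelayMs(const EventTimes& t) {
    return nsToMs(t.queued, t.submitted);
}

// Time between submission to the device and the start of execution.
inline double launchDelayMs(const EventTimes& t) {
    return nsToMs(t.submitted, t.started);
}

// Time the command actually spent executing on the device.
inline double executionMs(const EventTimes& t) {
    return nsToMs(t.started, t.ended);
}

// Print one summary line per event.
inline void printSummary(unsigned int index, const EventTimes& t) {
    std::printf("%u: queue %.3f ms, launch %.3f ms, exec %.3f ms\n",
                index, queueDelayMs(t), launchDelayMs(t), executionMs(t));
}
```

For bottleneck hunting, the execution time (END minus START) is the kernel's own run time, while large queue or launch delays point at host-side overhead rather than the kernel itself.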
Source: https://stackoverflow.com/questions/12771567/is-there-an-opencl-profiler-for-mac-os-x-10-8