I tried MATLAB's convolution functions conv2 and convn with gpuArray. For example, convn(gpuArray.rand(100,100,10,'single'),gpuArray.rand(5,'single')) and compared it to the c
GPU performance in your test case is limited by the small array sizes, [100x100x10] and [5x5]: there is too little work per call to keep the GPU busy.
The actual speedup also depends on the GPU and CPU models. For your data size (test case 2 below), I get a speedup of roughly 2.8x with a Tesla M2090 GPU against a Xeon E5-2609 CPU.
I used the following MATLAB test code:
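m=1000;   % image side length (each input is m x m x 10)
n=100;    % number of timed iterations
k=5;      % kernel side length (k x k)

% GPU: one warm-up call, then n timed calls
gc=convn(gpuArray.rand(m,m,10,'single'),gpuArray.rand(k,'single'));
tic;
for i=1:n
    gc=convn(gpuArray.rand(m,m,10,'single'),gpuArray.rand(k,'single'));
end
toc

% CPU: one warm-up call, then n timed calls
c=convn(rand(m,m,10,'single'),rand(k,'single'));
tic;
for i=1:n
    c=convn(rand(m,m,10,'single'),rand(k,'single'));
end
toc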
In each result below, the first elapsed time is the GPU loop and the second is the CPU loop.
With m=1000; n=100; k=5; the GPU gives a very good speedup (11.6x):
Elapsed time is 2.367453 seconds.
Elapsed time is 27.502952 seconds.
But with m=100; n=1000; k=5; the speedup is only about 2.8x:
Elapsed time is 1.206053 seconds.
Elapsed time is 3.330559 seconds.
With m=100; n=1000; k=14; the larger kernel improves the speedup to 4.84x:
Elapsed time is 2.804957 seconds.
Elapsed time is 13.585698 seconds.
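One caveat about the numbers above: the timed loops also include the random-number generation, and since gpuArray operations can run asynchronously, tic/toc without a wait(gpuDevice) may stop the clock before the GPU has finished. A minimal sketch of a tighter measurement follows, assuming the Parallel Computing Toolbox is available so that gputimeit can be used. The variable names (A, K, Ag, Kg, tCpu, tGpu) are mine and not part of the test code above, and I have not rerun the timings this way, so the speedups may differ.

```matlab
m = 100;  % image side length
k = 5;    % kernel side length

% Build the inputs once, outside the timed region,
% and use the same data on the CPU and the GPU.
A  = rand(m, m, 10, 'single');
K  = rand(k, 'single');
Ag = gpuArray(A);
Kg = gpuArray(K);

% timeit and gputimeit run the function several times and
% return a typical time; gputimeit waits for the GPU to finish.
tCpu = timeit(@() convn(A, K));
tGpu = gputimeit(@() convn(Ag, Kg));

fprintf('CPU %.6f s, GPU %.6f s, speedup %.2fx\n', tCpu, tGpu, tCpu / tGpu);
```

Changing m and k here should show the same trend as above: the larger the arrays and kernels, the better the GPU compares.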