Question:

Here is an excerpt from my main.cpp:

    Ptr<FastFeatureDetector> fastDetector = FastFeatureDetector::create(80, true);

    while (true) {
        Mat image = // get grayscale image 1280x720

        timer.start();
        fastDetector->detect(image, keypoints);
        myfile << "FAST\t" << timer.end() << endl; // timer.end() is how many seconds elapsed since last timer.start()

        keypoints.clear();

        timer.start();
        for (int i = 3; i < image.rows - 3; i++)
        {
            for (int j = 3; j < image.cols - 3; j++)
            {
                if (inspectPoint(image.data, image.cols, i, j)) {
                    // this block is never entered
                    KeyPoint keypoint(i, j, 3);
                    keypoints.push_back(keypoint);
                }
            }
        }
        myfile << "Custom\t" << timer.end() << endl;
        myfile << endl;
        myfile.flush();
        ...
    }

myfile contains:

    FAST    0.000515495
    Custom  0.00221361

    FAST    0.000485697
    Custom  0.00217653

    FAST    0.000490001
    Custom  0.00219044

    FAST    0.000484373
    Custom  0.00216329

    FAST    0.000561184
    Custom  0.00233214

So one would expect that inspectPoint() is a function that is actually doing something.

    bool inspectPoint(const uchar* img, int cols, int i, int j) {
        uchar p = img[i * cols + j];
        uchar pt = img[(i - 3)*cols + j];
        uchar pr = img[i*cols + j + 3];
        uchar pb = img[(i + 3)*cols + j];
        uchar pl = img[i*cols + j - 3];

        return cols < pt - pr + pb - pl + i; // just random check so that the optimizer doesn't skip any calculations
    }

I am using Visual Studio 2013 and the optimization is set to "Full Optimization (/Ox)".

As far as I know, the FAST algorithm goes through all pixels. I would not expect it to process each pixel faster than the function inspectPoint() does.

How is FAST detector so fast? Or rather, why is the nested loop so slow?

Answer 1:

From a quick browse of the source code, it looks like FastFeatureDetector is extensively optimized for SSE and OpenCL.

Neither SSE nor OpenCL is tied to one particular CPU model. SSE lets the CPU apply a single instruction (calculation) to multiple pieces of data simultaneously, so depending on the CPU's architecture the speedup can be as little as 2x or well beyond 4x. OpenCL can use the GPU, which can also give a major performance boost to certain image processing operations.
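
To illustrate the single-instruction, multiple-data idea, here is a minimal sketch. It is not OpenCV's actual FAST code. It counts the pixels brighter than a threshold, once with a plain loop and once with SSE2 intrinsics that test 16 8-bit pixels per instruction. The function names `countBrighterScalar` and `countBrighterSimd` are hypothetical, and the SSE2 path is only compiled when the compiler reports SSE2 support; otherwise the second function falls back to the scalar loop.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics
#define SKETCH_HAVE_SSE2 1
#endif

// Scalar reference: one comparison per pixel (hypothetical helper).
std::size_t countBrighterScalar(const std::uint8_t* px, std::size_t n,
                                std::uint8_t threshold) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        count += px[i] > threshold;
    return count;
}

// SIMD version: tests 16 pixels per iteration when SSE2 is available
// (hypothetical helper, for illustration only).
std::size_t countBrighterSimd(const std::uint8_t* px, std::size_t n,
                              std::uint8_t threshold) {
    std::size_t count = 0;
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE2
    const __m128i thr = _mm_set1_epi8(static_cast<char>(threshold));
    const __m128i zero = _mm_setzero_si128();
    for (; i + 16 <= n; i += 16) {
        // Unaligned load of 16 consecutive pixels.
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(px + i));
        // Unsigned saturating subtract: a lane is nonzero exactly when px > threshold.
        __m128i diff = _mm_subs_epu8(v, thr);
        // One bit per lane; a set bit marks a pixel that is NOT brighter.
        int notBrighter = _mm_movemask_epi8(_mm_cmpeq_epi8(diff, zero));
        unsigned mask = ~static_cast<unsigned>(notBrighter) & 0xFFFFu;
        // Count the set bits, one per brighter pixel.
        for (; mask != 0; mask &= mask - 1)
            ++count;
    }
#endif
    // Remaining pixels (or all of them when SSE2 is unavailable).
    for (; i < n; ++i)
        count += px[i] > threshold;
    return count;
}
```

Both functions return the same count; the SSE2 path just needs far fewer instructions per pixel, which is the kind of saving described above. The actual speedup depends on the CPU and on memory bandwidth.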

Source: OpenCV FAST detector
