Taking advantage of SSE and other CPU extensions


南笙 2021-02-04 06:41

There are a couple of places in my code base where the same operation is repeated a very large number of times for a large data set. In some cases it's taking a considerable tim

5 answers

忘了有多久 (original poster)

2021-02-04 07:12
For your second point there are several solutions as long as you can separate out the differences into different functions:
- plain old C function pointers
- dynamic linking (which generally relies on C function pointers)
- if you're using C++, having different classes that represent the support for different architectures and using virtual functions can help immensely with this.
Note that because you'd be relying on indirect function calls, the functions that abstract the different operations generally need to represent somewhat higher-level functionality. Otherwise the call overhead may cancel out whatever you gain from the optimized instructions. In other words, don't abstract the individual SSE operations; abstract the work you're doing.

Here's an example using function pointers:
```
typedef int (*scale_func_ptr)( int scalar, int* pData, int count);


int non_sse_scale( int scalar, int* pData, int count)
{
    // do whatever work needs done, without SSE so it'll work on older CPUs

    return 0;
}

int sse_scale( int scalar, int* pData, int count)
{
    // equivalent code, but uses SSE

    return 0;
}


// at initialization

scale_func_ptr scale_func = non_sse_scale;

if (useSSE) {
    scale_func = sse_scale;
}


// now, when you want to do the work:

scale_func( 12, theData_ptr, 512);  // this will call the routine that is tailored to SSE
                                    // if the CPU supports it, otherwise it calls the non-SSE
                                    // version of the function
```
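The skeleton above leaves the function bodies and the `useSSE` check open. One way to fill them in is sketched below, under several assumptions that are not from the answer: that "scale" means multiplying each element by `scalar`, that the compiler is GCC or Clang (for `__builtin_cpu_supports` and the `target` attribute), and that SSE4.1 is the extension being tested, because the packed 32-bit integer multiply `_mm_mullo_epi32` first appears there rather than in SSE/SSE2. The helper name `select_scale_func` is hypothetical. On other compilers or non-x86 targets the sketch simply falls back to the plain loop.

```
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC or Clang on x86; anything else uses only the fallback. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

typedef int (*scale_func_ptr)(int scalar, int* pData, int count);

/* Portable version: multiplies every element by scalar (assumed meaning of "scale"). */
static int non_sse_scale(int scalar, int* pData, int count)
{
    assert(pData != NULL || count == 0);
    for (int i = 0; i < count; ++i)
        pData[i] *= scalar;
    return 0;
}

#if HAVE_X86_DISPATCH
/* Same work, four ints per iteration. The target attribute enables SSE4.1
   for this one function, so the rest of the file can still be built for
   older CPUs. */
__attribute__((target("sse4.1")))
static int sse_scale(int scalar, int* pData, int count)
{
    assert(pData != NULL || count == 0);
    const __m128i factor = _mm_set1_epi32(scalar);
    int i = 0;
    for (; i + 4 <= count; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i*)(pData + i)); /* unaligned load */
        v = _mm_mullo_epi32(v, factor);                           /* SSE4.1 multiply */
        _mm_storeu_si128((__m128i*)(pData + i), v);
    }
    for (; i < count; ++i)   /* leftover elements when count is not a multiple of 4 */
        pData[i] *= scalar;
    return 0;
}
#endif

/* Hypothetical helper: call once at initialization and keep the result. */
static scale_func_ptr select_scale_func(void)
{
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1"))
        return sse_scale;
#endif
    return non_sse_scale;
}
```

With this in place, initialization becomes `scale_func_ptr scale_func = select_scale_func();`, and every later call pays one indirect call per whole array rather than one per element, which is the granularity the answer recommends.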