faster implementation of sum ( for Codility test )

鱼传尺愫 2021-02-04 11:47

How can the following simple implementation of sum be faster?

private long sum( int [] a, int begin, int end ) {
    if( a == null   ) {
        ret         


        
22 answers
  •  北恋 (OP)
    2021-02-04 12:44

    Probably the fastest you could get would be to keep your int array 16-byte aligned, stream 32 bytes into two __m128i variables (VC++), and call _mm_add_epi32 (again, an intrinsic available in VC++) on the chunks. Reuse one of the chunks as the accumulator and keep adding each new chunk into it; after the final chunk, extract its four ints and add them the old-fashioned way.
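A minimal sketch of the approach described above, in C++ because that is where these intrinsics live. This is not the answerer's code. The function name `sum_sse2`, the half-open range `[begin, end)`, and the use of unaligned loads are assumptions made for illustration. Note that `_mm_add_epi32` adds in 32-bit lanes and wraps on overflow, so this only matches a `long` accumulator while each lane's running sum stays within 32 bits.

```cpp
#include <emmintrin.h>  // SSE2: __m128i, _mm_add_epi32, _mm_loadu_si128, ...
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the original post): sums a[begin..end).
// Assumption: each 32-bit lane sum does not overflow; _mm_add_epi32 wraps.
std::int64_t sum_sse2(const std::int32_t* a, std::size_t begin, std::size_t end) {
    if (a == nullptr || begin >= end) return 0;

    __m128i acc = _mm_setzero_si128();  // four 32-bit partial sums
    std::size_t i = begin;

    // Unaligned loads keep the sketch valid for any begin offset.
    // With 16-byte aligned data, _mm_load_si128 could be used instead.
    for (; i + 4 <= end; i += 4) {
        __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        acc = _mm_add_epi32(acc, chunk);  // keep adding into the same register
    }

    // Extract the four lane sums and add them the ordinary scalar way.
    alignas(16) std::int32_t lanes[4];
    _mm_store_si128(reinterpret_cast<__m128i*>(lanes), acc);
    std::int64_t total = 0;
    for (int k = 0; k < 4; ++k) total += lanes[k];

    // Scalar tail for the 0 to 3 elements that did not fill a full chunk.
    for (; i < end; ++i) total += a[i];
    return total;
}
```

Whether this beats a plain loop depends on the compiler and flags, since optimizing compilers often auto-vectorize a simple summation loop on their own; measuring both is the only way to know.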

    The bigger question is why simple addition is a worthy candidate for optimization.

    Edit: I see it's mostly an academic exercise. Perhaps I'll give it a go tomorrow and post some results...
