How come my array index is faster than pointer

后端未结

关注

 10  586

青春惊慌失措

Why the array index is faster than pointer? Isn\'t pointer supposed to be faster than array index?

** i used time.h clock_t to tested two functions, each loop 2 mil

相关标签:

10条回答

爱一瞬间的悲伤

2020-12-17 17:01

I would suggest running each loop 200 million times, and then run each loop 10 times, and take the fastest measurement. That will factor out effects from OS scheduling and so on.

I would then suggest you disassemble the code for each loop.

0 讨论(0)
发布评论:

提交评论
- 加载中...
隐瞒了意图╮

2020-12-17 17:03

It may be the comparison in the for loop that is causing the difference. The termination condition is tested on each iteration, and your "pointer" example has a slightly more complicated termination condition (taking the address of &a[size]). Since &a[size] does not change, you could try setting it to a variable to avoid recalculating it on each iteration of the loop.

0 讨论(0)
发布评论:

提交评论
- 加载中...

孤街浪徒

2020-12-17 17:03

Access the data through array index or pointer is exactly equivalent. Go through the below program with me...

There are a loop which continues to 100 times but when we see disassemble code that there are the data which we access through has least instruction comparability to access through array Index

But it doesn't mean that accessing data through pointer is fast actually it's depend on the instruction which performed by compiler.Both the pointer and array index used the address array access the value from offset and increment through it and pointer has address.

int a[100];
fun1(a,100);
fun2(&a[0],5);
}
void fun1(int a[],int n)
{
int i;
for(i=0;i<=99;i++)
{
a[i]=0;
printf("%d\n",a[i]);
}
}
void fun2(int *p,int n)
{
int i;
for(i=0;i<=99;i++)
{
*p=0;
printf("%d\n",*(p+i));
}
}


disass fun1
Dump of assembler code for function fun1:
   0x0804841a <+0>: push   %ebp
   0x0804841b <+1>: mov    %esp,%ebp
   0x0804841d <+3>: sub    $0x28,%esp`enter code here`
   0x08048420 <+6>: movl   $0x0,-0xc(%ebp)
   0x08048427 <+13>:    jmp    0x8048458 <fun1+62>
   0x08048429 <+15>:    mov    -0xc(%ebp),%eax
   0x0804842c <+18>:    shl    $0x2,%eax
   0x0804842f <+21>:    add    0x8(%ebp),%eax
   0x08048432 <+24>:    movl   $0x0,(%eax)
   0x08048438 <+30>:    mov    -0xc(%ebp),%eax
   0x0804843b <+33>:    shl    $0x2,%eax
   0x0804843e <+36>:    add    0x8(%ebp),%eax
   0x08048441 <+39>:    mov    (%eax),%edx
   0x08048443 <+41>:    mov    $0x8048570,%eax
   0x08048448 <+46>:    mov    %edx,0x4(%esp)
   0x0804844c <+50>:    mov    %eax,(%esp)
   0x0804844f <+53>:    call   0x8048300 <printf@plt>
   0x08048454 <+58>:    addl   $0x1,-0xc(%ebp)
   0x08048458 <+62>:    cmpl   $0x63,-0xc(%ebp)
   0x0804845c <+66>:    jle    0x8048429 <fun1+15>
   0x0804845e <+68>:    leave  
   0x0804845f <+69>:    ret    
End of assembler dump.
(gdb) disass fun2
Dump of assembler code for function fun2:
   0x08048460 <+0>: push   %ebp
   0x08048461 <+1>: mov    %esp,%ebp
   0x08048463 <+3>: sub    $0x28,%esp
   0x08048466 <+6>: movl   $0x0,-0xc(%ebp)
   0x0804846d <+13>:    jmp    0x8048498 <fun2+56>
   0x0804846f <+15>:    mov    0x8(%ebp),%eax
   0x08048472 <+18>:    movl   $0x0,(%eax)
   0x08048478 <+24>:    mov    -0xc(%ebp),%eax
   0x0804847b <+27>:    shl    $0x2,%eax
   0x0804847e <+30>:    add    0x8(%ebp),%eax
   0x08048481 <+33>:    mov    (%eax),%edx
   0x08048483 <+35>:    mov    $0x8048570,%eax
   0x08048488 <+40>:    mov    %edx,0x4(%esp)
   0x0804848c <+44>:    mov    %eax,(%esp)
   0x0804848f <+47>:    call   0x8048300 <printf@plt>
   0x08048494 <+52>:    addl   $0x1,-0xc(%ebp)
   0x08048498 <+56>:    cmpl   $0x63,-0xc(%ebp)
   0x0804849c <+60>:    jle    0x804846f <fun2+15>
   0x0804849e <+62>:    leave  
   0x0804849f <+63>:    ret    
End of assembler dump.
(gdb)

0 讨论(0)

再見小時候

2020-12-17 17:06
Oops, on my 64-bit system results are quite different. I've got that this
```
 int i;

 for(i = 0; i < size; i++)
 {
     *(a+i) = 0;
 }
```
is about 100 times !! slower than this
```
 int i;
 int * p = a;

 for(i = 0; i < size; i++)
 {
     *(p++) = 0;
 }
```
when compiling with -O3. This hints to me that somehow moving to next address is far easier to achieve for 64-bit cpu, than to calculate destination address from some offset. But i'm not sure.

EDIT:
This really has something related with 64-bit architecture because same code with same compile flags doesn't shows any real difference in performance on 32-bit system.
0 讨论(0)
发布评论:

提交评论
- 加载中...
爱一瞬间的悲伤

2020-12-17 17:06

This is a very hard thing to time, because compilers are very good at optimising these things. Still it's better to give the compiler as much information as possible, that's why in this case I'd advise using std::fill, and let the compiler choose.

But... If you want to get into the detail

a) CPU's normally give pointer+value for free, like : mov r1, r2(r3).
b) This means an index operation requires just : mul r3,r1,size
This is just one cycle extra, per loop.
c) CPU's often provide stall/delay slots, meaning you can often hide single-cycle operations.

All in all, even if your loops are very large, the cost of the access is nothing compared to the cost of even a few cache-misses. You are best advised to optimise your structures before you care about loop costs. Try for example, packing your structures to reduce the memory footprint first

0 讨论(0)
发布评论:

提交评论
- 加载中...
青春惊慌失措

2020-12-17 17:07

It looks like the index solution can save a few instructions with the compare in the for loop.

0 讨论(0)
发布评论:

提交评论
- 加载中...

1 2 下一页