CUDA: Why are bitwise operators sometimes faster than logical operators?

When I am down to squeezing the last bit of performance out of a kernel, I usually find that replacing the logical operators (&& and

3 Answers

北海茫月

2021-01-11 18:25
A && B:
```
if (!A) {
  return 0;
}
if (!B) {
  return 0;
}
return 1;
```
A & B:
```
return A & B;
```
These semantics matter because evaluating A and B can have side effects: either one may be a function call that alters the state of the system. With &&, B is not evaluated at all when A is false; with &, both operands are always evaluated.

There are many ways that the compiler can optimize the A && B case, depending on the types of A and B and the context.
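To make the difference in evaluation concrete, here is a minimal host-side C++ sketch (the operators behave the same way in CUDA C++ device code). The `probe` helper and the `calls` counter are hypothetical names introduced only for illustration: `probe` counts how often it is invoked, so the number of operand evaluations becomes observable.

```cpp
// Hypothetical counter: how many times probe() has been called.
static int calls = 0;

// Hypothetical helper with a side effect: bumps the counter, returns its argument.
bool probe(bool value) {
    ++calls;
    return value;
}

// Logical AND: the right operand is evaluated only if the left one is true.
int calls_with_logical_and(bool a, bool b) {
    calls = 0;
    bool result = probe(a) && probe(b);
    (void)result;  // only the number of evaluations matters here
    return calls;
}

// Bitwise AND: both operands are always evaluated.
int calls_with_bitwise_and(bool a, bool b) {
    calls = 0;
    bool result = probe(a) & probe(b);
    (void)result;
    return calls;
}
```

With `a == false`, the `&&` version calls `probe` once and the `&` version calls it twice. When the operands have no side effects, the compiler is free to treat both forms identically, so the two may compile to the same instructions.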
無奈伤痛

2021-01-11 18:28

Logical operators will often result in branches, particularly when the rules of short-circuit evaluation need to be observed. On ordinary CPUs this can mean branch misprediction, and in CUDA it can mean warp divergence. Bitwise operations do not require short-circuit evaluation, so the code flow is linear (i.e. branchless).
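As a rough illustration of the branchless form, the sketch below counts array elements that fall in a half-open range in two ways. The function names and the range-check task are assumptions chosen for the example, and it is written as host-side C++ rather than a kernel. Whether the `&&` version actually produces a branch depends on the compiler and target, so this shows only the source-level pattern.

```cpp
#include <cstddef>

// Short-circuit form: the second comparison is skipped when the first fails,
// which a compiler may (but need not) implement with a branch.
std::size_t count_in_range_logical(const int* data, std::size_t n, int lo, int hi) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (data[i] >= lo && data[i] < hi) {
            ++count;
        }
    }
    return count;
}

// Bitwise form: both comparisons always run and their bool results are
// combined with &, so no control flow depends on the first comparison.
std::size_t count_in_range_bitwise(const int* data, std::size_t n, int lo, int hi) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        count += static_cast<std::size_t>((data[i] >= lo) & (data[i] < hi));
    }
    return count;
}
```

The substitution is only safe because both operands are already `bool` and have no side effects. On arbitrary integers `&` and `&&` differ: `1 & 2` is `0`, while `1 && 2` is `true`.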

时光说笑

2021-01-11 18:31

Bitwise operations can be carried out directly in registers at the hardware level. Register operations are the fastest, especially when the data fits in a register. Logical operations involve expression evaluation, which may not be register-bound. Operators such as &, |, ^ and >> are typically among the fastest operations and are widely used in high-performance code.
