Precision in Sum reduction kernel with floats

I am creating a routine that calls Nvidia's sum reduction kernel (reduction6), but when I compare the results between the CPU and the GPU I get an error that increases as the vec

1 Answer

慢半拍i

2021-01-22 12:31

Floating-point addition is not necessarily associative.

This means that when you change the order of operations of your floating-point summation, you may get different results. Parallelizing a summation by definition changes the order of operations of the summation.
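
To make that concrete, here is a small host-side C++ sketch (not part of the reduction6 sample; the helper names `sum_left`, `sum_right` and `show_grouping` are made up for illustration). The inputs are picked so that the spacing between adjacent floats near 1e8, which is 8, swallows the 1.0f in one grouping but not in the other:

```cpp
#include <cstdio>

// Hypothetical helpers for illustration: the same three values, grouped two ways.
float sum_left(float a, float b, float c)  { return (a + b) + c; }
float sum_right(float a, float b, float c) { return a + (b + c); }

// Prints both groupings for a = 1e8f, b = -1e8f, c = 1.0f.
void show_grouping() {
    std::printf("(a + b) + c = %g\n", sum_left(1e8f, -1e8f, 1.0f));
    std::printf("a + (b + c) = %g\n", sum_right(1e8f, -1e8f, 1.0f));
}
```

Assuming IEEE 754 single precision with the default rounding mode and no fast-math style compiler flags, the first grouping gives 1 and the second gives 0, because -1e8f + 1.0f rounds back to -1e8f.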

There are many ways to sum floating-point numbers, and each has accuracy benefits for different input distributions. Here's a decent survey.
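
As one example of such an alternative, below is a minimal sketch of Kahan (compensated) summation in host-side C++. It is chosen here purely for illustration and is not necessarily the method the survey favours; the function name `kahan_sum` is hypothetical. The idea is to carry the rounding error of each addition in a second variable and feed it back into the next one.

```cpp
#include <cstddef>

// Kahan (compensated) summation: a sketch, assuming IEEE 754 floats and
// no fast-math optimisation (which may legally remove the compensation).
float kahan_sum(const float* x, std::size_t n) {
    float sum = 0.0f;  // running total
    float c   = 0.0f;  // running compensation for lost low-order bits
    for (std::size_t i = 0; i < n; ++i) {
        float y = x[i] - c;   // apply the correction from the previous step
        float t = sum + y;    // low-order bits of y may be lost here
        c = (t - sum) - y;    // recover (the negative of) what was lost
        sum = t;
    }
    return sum;
}
```

Note that flags such as `-ffast-math` allow the compiler to reassociate the arithmetic and simplify `c` to zero, which would turn this back into a plain sequential sum.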

Sequential summation in the given order is rarely the most accurate way to sum, so if that is what you are comparing against, don't expect it to compare well to the tree-based summation used in a typical parallel reduction.
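
The sketch below compares the two orders on the host, with a double-precision accumulator as a more trustworthy reference. All three function names are hypothetical, and the recursive halving in `pairwise_sum` is only a stand-in for a tree-shaped reduction; it is not the exact order in which reduction6 combines elements.

```cpp
#include <cstddef>

// Sequential float accumulation, as in a naive CPU loop.
float sequential_sum(const float* x, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) acc += x[i];
    return acc;
}

// Pairwise (tree-shaped) summation: sum each half, then add the halves.
float pairwise_sum(const float* x, std::size_t n) {
    if (n == 0) return 0.0f;
    if (n == 1) return x[0];
    std::size_t half = n / 2;
    return pairwise_sum(x, half) + pairwise_sum(x + half, n - half);
}

// Reference: same sequential order, but accumulated in double.
double reference_sum(const float* x, std::size_t n) {
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) acc += static_cast<double>(x[i]);
    return acc;
}
```

For the eight-element input {16777216, 1, 1, 1, 1, 1, 1, 1} (that is, 2^24 followed by seven ones), the exact sum is 16777223. Assuming IEEE 754 round-to-nearest-even, `sequential_sum` stays stuck at 16777216 because each added 1 is rounded away, while `pairwise_sum` returns 16777222. When validating a GPU reduction, comparing both float results against the double reference gives a fairer picture than treating the sequential float sum as ground truth.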
