Precision in Sum reduction kernel with floats

日久生厌 2021-01-22 12:13

I am creating a routine that calls NVIDIA's sum reduction kernel (reduction6), but when I compare the CPU and GPU results I get an error that increases as the vec

1 Answer
  • 2021-01-22 12:31

    Floating-point addition is not, in general, associative: (a + b) + c can differ from a + (b + c), because each intermediate result is rounded.

    This means that when you change the order of operations of your floating-point summation, you may get different results. Parallelizing a summation by definition changes the order of operations of the summation.
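A minimal host-side sketch of that effect, assuming IEEE 754 single precision and no fast-math style compiler flags. The helper names sum_left and sum_right are made up for illustration and are not part of any NVIDIA sample:

```cpp
// Hypothetical helpers: the same three floats, grouped two different ways.

// Evaluates (a + b) + c, rounding after each addition.
float sum_left(float a, float b, float c) {
    float ab = a + b;
    return ab + c;
}

// Evaluates a + (b + c), rounding after each addition.
float sum_right(float a, float b, float c) {
    float bc = b + c;
    return a + bc;
}
```

With a = 1e8f, b = -1e8f, c = 1.0f, sum_left yields 1.0f, while sum_right yields 0.0f: near 1e8 adjacent floats are 8 apart, so -1e8f + 1.0f rounds back to -1e8f and the 1.0f is lost.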

    There are many ways to sum floating-point numbers, and each has accuracy benefits for different input distributions. Here's a decent survey.
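As one example of such a method, here is a sketch of Kahan (compensated) summation next to a plain accumulation loop. It is not taken from the survey or from the NVIDIA sample; the function names are illustrative, and it assumes the compiler is not allowed to reassociate floating-point operations (for example, no -ffast-math), since that would optimize the correction term away:

```cpp
#include <vector>

// Plain left-to-right accumulation in single precision.
float naive_sum(const std::vector<float>& x) {
    float s = 0.0f;
    for (float v : x) s += v;
    return s;
}

// Kahan summation: c carries the low-order bits that were lost
// when the previous term was added to the running sum s.
float kahan_sum(const std::vector<float>& x) {
    float s = 0.0f;
    float c = 0.0f;
    for (float v : x) {
        float y = v - c;   // apply the correction to the incoming term
        float t = s + y;   // low-order bits of y may be rounded away here
        c = (t - s) - y;   // recover what was rounded away (negated)
        s = t;
    }
    return s;
}
```

For the input {16777216.0f, 1.0f, 1.0f, 1.0f, 1.0f}, naive_sum stays at 16777216.0f because each added 1.0f is below the spacing of floats at 2^24, while kahan_sum returns the exact total, 16777220.0f.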

    Sequential summation in the given order is rarely the most accurate way to sum, so if that is what you are comparing against, don't expect it to compare well to the tree-based summation used in a typical parallel reduction.
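To see the difference on the host, the sketch below compares a sequential float loop, a recursive pairwise (tree) sum, and a double-precision reference. This is an illustration only: the function names are hypothetical, and the recursion does not reproduce the exact order of operations used by reduction6. It only shares the tree-shaped structure.

```cpp
#include <cstddef>

// Sequential summation in the given order, single precision.
float sequential_sum(const float* x, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += x[i];
    return s;
}

// Pairwise (tree) summation: sum each half, then add the two partial sums.
float pairwise_sum(const float* x, std::size_t n) {
    if (n == 0) return 0.0f;
    if (n == 1) return x[0];
    std::size_t half = n / 2;
    return pairwise_sum(x, half) + pairwise_sum(x + half, n - half);
}

// Reference: accumulate in double to get a more trustworthy baseline.
double reference_sum(const float* x, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += static_cast<double>(x[i]);
    return s;
}
```

For {16777216.0f, 1.0f, 1.0f, 1.0f}, sequential_sum returns 16777216.0f, pairwise_sum returns 16777218.0f, and reference_sum returns 16777219.0. Comparing both the CPU loop and the GPU result against a double-precision reference like this shows which of the two is actually further from the true sum.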
