floating-point-precision

What is a good way to round double-precision values to a (somewhat) lower precision?

Submitted by 夙愿已清 on 2019-12-07 00:45:29
Question: My problem is that I have to use a third-party function/algorithm which takes an array of double-precision values as input, but apparently can be sensitive to very small changes in the input data. However, for my application I have to get identical results for inputs that are (almost) identical! In particular, I have two test input arrays which are identical up to the 5th position after the decimal point, and still I get different results. So what causes the "problem" must be after the 5th

OpenCL speed and float point precision

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-06 04:05:21
I have just started working with OpenCL. However, I have found some weird behavior in OpenCL which I can't understand. The source I built and tested was http://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism . I have an ATI Radeon HD 4770 and an AMD FX 6200 3.8 GHz 6-core CPU. Speed: First, the speed does not scale linearly with the maximum number of work-group items. I ran App Profiler to analyze the time spent during kernel execution. The result was a bit shocking: my GPU, which can only handle 256 work items per group, used 2.23008 milliseconds to calculate square of

simple floating-point numbers lose precision

Submitted by 别等时光非礼了梦想. on 2019-12-06 03:42:46
Question: I'm using Delphi XE2 Update 3. There are precision issues with even the simplest of floating-point numbers (like 3.7). Given this code (a 32-bit console app): program Project1; {$APPTYPE CONSOLE} {$R *.res} uses System.SysUtils; var s: Single; d: Double; x: Extended; begin Write('Size of Single ----- '); Writeln(SizeOf(Single)); Write('Size of Double ----- '); Writeln(SizeOf(Double)); Write('Size of Extended --- '); Writeln(SizeOf(Extended)); Writeln; s := 3.7; d := 3.7; x := 3.7; Write('"s"

Print __float128, without using quadmath_snprintf

Submitted by 老子叫甜甜 on 2019-12-06 03:15:57
In my question about Analysis of float/double precision in 32 decimal digits, one answer said to take a look at __float128. I used it and the compiler could find it, but I cannot print it, since the compiler cannot find the header quadmath.h. So my questions are: __float128 is standard, correct? How to print it? Isn't quadmath.h standard? These answers did not help: Use extern C Precision in C++ Printing The ref also did not help. Note that I do not want to use any non-standard library. [EDIT] It would also be useful if that question had an answer, even if the answer was a negative one.

Java float is more precise than double?

Submitted by 情到浓时终转凉″ on 2019-12-06 02:22:20
Question: Code: class Main { public static void main (String[] args) { System.out.print("float: "); System.out.println(1.35f-0.00026f); System.out.print("double: "); System.out.println(1.35-0.00026); } } Output: float: 1.34974 double: 1.3497400000000002 The float got the right answer, but the double is adding extra stuff from nowhere. Why? Isn't double supposed to be more precise than float? Answer 1: A float is 4 bytes wide, whereas a double is 8 bytes wide. Check What Every Computer Scientist Should Know
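
The float-versus-double comparison in the question above can be examined more closely by printing the exact stored values instead of the shortest decimal string. The following is a minimal sketch in Python rather than Java; the helper `to_f32` is a hypothetical name introduced here, and it emulates 32-bit arithmetic by rounding through `struct`, which is an assumption about how to mimic Java's `float` and not part of the original post.

```python
import struct
from decimal import Decimal


def to_f32(x: float) -> float:
    """Round a Python float (IEEE binary64) to the nearest binary32 value.

    Hypothetical helper: packs the value as a 4-byte float and unpacks it.
    """
    return struct.unpack("f", struct.pack("f", x))[0]


# Emulate the float computation: round both operands and the result to binary32.
f32_result = to_f32(to_f32(1.35) - to_f32(0.00026))

# The double computation uses Python's native binary64 floats.
f64_result = 1.35 - 0.00026

# Decimal arithmetic on the decimal literals gives the exact answer 1.34974.
exact = Decimal("1.35") - Decimal("0.00026")

# Decimal(float) converts the stored binary value exactly, with no rounding.
f32_error = Decimal(f32_result) - exact
f64_error = Decimal(f64_result) - exact

print("float  stored value:", Decimal(f32_result))
print("double stored value:", Decimal(f64_result))
print("float  error:", f32_error)
print("double error:", f64_error)
```

Neither result is exactly 1.34974. The double is closer to the true value; the float only looks "right" because fewer digits are needed to distinguish it from its neighbouring 32-bit values, so the printed string is shorter.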

Why does g++ (4.6 and 4.7) promote the result of this division to a double? Can I stop it?

Submitted by ↘锁芯ラ on 2019-12-05 13:59:24
I was writing some templated code to benchmark a numeric algorithm using both floats and doubles, in order to compare against a GPU implementation. I discovered that my floating-point code was slower, and after investigating with VTune Amplifier from Intel I discovered that g++ was generating extra x86 instructions (cvtps2pd/cvtpd2ps and unpcklps/unpcklpd) to convert some intermediate results from float to double and then back again. The performance degradation is almost 10% for this application. After compiling with the flag -Wdouble-promotion (which BTW is not included with -Wall or -Wextra)

Increasing floating point precision in Python

Submitted by 南楼画角 on 2019-12-05 07:35:23
I was working on a project to compute the Leibniz approximation for pi with the below code: def pi(precision): sign = True ret = 0 for i in range(1,precision+1): odd = 2 * i - 1 if sign: ret += 1.0 / odd else: ret -= 1.0 / odd sign = not sign return ret However, the output value was always 12 digits long. How can I increase the precision (e.g. more digits) of the calculation? Does Python support more precise floating points, or will I have to use some external library? Answer (albusshin): Try using Decimal. Read Arbitrary-precision elementary mathematical functions (Python) original for more
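
The suggestion to use Decimal might look roughly like the sketch below. It is not the answerer's code: the function name `leibniz_pi` and its `digits` parameter are hypothetical, and unlike the snippet in the question it multiplies the sum by 4 so that the result approximates pi itself.

```python
from decimal import Decimal, localcontext


def leibniz_pi(terms: int, digits: int = 30) -> Decimal:
    """Approximate pi with `terms` terms of the Leibniz series.

    `digits` sets the number of significant decimal digits carried by the
    arithmetic; it does not make the series converge any faster.
    """
    with localcontext() as ctx:
        ctx.prec = digits  # precision applies only inside this block
        total = Decimal(0)
        sign = 1
        for i in range(terms):
            # Terms are +1/1, -1/3, +1/5, ...
            total += Decimal(sign) / Decimal(2 * i + 1)
            sign = -sign
        return 4 * total


print(leibniz_pi(1000))
```

Carrying more digits only removes rounding error from the arithmetic. The Leibniz series itself converges slowly (the error after N terms is on the order of 1/N), so the extra digits are only meaningful once the number of terms is large as well.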

What is a good way to round double-precision values to a (somewhat) lower precision?

Submitted by 痴心易碎 on 2019-12-05 05:05:33
My problem is that I have to use a third-party function/algorithm which takes an array of double-precision values as input, but apparently can be sensitive to very small changes in the input data. However, for my application I have to get identical results for inputs that are (almost) identical! In particular, I have two test input arrays which are identical up to the 5th position after the decimal point, and still I get different results. So what causes the "problem" must be after the 5th position after the decimal point. Now my idea was to round the input to a slightly lower precision in
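
One simple way to try the rounding idea described above is to round every input to a fixed number of decimal places before calling the sensitive function. This is a minimal sketch in Python, not the asker's solution; `round_array` is a hypothetical helper, and the choice of 5 decimal places is taken from the question.

```python
def round_array(values, decimals=5):
    """Return a new list with each value rounded to `decimals` decimal places."""
    return [round(v, decimals) for v in values]


# Two inputs that agree up to the 5th decimal place become identical.
a = [1.2345601, 2.5000049]
b = [1.2345602, 2.5000048]
print(round_array(a) == round_array(b))

# Caveat: values on opposite sides of a rounding boundary still differ.
c = [0.1234549]
d = [0.1234551]
print(round_array(c), round_array(d))
```

Rounding of any kind only moves the problem: two inputs that are arbitrarily close can still land on different sides of a rounding boundary, so this reduces how often near-identical inputs diverge but cannot guarantee identical results.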

Detect FPU rounding mode on a GPU

Submitted by 余生长醉 on 2019-12-04 20:14:34
I was delving into multi-precision arithmetic, and there is a nice, fast class of algorithms described in Jonathan Richard Shewchuk, "Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates", 1997, Discrete & Computational Geometry, pages 305–363. However, these algorithms rely on the FPU using round-to-even tiebreaking. On a CPU this would be easy: one would just check or set the FPU state word and be sure. However, there is no such instruction (yet?) for GPU programming. That is why I was wondering if there is a dependable way of detecting (not setting) the
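
One possible way to detect round-to-even tiebreaking without reading any state word is to probe it with arithmetic whose exact result is a tie. The sketch below shows the idea on the CPU in Python with IEEE doubles; the function name is hypothetical, and porting the same two additions to a GPU kernel (and making sure the compiler does not constant-fold them) is an assumption that is not verified here.

```python
def rounds_ties_to_even() -> bool:
    """Probe whether double-precision addition rounds ties to even."""
    eps = 2.0 ** -52   # spacing between adjacent doubles just above 1.0
    half = 2.0 ** -53  # exactly half of that spacing

    # Exact sum lies halfway between 1.0 (even mantissa) and 1.0 + eps (odd).
    low_tie = 1.0 + half

    # Exact sum lies halfway between 1.0 + eps (odd) and 1.0 + 2*eps (even).
    high_tie = (1.0 + eps) + half

    # Round-to-nearest-even rounds the first tie down and the second tie up.
    # Directed modes (toward zero, up, down) move both ties the same way.
    return low_tie == 1.0 and high_tie == 1.0 + 2.0 * eps


print(rounds_ties_to_even())
```

The two probes are needed together: a single tie that rounds down is also consistent with rounding toward zero, and a single tie that rounds up is also consistent with rounding toward positive infinity.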

Can cout alter variables somehow?

Submitted by 梦想的初衷 on 2019-12-04 10:45:26
Question: So I have a function that looks something like this: float function(){ float x = SomeValue; return x / SomeOtherValue; } At some point, this function overflows and returns a really large negative value. To try and track down exactly where this was happening, I added a cout statement so that the function looked like this: float function(){ float x = SomeValue; cout << x; return x / SomeOtherValue; } and it worked! Of course, I solved the problem altogether by using a double. But I'm curious as