floating-point-precision

What is the relationship between digits of significance and precision loss in floating point numbers?

◇◆丶佛笑我妖孽 submitted on 2019-12-13 13:10:01
Question: So I have been trying to wrap my head around the relationship between the number of significant digits in a floating-point number and the relative loss of precision, but I just can't seem to make sense of it. I was reading an article earlier that said to do the following:

- Set a float to a value of 2147483647. You will see that its value is actually 2147483648.
- Subtract 64 from the float and you will see that the operation is correct.
- Subtract 65 from the float and you will see that you actually now …
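The snippet cuts off, but the claimed behaviour is easy to reproduce. A minimal sketch, assuming NumPy's float32 as a stand-in for a C float (the spacing between adjacent float32 values at 2**31 is 256, so subtracting 64 is absorbed while subtracting 65 crosses to a lower neighbour):

```python
import numpy as np

x = np.float32(2147483647)        # nearest float32 is 2**31
print(float(x))                   # 2147483648.0
print(float(np.spacing(x)))       # 256.0: gap between adjacent float32 values here
print(float(x - np.float32(64)))  # 2147483648.0: exact result ties, rounds back to even
print(float(x - np.float32(65)))  # 2147483520.0: crosses to the next value below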

Newton Raphson iteration trapped in infinite loop

允我心安 submitted on 2019-12-13 00:39:01
Question: I'm quite a beginner in this topic, and couldn't find out the reason: sometimes the program works, sometimes not (after asking the question it simply doesn't want to take in my answers; then I can type in as much as I want, it doesn't respond, it just lists out the numbers I typed in).

```c
#include <stdio.h>

float abszolut(float szam) {
    float abszoluterteke;
    if (szam >= 0)
        abszoluterteke = szam;
    else
        abszoluterteke = -szam;
    return abszoluterteke;
}

float negyzetgyok(float szam) {
    float pontossag …
```
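The code is cut off before the stopping condition, but a classic cause of this symptom is a loop that waits for an exact floating-point equality that may never occur. A minimal sketch of a Newton-Raphson square root with a relative tolerance and an iteration cap (the names and the tolerance value are illustrative, not from the original post):

```python
def newton_sqrt(value, rel_tol=1e-6, max_iter=100):
    """Approximate sqrt(value) by Newton-Raphson iteration.

    Stops when successive iterates agree to within rel_tol, or after
    max_iter steps, rather than waiting for x * x == value exactly,
    which may never hold in floating point and then loops forever.
    """
    if value < 0:
        raise ValueError("negative input")
    if value == 0:
        return 0.0
    x = value
    for _ in range(max_iter):
        nxt = 0.5 * (x + value / x)
        if abs(nxt - x) <= rel_tol * abs(nxt):
            return nxt
        x = nxt
    return x  # best effort if the iteration cap is reached
```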

release mode uses double precision even for float variables

此生再无相见时 submitted on 2019-12-12 14:59:57
Question: My algorithm is calculating the epsilon for single-precision floating-point arithmetic. It is supposed to be something around 1.1921e-007. Here is the code:

```csharp
static void Main(string[] args)
{
    // start with some small magic number
    float a = 0.000000000000000013877787807814457f;
    for (; ; )
    {
        // add the small a to 1
        float temp = 1f + a;
        // break, if a + 1 really is > '1'
        if (temp - 1f != 0f)
            break;
        // otherwise a is too small -> increase it
        a *= 2f;
        Console.Out.WriteLine("current increment: " + a…
```
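The snippet ends mid-loop, but the title points at the usual culprit: on x86, a release build may keep `temp - 1f` in an extended-precision register, so the test succeeds for a much smaller `a` than float32's true epsilon. A minimal sketch of the same search, assuming NumPy's float32 so that every intermediate really is single precision:

```python
import numpy as np

one = np.float32(1.0)
a = np.float32(1.0)
# Halve a until adding it to 1 no longer changes the float32 result.
while one + a / np.float32(2) != one:
    a = a / np.float32(2)

print(a)                          # 1.1920929e-07, i.e. 2**-23
print(np.finfo(np.float32).eps)   # the same value, straight from NumPy
```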

Why does scipy.stats.nanmean give different result from numpy.nansum?

情到浓时终转凉″ submitted on 2019-12-12 11:29:14
Question:

```python
>>> import numpy as np
>>> from scipy import stats
>>> a = np.r_[1., 2., np.nan, 4., 5.]
>>> stats.nanmean(a)
2.9999999999999996
>>> np.nansum(a)/np.sum(~np.isnan(a))
3.0
```

I'm aware of the limitations of floating-point representation. Just curious why the clumsier expression seems to give the "better" result.
Answer 1: First of all, here is scipy.nanmean() so that we know what we're comparing to:

```python
def nanmean(x, axis=0):
    x, axis = _chk_asarray(x, axis)
    x = x.copy()
    Norig = x.shape[axis]
    factor = 1.0 - np…
```
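The answer is cut off, but the key difference is already visible in the scipy source above: nanmean zeroes the NaNs, takes the mean over all Norig entries, and then divides by the factor 1 - n_nan/Norig, while the "clumsy" expression divides the sum by the true count just once. A minimal sketch reproducing both results with plain Python floats:

```python
total = 1.0 + 2.0 + 0.0 + 4.0 + 5.0   # NaN replaced by 0 -> 12.0

# scipy's route: mean over 5 entries, then undo the factor 4/5.
# Neither 12/5 nor 4/5 is exactly representable in binary, so two
# rounding errors accumulate.
print(total / 5.0 / (4.0 / 5.0))      # 2.9999999999999996

# the "clumsy" route: one division by the actual count.
print(total / 4.0)                    # 3.0 (12/4 is exact)
```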

How to round a Python Decimal to 2 decimal places?

断了今生、忘了曾经 submitted on 2019-12-11 19:27:23
Question: I've got a Python Decimal (a currency amount) which I want to round to two decimal places. I tried doing this using the regular round() function. Unfortunately, this returns a float, which makes it unreliable to continue with:

```python
>>> from decimal import Decimal
>>> a = Decimal('1.23456789')
>>> type(round(a, 2))
<type 'float'>
```

In the decimal module, I see a couple of things in relation to rounding: ROUND_05UP, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR, ROUND_HALF_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP, ROUND…
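The list is cut off, but those constants are rounding modes meant to be passed to Decimal.quantize(), which is the usual way to round a Decimal while staying in Decimal. A minimal sketch (ROUND_HALF_UP is chosen here as a typical currency rule; pick whichever mode the application needs):

```python
from decimal import Decimal, ROUND_HALF_UP

amount = Decimal("1.23456789")
rounded = amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)         # 1.23
print(type(rounded))   # <class 'decimal.Decimal'>, not float
```

Note that the `<type 'float'>` result in the question is Python 2 behaviour; on Python 3, round(a, 2) on a Decimal returns a Decimal as well.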

Does “System.out.println( DECIMAL )” always output the same DECIMAL in the code?

丶灬走出姿态 submitted on 2019-12-11 13:59:51
Question: Of course, System.out.println(0.1); outputs 0.1. But is it always true for an arbitrary decimal? (Exclude cases which result from the precision of the double value itself, such as System.out.println(0.10000000000000000001); outputting 0.1.) When I write System.out.println(DECIMAL); I think DECIMAL is converted into binary (a double), and that binary is converted back into decimal (to output the decimal as a String). Think about the following conversion: decimal[D1] -> (CONVERSION1) -> binary[B] -> …
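The question is cut off mid-diagram, but the mechanism it describes is real: Java's Double.toString prints just enough decimal digits to uniquely identify the double, so the printed decimal always converts back to the same double, even though the double is not exactly the decimal that was typed. A minimal sketch of the same effect using Python's repr, which follows the same shortest-round-tripping idea (this is an analogy, not Java itself):

```python
x = 0.1
print(x)                        # 0.1, though the stored double is not exactly 0.1
print('%.20f' % x)              # 0.10000000000000000555: the value actually stored
print(float(repr(x)) == x)      # True: decimal -> double -> decimal round-trips
print(0.10000000000000000001)   # 0.1: this literal maps to the same double
```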

Multiprecision Python library that plays well with boost::multiprecision or other options?

我的梦境 submitted on 2019-12-11 13:14:07
Question: I am working on a project that revolves around multiprecision "complex" numbers; specifically, it's a Mandelbrot-Set-based app, but with a twist that requires decent correspondence between the output of a (fast) C++ Python extension module (Boost, Cython, or other...) and the pure Python modules that might want to use it. Right now, I'm using boost::multiprecision to wrap the MPFR raw type, and yes, if I just wanted to pass an mpfr_t to Python, that'd be one thing. However, for this app I need to …
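The question is cut off before its actual requirements, so nothing here reconstructs the missing answer. Purely as an illustration of one option in this space: a minimal sketch using gmpy2, a Python binding of the same MPFR library that boost::multiprecision's mpfr backend wraps (whether it meets this app's C++/Python correspondence needs is exactly what the question is asking):

```python
import gmpy2

# 256 bits of significand; this should match whatever the C++ side uses.
gmpy2.get_context().precision = 256

r = gmpy2.sqrt(gmpy2.mpfr(2))
print(r)   # sqrt(2) carried to roughly 77 decimal digits

# One way to exchange full-precision values with another MPFR user is a
# decimal string; mpfr() accepts one back at the current precision.
s = str(r)
print(gmpy2.mpfr(s) == r)   # expected True when precisions match (an assumption)
```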

Causing underflow in ieee-754 floating point format using subtraction

风格不统一 submitted on 2019-12-11 09:48:00
Question: This seems basic, but I am having a lot of trouble answering the following question: give two numbers X and Y, represented in the IEEE 754 format, such that computing X-Y will result in underflow. To my understanding every operation can potentially result in underflow, but for the life of me I can't find an example for subtraction. PLEASE HELP!!! Thanks.
Answer 1: When default exception handling is in effect, a subtraction that produces a tiny (in the subnormal interval¹) non-zero result conceptually …
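The answer is cut off, but a concrete pair is easy to construct: take two nearby normal numbers just above the smallest normal; their difference is computed exactly (by Sterbenz's lemma) yet lands in the subnormal range, which is precisely the underflow case. A minimal sketch with IEEE 754 doubles:

```python
import sys

x = 1.5 * 2.0 ** -1022            # a small normal double
y = 1.0 * 2.0 ** -1022            # the smallest positive normal double
diff = x - y                      # exactly 2**-1023, a subnormal

print(diff)                       # 1.1125369292536007e-308
print(sys.float_info.min)         # 2.2250738585072014e-308, smallest normal
print(diff < sys.float_info.min)  # True: the result underflowed to subnormal
```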

Converting float to UInt32 - which expression is more precise

邮差的信 submitted on 2019-12-10 17:38:41
Question: I have a float x which should be in the <0,1> range, but it undergoes several numerical operations, so the result may be slightly outside <0,1>. I need to convert this result to a uint y using the entire range of UInt32. Of course, I need to clamp x to the <0,1> range and scale it. But which order of operations is better?

```csharp
y = (uint)round(min(max(x, 0.0F), 1.0F) * UInt32.MaxValue)
```

or

```csharp
y = (uint)round(min(max(x * UInt32.MaxValue, 0.0F), UInt32.MaxValue))
```

In other words, is it better to scale first …
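For what it's worth, clamping before scaling sidesteps a subtle trap: UInt32.MaxValue (4294967295) is not representable as a float32 and rounds up to 2**32, so scaling in single precision can produce a value that overflows the uint range even after clamping against MaxValue. A minimal sketch of the clamp-first order, assuming the multiply is done in double precision (where 4294967295.0 is exact):

```python
import numpy as np

UINT32_MAX = 4294967295.0

print(float(np.float32(UINT32_MAX)))   # 4294967296.0: MaxValue rounds up to 2**32

def to_uint32(x):
    # Clamp to <0,1> first, then scale in float64.
    clamped = min(max(float(x), 0.0), 1.0)
    return np.uint32(round(clamped * UINT32_MAX))

print(to_uint32(1.0000001))            # 4294967295
print(to_uint32(-0.25))                # 0
```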

Midpoint 'rounding' when dealing with large numbers?

狂风中的少年 submitted on 2019-12-10 17:09:28
Question: So I was trying to understand JavaScript's behavior when dealing with large numbers. Consider the following (tested in Firefox and Chrome):

```javascript
console.log(9007199254740993) // 9007199254740992
console.log(9007199254740994) // 9007199254740994
console.log(9007199254740995) // 9007199254740996
console.log(9007199254740996) // 9007199254740996
console.log(9007199254740997) // 9007199254740996
console.log(9007199254740998) // 9007199254740998
console.log(9007199254740999) // 9007199254741000
```

Now, I …
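The question is cut off, but the pattern in the console output is ties-to-even at work: above 2**53 a double can only hold even integers (the spacing is 2), and an odd literal is exactly halfway between its two neighbours, so it rounds to whichever neighbour has an even significand. A minimal sketch, assuming Python doubles behave identically to JavaScript numbers (both are IEEE 754 binary64):

```python
print(2 ** 53)                   # 9007199254740992: integers above this are spaced 2 apart
print(float(9007199254740993))   # 9007199254740992.0: tie rounds down to even
print(float(9007199254740995))   # 9007199254740996.0: tie rounds up to even
print(float(9007199254740997))   # 9007199254740996.0: tie rounds down to even
```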