double-precision

C++: difference between 0. and 0.0?

∥☆過路亽.° submitted on 2019-12-06 03:52:10
I am well aware of the difference between 0 and 0.0 (int and double). But is there any difference between 0. and 0.0 (please note the .)? Thanks a lot in advance, Axel

There is no difference. Both literals are double. From the C++ grammar:

    fractional-constant:
        digit-sequence(opt) . digit-sequence
        digit-sequence .

See: Hyperlinked C++ BNF Grammar

No, there is not.

No. You can also write .0 as far as I know. Just having the . as part of the number identifies it as a floating point type. This:

    cout << (5 / 2) << endl;
    cout << (5. / 2) << endl;
    cout << (5.0 / 2) << endl;

Prints this:

    2
    2.5
    2.5

You
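The claim that `0.`, `0.0` and `.0` all have type double can be checked at compile time. The following is a minimal sketch; the helper names `int_quotient` and `double_quotient` are hypothetical and only mirror the division example in the answer.

```cpp
#include <type_traits>

// All three spellings are floating literals of type double; plain 0 is an int.
static_assert(std::is_same<decltype(0.), double>::value, "0. is a double");
static_assert(std::is_same<decltype(0.0), double>::value, "0.0 is a double");
static_assert(std::is_same<decltype(.0), double>::value, ".0 is a double");
static_assert(std::is_same<decltype(0), int>::value, "0 is an int");

// Hypothetical helpers mirroring the division example.
int int_quotient() { return 5 / 2; }         // integer division truncates
double double_quotient() { return 5. / 2; }  // 2 is converted to double first
```

If any of the `static_assert` lines were wrong, the file would fail to compile, so a successful build is itself the check.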

Java float is more precise than double?

情到浓时终转凉″ submitted on 2019-12-06 02:22:20
Question:

Code:

    class Main {
        public static void main (String[] args) {
            System.out.print("float: ");
            System.out.println(1.35f-0.00026f);
            System.out.print("double: ");
            System.out.println(1.35-0.00026);
        }
    }

Output:

    float: 1.34974
    double: 1.3497400000000002

??? float got the right answer, but double is adding extra stuff from nowhere. Why? Isn't double supposed to be more precise than float?

Answer 1: A float is 4 bytes wide, whereas a double is 8 bytes wide. Check What Every Computer Scientist Should Know

Error due to limited precision of float and double

心已入冬 submitted on 2019-12-05 21:05:01
In C++, I use the following code to work out the order of magnitude of the error due to the limited precision of float and double:

    float n=1;
    float dec = 1;
    while(n!=(n-dec)) {
        dec = dec/10;
    }
    cout << dec << endl;

(in the double case all I do is exchange float with double in lines 1 and 2)

Now when I compile and run this using g++ on a Unix system, the results are

    Float  10^-8
    Double 10^-17

However, when I compile and run it using MinGW on Windows 7, the results are

    Float  10^-20
    Double 10^-20

What is the reason for this?

I guess I'll make my comment an answer and expand on it. This is my
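The loop in the question divides by ten, so it can only find an order of magnitude. The exact spacing near 1 is available from `std::numeric_limits`, as in the sketch below. The helper names `changes_one` and `machine_epsilon` are hypothetical. Storing the sum in a `volatile` variable is an assumption-based guard: it forces the value to be rounded to the declared type even if the compiler evaluates the expression in a wider register. That could be one factor behind results that differ between toolchains, though the truncated answer above does not confirm it.

```cpp
#include <limits>

// Hypothetical helper: true if adding `step` to 1 changes the stored value.
// The sum goes through a volatile T so it is rounded to T's precision
// even if the expression itself is evaluated in a wider format.
template <typename T>
bool changes_one(T step) {
    volatile T sum = T(1) + step;
    return sum != T(1);
}

// Machine epsilon: the gap between 1 and the next representable value of T.
template <typename T>
T machine_epsilon() {
    return std::numeric_limits<T>::epsilon();
}
```

Calling `changes_one(machine_epsilon<float>())` probes the float case, and swapping in `double` probes the double case, without editing any declarations.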

Is integer multiplication implemented using double precision floating point exact up until 2^53?

隐身守侯 submitted on 2019-12-05 00:30:31
Question: I ask because I am computing matrix multiplications where all the matrix values are integers. I'd like to use LAPACK so that I get fast code that is correct. Will two large integers (whose product is less than 2^53), stored as doubles, when multiplied, yield a double containing the exact integer result?

Answer 1: Your analysis is correct: All integers between -2^53 and 2^53 are exactly representable in double precision. The IEEE 754 standard requires calculations to be performed exactly, and then
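The 2^53 boundary can be probed directly. The sketch below multiplies two integers in double arithmetic and compares the result with the exact 64-bit integer product. The name `product_is_exact_in_double` is hypothetical, and the helper assumes the true product fits in `int64_t`.

```cpp
#include <cstdint>

// Every integer with magnitude up to 2^53 is exactly representable
// in an IEEE 754 double.
constexpr std::int64_t kExactLimit = std::int64_t(1) << 53;

// Hypothetical helper: multiply in double arithmetic and report whether the
// result equals the exact integer product.
// Assumes a * b does not overflow int64_t.
bool product_is_exact_in_double(std::int64_t a, std::int64_t b) {
    // volatile forces the product to be rounded to double precision.
    volatile double product = static_cast<double>(a) * static_cast<double>(b);
    return static_cast<std::int64_t>(product) == a * b;
}
```

For example, `product_is_exact_in_double(94906265, 94906265)` stays below 2^53, while `(2^26 + 1) * (2^27 + 1)` is an odd number above 2^53 and cannot be stored exactly.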

Does Fortran have inherent limitations on numerical accuracy compared to other languages?

假如想象 submitted on 2019-12-05 00:24:16
Question: While working on a simple programming exercise, I produced a while loop (DO loop in Fortran) that was meant to exit when a real variable had reached a precise value. I noticed that due to the precision being used, the equality was never met and the loop became infinite. This is, of course, not unheard of and one is advised that, rather than comparing two numbers for equality, it is best to see if the absolute difference between two numbers is less than a set threshold. What I found disappointing
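The threshold comparison described in the question is not specific to Fortran. Here is a minimal C++ sketch of the same idea. The names `nearly_equal` and `steps_to_reach`, the step size 0.1, and the `max_steps` guard are all assumptions made for illustration.

```cpp
#include <cmath>

// Hypothetical helper: treat a and b as equal when they differ by at most
// abs_tol. The tolerance is problem-specific and chosen by the caller.
bool nearly_equal(double a, double b, double abs_tol) {
    return std::fabs(a - b) <= abs_tol;
}

// Hypothetical loop: count steps of 0.1 until the accumulator reaches target.
// The max_steps guard keeps the loop finite if the tolerance is never met.
int steps_to_reach(double target, double abs_tol, int max_steps) {
    double x = 0.0;
    int steps = 0;
    while (!nearly_equal(x, target, abs_tol) && steps < max_steps) {
        x += 0.1;
        ++steps;
    }
    return steps;
}
```

With a tolerance of zero the helper degenerates to an exact equality test, which is the situation that made the original loop run forever.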

Double-precision integer subtraction with 32-bit registers (MIPS)

强颜欢笑 submitted on 2019-12-04 18:21:22
I am learning computer arithmetic. The book I use (Patterson and Hennessy) lists the question below.

Write MIPS code to conduct double precision integer subtraction for 64-bit data. Assume the first operand to be in registers $t4(hi) and $t5(lo), second in $t6(hi) and $t7(lo).

My solution is

    sub $t3, $t5, $t7   # Subtract lo parts of operands. t3 = t5 - t7
    sltu $t2, $t5, $t7  # If the lo part of the 1st operand is less than the 2nd,
                        # it means a borrow must be made from the hi part
    add $t6, $t6, $t2   # Simulate the borrow of the msb-of-low from lsb-of-high
    sub $t2, $t4, $t6   #
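The borrow logic in the exercise can be modelled in C++ with two 32-bit halves, which gives a reference to compare an assembly attempt against. This is a sketch, not the book's solution. The type `U64Parts` and the function `sub64` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical pair type mirroring the register layout in the exercise:
// hi holds the upper 32 bits, lo the lower 32 bits.
struct U64Parts {
    std::uint32_t hi;
    std::uint32_t lo;
};

// Subtract b from a, propagating a borrow from the low word to the high word.
U64Parts sub64(U64Parts a, U64Parts b) {
    U64Parts r;
    r.lo = a.lo - b.lo;                              // wraps modulo 2^32
    std::uint32_t borrow = (a.lo < b.lo) ? 1u : 0u;  // same test as sltu
    r.hi = a.hi - b.hi - borrow;                     // take the borrow from the high word
    return r;
}
```

Unsigned arithmetic is used throughout so that wraparound is well defined and no overflow trap is modelled.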

GCD algorithms for large integers

跟風遠走 submitted on 2019-12-04 13:56:05
I am looking for information about fast GCD computation algorithms. In particular, I would like to look at implementations of them. The most interesting for me:
 - Lehmer GCD algorithm,
 - Accelerated GCD algorithm,
 - k-ary algorithm,
 - Knuth-Schönhage with FFT.
I have NO information at all about the accelerated GCD algorithm; I have only seen a few articles where it was mentioned as the most effective and fastest GCD computation method for medium-sized inputs (~1000 bits). They look very difficult for me to understand from a theoretical point of view. Could anybody please share the code (preferably in

Java float is more precise than double?

强颜欢笑 submitted on 2019-12-04 08:41:59
Code:

    class Main {
        public static void main (String[] args) {
            System.out.print("float: ");
            System.out.println(1.35f-0.00026f);
            System.out.print("double: ");
            System.out.println(1.35-0.00026);
        }
    }

Output:

    float: 1.34974
    double: 1.3497400000000002

??? float got the right answer, but double is adding extra stuff from nowhere. Why? Isn't double supposed to be more precise than float?

A float is 4 bytes wide, whereas a double is 8 bytes wide. Check What Every Computer Scientist Should Know About Floating-Point Arithmetic

Surely the double has more precision so it has slightly less rounding error.
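The point that double has the smaller rounding error can be checked by measuring both results against a higher-precision reference rather than by reading the printed digits. Java prints the shortest decimal string that identifies each value, so a float shows fewer digits without being closer to the true answer. The C++ sketch below uses `long double` as the reference, which is an assumption: on platforms where `long double` is no wider than `double`, the double error will simply read as zero. The function names are hypothetical.

```cpp
#include <cmath>

// Absolute error of 1.35 - 0.00026 computed in float, measured against a
// long double evaluation of the same decimal literals.
long double float_error() {
    volatile float result = 1.35f - 0.00026f;  // volatile forces rounding to float
    return std::fabs(static_cast<long double>(result) - (1.35L - 0.00026L));
}

// The same measurement for the double computation.
long double double_error() {
    volatile double result = 1.35 - 0.00026;   // volatile forces rounding to double
    return std::fabs(static_cast<long double>(result) - (1.35L - 0.00026L));
}
```

Comparing the two return values shows which type landed closer to the reference, independent of how either value is formatted for display.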

Data type mismatch in Fortran

左心房为你撑大大i submitted on 2019-12-04 06:04:28
Question: I've written a rudimentary algorithm in Fortran 95 to calculate the gradient of a function (an example of which is prescribed in the code) using central differences augmented with a procedure known as Richardson extrapolation.

    function f(n,x)
    ! The scalar multivariable function to be differentiated
    integer :: n
    real(kind = kind(1d0)) :: x(n), f
    f = x(1)**5.d0 + cos(x(2)) + log(x(3)) - sqrt(x(4))
    end function f
    !=====! !=====! !=====!
    program gradient
    !=========================================
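For readers unfamiliar with the technique named in the question, here is a small C++ sketch of a central-difference gradient with a single Richardson extrapolation step. It is not a translation of the poster's truncated program. The names `ScalarField`, `central_diff`, `gradient` and `example_f` are hypothetical, and only one extrapolation level is applied.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using ScalarField = std::function<double(const std::vector<double>&)>;

// Central difference along coordinate i with step h (error of order h^2).
double central_diff(const ScalarField& f, std::vector<double> x,
                    std::size_t i, double h) {
    const double xi = x[i];
    x[i] = xi + h;
    const double forward = f(x);
    x[i] = xi - h;
    const double backward = f(x);
    return (forward - backward) / (2.0 * h);
}

// One Richardson step: combine steps h and h/2 to cancel the h^2 error term.
std::vector<double> gradient(const ScalarField& f,
                             const std::vector<double>& x, double h) {
    std::vector<double> g(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double coarse = central_diff(f, x, i, h);
        const double fine = central_diff(f, x, i, h / 2.0);
        g[i] = (4.0 * fine - coarse) / 3.0;
    }
    return g;
}

// The example function from the question: x1^5 + cos(x2) + log(x3) - sqrt(x4).
double example_f(const std::vector<double>& x) {
    return std::pow(x[0], 5) + std::cos(x[1]) + std::log(x[2]) - std::sqrt(x[3]);
}
```

Calling `gradient(example_f, {1.0, 1.0, 1.0, 1.0}, 1e-2)` should approximate the analytic gradient (5, -sin 1, 1, -0.5); the step size is a tunable assumption.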

Why is this bearing calculation so inaccurate?

我与影子孤独终老i submitted on 2019-12-04 03:23:35
Is it even that inaccurate? I re-implemented the whole thing with Apfloat arbitrary precision and it made no difference, which I should have known to start with!!

    public static double bearing(LatLng latLng1, LatLng latLng2) {
        double deltaLong = toRadians(latLng2.longitude - latLng1.longitude);
        double lat1 = toRadians(latLng1.latitude);
        double lat2 = toRadians(latLng2.latitude);
        double y = sin(deltaLong) * cos(lat2);
        double x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(deltaLong);
        double result = toDegrees(atan2(y, x));
        return (result + 360.0) % 360.0;
    }

    @Test public void testBearing() {
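The same initial-bearing formula can be written in C++ and checked against the four cardinal directions, which gives a quick sanity test that is independent of the truncated unit test above. This is a sketch under assumptions: the `LatLng` struct here is a hypothetical stand-in holding degrees, and `kPi`, `to_radians` and `to_degrees` are helper names introduced for the example.

```cpp
#include <cmath>

// Hypothetical stand-in for the LatLng type in the question; values in degrees.
struct LatLng {
    double latitude;
    double longitude;
};

constexpr double kPi = 3.14159265358979323846;

double to_radians(double degrees) { return degrees * kPi / 180.0; }
double to_degrees(double radians) { return radians * 180.0 / kPi; }

// Initial bearing from `from` to `to`, in degrees within [0, 360).
double bearing(const LatLng& from, const LatLng& to) {
    const double delta_long = to_radians(to.longitude - from.longitude);
    const double lat1 = to_radians(from.latitude);
    const double lat2 = to_radians(to.latitude);
    const double y = std::sin(delta_long) * std::cos(lat2);
    const double x = std::cos(lat1) * std::sin(lat2)
                   - std::sin(lat1) * std::cos(lat2) * std::cos(delta_long);
    const double result = to_degrees(std::atan2(y, x));
    return std::fmod(result + 360.0, 360.0);  // map (-180, 180] onto [0, 360)
}
```

From the origin, a point one degree north, east, south and west should give bearings close to 0, 90, 180 and 270 respectively.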