int vs float arithmetic efficiency in Java

Question

I'm writing an application that uses Dijkstra's algorithm to find minimal paths in a graph. The weights of the nodes and edges are floats, so the algorithm does a lot of arithmetic on float values. Could I improve the running time by converting all weights to ints? Are int arithmetic operations faster than float ones in Java?

I tried to write a simple benchmark to check this, but I'm not satisfied with the results I got. The compiler may have optimized away parts of the program, so the results don't look trustworthy to me.

EDIT:

The problem I'm trying to solve is in the Information Retrieval field. The application should show answers to a query posed as a set of keywords.

My data structure is a weighted directed graph. Given a set of leaf nodes, I have to find the smallest tree that connects these nodes and show the answer to the user. The weights are assigned by a weighting function based partially on the tf/idf technique. The user doesn't know what weights I assign to the nodes and edges; they just want to see answers relevant to the query they posed. So exact results are not required, only the ability to rank answers by their weights. Using the weighting function directly (as mentioned, it is based on tf/idf) yields float weights, so I have used floats so far.

I hope this adds some background to the question.

Answer 1:

As ever with this sort of thing, you should set yourself some performance goals and then profile the app to see whether it meets them.

Often you will find surprising results: that the time taken is hardly affected by the base numerical type at all, or that your algorithm is suboptimal.

As for compiler optimisations, they're a real and valid part of performance optimisation.

If using type A is theoretically faster than using type B, but your compiler can optimise type B to be quicker in a real scenario, then that's a valuable piece of evidence, not a source of disappointment.

Answer 2:

For simple operations int is faster; however, with int you may have to do more work to get the same result, e.g.

as float

float f = 15 * 0.987f;  // the f suffix is required: a double literal cannot be assigned to a float without a cast

as int

int i = 15 * 987 / 1000;

The extra division means the int operation can take longer.
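As a rough illustration of the trade-off described above, here is a minimal sketch that wraps both variants in methods. The class name, method names, and the SCALE constant are hypothetical and only serve the example; the factor 0.987 and the value 15 come from the snippet above.

```java
public class FixedPointDemo {
    // Hypothetical scale: three decimal digits of fraction (0.987 -> 987).
    static final int SCALE = 1000;

    // Floating-point version: a single multiplication.
    static float scaleFloat(int value) {
        return value * 0.987f;
    }

    // Integer version: multiply by the scaled factor, then divide the scale back out.
    // The extra division is the additional work, and the result is truncated.
    static int scaleInt(int value) {
        return value * 987 / SCALE;
    }

    public static void main(String[] args) {
        System.out.println(scaleFloat(15)); // approximately 14.805
        System.out.println(scaleInt(15));   // 14, because integer division truncates 14.805
    }
}
```

Note that the integer version also loses the fractional part unless the result itself is kept in scaled form.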

Answer 3:

Integer subtractions are ~2.5 times faster than double subtractions on my machine. Integer multiplications, however, are only ~1.5 times faster than double multiplications.

The following test works on random data, which might prevent the compiler from optimizing the work away.

import java.util.Random;

// test whether int subtractions are faster than double subtractions
// (despite the method name, the floating-point type measured here is double)
public void compareIntAndFloatSubtraction(){

    int N = 100000;  // input array size
    int k = 100000;  // number of mathematical operations performed on each element

    // generate random data
    int[] ints = new int[N];
    double[] doubles = new double[N];
    Random r = new Random(1L);  // fixed seed so every run uses the same data
    for (int i = 0; i < N; i++) {
        ints[i] = r.nextInt();
        doubles[i] = r.nextDouble();
    }

    // measure integer subtractions
    long before = System.currentTimeMillis();
    for (int i = 1; i < N; i++) {
        for (int j = 0; j < k; j++) {
            ints[i] -= ints[i-1];  // referring to another element may also hinder optimization
        }
    }
    System.out.println(String.format("time needed for int subs [ms]: %s", System.currentTimeMillis()-before));

    // measure double subtractions
    before = System.currentTimeMillis();
    for (int i = 1; i < N; i++) {
        for (int j = 0; j < k; j++) {
            doubles[i] -= doubles[i-1];
        }
    }
    System.out.println(String.format("time needed for double subs [ms]: %s", System.currentTimeMillis()-before));

}
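A somewhat more defensive variant of the same idea is sketched below. It is an assumption-laden example, not the original poster's method: the class and method names are hypothetical, the sizes are arbitrary, and it uses System.nanoTime(), a few warm-up passes, and a returned checksum so that the JIT cannot discard the loops as dead code. For serious measurements a dedicated harness such as JMH is usually a better choice.

```java
import java.util.Random;

public class SubtractionBenchmarkSketch {

    // Repeatedly subtracts the previous element from each element.
    // Returns a checksum so the result of the work is actually used.
    static long intSubs(int[] data, int rounds) {
        for (int r = 0; r < rounds; r++) {
            for (int i = 1; i < data.length; i++) {
                data[i] -= data[i - 1];
            }
        }
        long checksum = 0;
        for (int v : data) checksum += v;
        return checksum;
    }

    // Same loop structure on doubles.
    static double doubleSubs(double[] data, int rounds) {
        for (int r = 0; r < rounds; r++) {
            for (int i = 1; i < data.length; i++) {
                data[i] -= data[i - 1];
            }
        }
        double checksum = 0;
        for (double v : data) checksum += v;
        return checksum;
    }

    public static void main(String[] args) {
        int n = 100_000;     // array size (arbitrary)
        int rounds = 1_000;  // passes over the array (arbitrary)

        Random random = new Random(1L);
        int[] ints = new int[n];
        double[] doubles = new double[n];
        for (int i = 0; i < n; i++) {
            ints[i] = random.nextInt();
            doubles[i] = random.nextDouble();
        }

        // Warm-up on copies so the JIT has compiled both methods before timing.
        for (int w = 0; w < 5; w++) {
            intSubs(ints.clone(), 10);
            doubleSubs(doubles.clone(), 10);
        }

        long t0 = System.nanoTime();
        long intSum = intSubs(ints, rounds);
        long t1 = System.nanoTime();
        double doubleSum = doubleSubs(doubles, rounds);
        long t2 = System.nanoTime();

        System.out.printf("int subs:    %d ms (checksum %d)%n", (t1 - t0) / 1_000_000, intSum);
        System.out.printf("double subs: %d ms (checksum %s)%n", (t2 - t1) / 1_000_000, doubleSum);
    }
}
```

The timings will vary by JVM and hardware, so the ratios quoted above should not be expected to reproduce exactly.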

Answer 4:

If you just want to compare weights, you should prefer int to float.
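One way this could look in practice, sketched under assumptions that are not in the question: weights are non-negative, a resolution of six decimal places is enough for ranking, and the scaled values (and any sums of them along a path) stay within the int range. The class name, method names, and SCALE are hypothetical.

```java
public class WeightQuantizer {
    // Hypothetical resolution: keep six decimal places of each weight.
    static final float SCALE = 1_000_000f;

    // Converts a float weight to a scaled int. Weights that differ by less
    // than the resolution may map to the same int and then compare as equal.
    static int toIntWeight(float weight) {
        return Math.round(weight * SCALE);
    }

    // Compares two weights using only integer comparison after quantization.
    static int compareWeights(float a, float b) {
        return Integer.compare(toIntWeight(a), toIntWeight(b));
    }

    public static void main(String[] args) {
        System.out.println(toIntWeight(0.25f));              // 250000
        System.out.println(compareWeights(0.25f, 0.5f) < 0); // true
    }
}
```

Whether the conversion pays off depends on how often weights are compared versus how often they are computed, so it is worth profiling before committing to it.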

Answer 5:

Generally you should not worry about a choice between int and float for performance reasons.

Here's an excerpt from the Appendix of Java Puzzlers:

Floating-point arithmetic is inexact. Don't use floating-point where exact results are required; instead, use an integral type or BigDecimal. Prefer double to float.

Unless you have a really good reason, you should generally prefer double to float if you must use floating-point operations. If an exact result is desired, then go ahead and use BigDecimal; it'll be slower since it's not a primitive, but unless profiling shows that the cost is unacceptable, this is often the best option.

If you must use floating-point operations, then trying to optimize them by using int is ill-advised. This is likely to be a premature optimization and will only complicate the code unnecessarily. Write it in the most natural, most readable way. Do not complicate your code for the sake of slight performance gains.

If you don't actually need floating-point operations, then by all means use int or long instead.
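A small sketch of the inexactness the excerpt warns about, comparing double with BigDecimal. The class and method names are hypothetical and chosen only for this illustration.

```java
import java.math.BigDecimal;

public class InexactDemo {
    // 0.1, 0.2 and 0.3 have no exact binary representation, so the sum differs from 0.3.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal built from strings represents the decimal values exactly.
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);           // 0.30000000000000004
        System.out.println(doubleSumIsExact());  // false
        System.out.println(new BigDecimal("0.1").add(new BigDecimal("0.2"))); // 0.3
        System.out.println(decimalSumIsExact()); // true
    }
}
```

The BigDecimal values are constructed from strings on purpose: new BigDecimal(0.1) would capture the already inexact double value.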

Answer 6:

I think the performance is very much dependent on the algorithm and the platform the software is running on.

If you're doing matrix / array calculations on an x86 platform, the runtime might optimize them to use SSE, which began as a float-only extended instruction set (SSE2 later added double and integer vector operations).

On other platforms the runtime might optimize to OpenCL (I don't believe anyone does that right now, but it might happen:). I have no clue what runs fastest on such a platform, and under what conditions. It may just be that OpenCL is optimized for an integer workload.

Under these circumstances I would conclude that it is not useful to optimize the data type (float or int) at this point, and just optimize the readability of the code.

If your code is highly performance critical, and you know exactly on which hardware the system will be running now and in the future, you could test typical workloads with various algorithms and select the one which best meets your needs.

But in general, just use an algorithm you can understand, keep the code readable, and thereby the bug count low. Fast code isn't worth that much if the results are not correct :)