Java threading optimization at 100% CPU usage

灰色年华 2021-02-11 02:52

I have an application that accepts work on a queue and then spins that work off to be completed on independent threads. The number of threads is not massive, say up to 100, but

4 Answers
  • 2021-02-11 03:21

    To get the most work done the quickest: am I best off to just launch more threads when I need to do more work and let the Java thread scheduler handle distributing the work, or would getting smarter and managing the work load to keep the CPU below 100% get me further faster?

    As you add more and more threads, the overhead from context switching, CPU cache flushing and overflow, and kernel and JVM thread management increases. As your threads hog the CPU, their kernel priorities drop to some minimum and they reach the minimum time slice. As more and more threads crowd memory, they overflow the various internal CPU caches, so there is a higher chance that the CPU will need to bring a job's data back in from slower memory. Inside the JVM there is more local mutex contention and probably some (maybe small) incremental per-thread and object-bandwidth GC overhead. Depending on how synchronized your user tasks are, more threads also cause more memory flushing and lock contention.

    With any program and any architecture, there is a sweet spot where threads make optimal use of the available processor and I/O resources while limiting kernel and JVM overhead. Finding that sweet spot will take a number of iterations and some guesswork.

    I would recommend using Executors.newFixedThreadPool(SOME_NUMBER) and submitting your jobs to it. Then you can do multiple runs, varying the number of threads up and down, until you find the optimal number of threads running simultaneously for the work and the architecture of the box.

    Understand, however, that the optimal number of threads will vary with the number of processors and with other factors that may be non-trivial to determine. More threads may be needed if they block on disk or network I/O; fewer if the work they do is mostly CPU-bound.
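
    As a starting point for those trial runs, a commonly cited rule of thumb (it appears in Java Concurrency in Practice, not in this answer) is threads = cores * target utilization * (1 + wait time / compute time). The sketch below only illustrates that arithmetic. The class name PoolSizing, the method threadsFor, and the 9:1 wait-to-compute ratio are assumptions made up for the example; the real ratio has to be measured for your own tasks.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolSizing {

    /**
     * Rule-of-thumb pool size: cores * targetUtilization * (1 + wait/compute).
     * waitToComputeRatio is the average time a task spends blocked divided by
     * the average time it spends computing; it must be measured, not guessed.
     */
    static int threadsFor(int cores, double targetUtilization, double waitToComputeRatio) {
        if (cores < 1 || targetUtilization <= 0 || targetUtilization > 1 || waitToComputeRatio < 0) {
            throw new IllegalArgumentException("invalid sizing inputs");
        }
        long threads = Math.round(cores * targetUtilization * (1 + waitToComputeRatio));
        return (int) Math.max(1, threads);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // Purely CPU-bound work: no waiting, so roughly one thread per core.
        int cpuBound = threadsFor(cores, 1.0, 0.0);

        // Hypothetical I/O-heavy work: 90 ms blocked for every 10 ms of computing.
        int ioHeavy = threadsFor(cores, 1.0, 9.0);

        System.out.println("cores=" + cores + " cpuBound=" + cpuBound + " ioHeavy=" + ioHeavy);

        // The result is only a first guess to feed into the fixed pool.
        ExecutorService pool = Executors.newFixedThreadPool(ioHeavy);
        pool.shutdown();
    }
}
```

    Treat the number this produces as the first value to try, then vary it up and down as described above.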

  • 2021-02-11 03:23

    If you have too many simultaneous compute-intensive tasks in parallel threads, you reach the point of diminishing returns very quickly. In fact, if there are N processors (cores), then you don't want more than N such threads. Now, if the tasks occasionally pause for I/O or user interaction, then the right number can be somewhat larger. But in general, if at any one moment there are more threads that want to do computation than there are cores available, then your program is wasting time on context switches -- i.e., the scheduling is costing you.

  • 2021-02-11 03:26

    The fact that your CPUs are running at 100% does not tell you much about how much useful work they are doing. In your case, you are using more threads than cores, so the 100% includes some context switching, and the extra threads use memory unnecessarily (a small impact for 100 threads), which is sub-optimal.

    For CPU-intensive tasks, I generally use this idiom:

    private final int NUM_THREADS = Runtime.getRuntime().availableProcessors() + 1;
    private final ExecutorService executor = Executors.newFixedThreadPool(NUM_THREADS);
    

    Using more threads, as others have indicated, only introduces unnecessary context switching.

    Obviously if the tasks do some I/O and other blocking operations, this is not applicable and a larger pool would make sense.
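
    One way to handle mixed workloads, sketched here under assumptions rather than taken from the original answer, is to keep the CPU-bound pool at cores + 1 and run the blocking steps on a separate, larger pool. The names SplitPools, fetch and crunch are hypothetical, the blocking call is simulated with Thread.sleep, and the I/O pool size of 32 is an arbitrary placeholder that would need tuning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SplitPools {

    // Hypothetical blocking step (network or disk read), simulated with sleep.
    static String fetch(int id) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "payload-" + id;
    }

    // Hypothetical CPU-bound step: repeated hashing of the payload.
    static int crunch(String payload) {
        int hash = 0;
        for (int round = 0; round < 1_000; round++) {
            for (int i = 0; i < payload.length(); i++) {
                hash = 31 * hash + payload.charAt(i);
            }
        }
        return hash;
    }

    static int[] process(int taskCount, int ioThreads) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService cpuPool = Executors.newFixedThreadPool(cores + 1);
        ExecutorService ioPool = Executors.newFixedThreadPool(ioThreads);
        try {
            List<CompletableFuture<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                futures.add(CompletableFuture
                        .supplyAsync(() -> fetch(id), ioPool)           // blocking part
                        .thenApplyAsync(SplitPools::crunch, cpuPool));  // compute part
            }
            int[] results = new int[taskCount];
            for (int i = 0; i < taskCount; i++) {
                results[i] = futures.get(i).join(); // wait for each pipeline to finish
            }
            return results;
        } finally {
            ioPool.shutdown();
            cpuPool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] results = process(100, 32);
        System.out.println("processed " + results.length + " tasks");
    }
}
```

    With this split, threads that are merely waiting do not occupy slots in the compute pool, so the cores + 1 sizing still applies to the part of the work that actually needs a core.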

    EDIT

    To reply to @MartinJames's comment, I ran a (simplistic) benchmark. The results show that going from a pool size of number of processors + 1 to a pool size of 100 degrades performance only slightly (let's call it 5%), while going to higher figures (1000 and 10000) does hit performance significantly.

    Results are the average of 10 runs:
    Pool size: 9: 238 ms. //(NUM_CORES+1)
    Pool size: 100: 245 ms.
    Pool size: 1000: 319 ms.
    Pool size: 10000: 2482 ms.

    Code:

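    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class Test {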
    
        private final static int NUM_CORES = Runtime.getRuntime().availableProcessors();
        private static long count;
        private static Runnable r = new Runnable() {
    
            @Override
            public void run() {
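                int count = 0; // local accumulator; the int sum overflows, which is harmless for timing
                for (int i = 0; i < 100_000; i++) {
                    count += i;
                }
                Test.count += count; // unsynchronized update of a shared field (racy); the total is never reported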
            }
        };
    
        public static void main(String[] args) throws Exception {
            //warmup
            runWith(10);
    
            //test
            runWith(NUM_CORES + 1);
            runWith(100);
            runWith(1000);
            runWith(10000);
        }
    
        private static void runWith(int poolSize) throws InterruptedException {
            long average = 0;
            for (int run = 0; run < 10; run++) { //run 10 times and take the average
                Test.count = 0;
                ExecutorService executor = Executors.newFixedThreadPool(poolSize);
                long start = System.nanoTime();
                for (int i = 0; i < 50000; i++) {
                    executor.submit(r);
                }
                executor.shutdown();
                executor.awaitTermination(10, TimeUnit.SECONDS);
                long end = System.nanoTime();
                average += ((end - start) / 1000000);
                System.gc();
            }
            System.out.println("Pool size: " + poolSize + ": " + average / 10 + " ms.  ");
        }
    }
    
  • 2021-02-11 03:28

    'Would getting smarter and managing the work load to keep the CPU below 100% get me further faster?'

    Probably not.

    As others have posted, 100 threads is too many for a threadpool if most of the tasks are CPU-intensive. It won't make much difference to performance on typical systems - with that much overload it will be bad with 4 threads and bad with 400.

    How did you decide on 100 threads? Why not 16, say?

    'The number of threads is not massive, say up to 100' - does it vary? Just create 16 at startup and stop managing them - just pass the queue to them and forget about them.
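
    The "create the workers once and hand them the queue" idea can be sketched as follows. This is a minimal illustration, not the asker's code: the class FixedWorkers, its method names, and the poison-pill shutdown are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class FixedWorkers {

    // Sentinel job that tells a worker to exit; one is enqueued per worker at shutdown.
    private static final Runnable POISON = () -> { };

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final List<Thread> workers = new ArrayList<>();

    // Threads are created once, here, and never again.
    FixedWorkers(int workerCount) {
        for (int i = 0; i < workerCount; i++) {
            Thread t = new Thread(this::drain, "worker-" + i);
            t.start();
            workers.add(t);
        }
    }

    private void drain() {
        try {
            while (true) {
                Runnable job = queue.take(); // blocks while the queue is empty
                if (job == POISON) {
                    return;
                }
                try {
                    job.run();
                } catch (RuntimeException e) {
                    // A failing job must not kill the worker thread.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void submit(Runnable job) {
        queue.add(job);
    }

    // Lets queued jobs finish, then stops every worker and waits for it.
    void shutdownAndWait() {
        for (int i = 0; i < workers.size(); i++) {
            queue.add(POISON);
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        FixedWorkers pool = new FixedWorkers(16);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 1_000; i++) {
            pool.submit(done::incrementAndGet);
        }
        pool.shutdownAndWait();
        System.out.println("completed " + done.get() + " jobs");
    }
}
```

    In practice Executors.newFixedThreadPool(16) gives you the same structure (a fixed set of threads fed from a queue) without hand-written worker code; the sketch only makes the mechanism visible.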

    Horrible thought - you aren't creating a new thread for each task, are you?
