Java thread creation overhead

走了就别回头了 2020-11-29 01:01

Conventional wisdom tells us that high-volume enterprise Java applications should use thread pooling in preference to spawning new worker threads. The use of java.util

4 answers
  • 2020-11-29 01:29

    Here is an example microbenchmark:

    public class ThreadSpawningPerformanceTest {
    static long test(final int threadCount, final int workAmountPerThread) throws InterruptedException {
        Thread[] tt = new Thread[threadCount];
        final int[] aa = new int[tt.length];
        System.out.print("Creating "+tt.length+" Thread objects... ");
        long t0 = System.nanoTime(), t00 = t0;
        for (int i = 0; i < tt.length; i++) { 
            final int j = i;
            tt[i] = new Thread() {
                public void run() {
                    int k = j;
                    for (int l = 0; l < workAmountPerThread; l++) {
                        k += k*k+l;
                    }
                    aa[j] = k;
                }
            };
        }
        System.out.println(" Done in "+(System.nanoTime()-t0)*1E-6+" ms.");
        System.out.print("Starting "+tt.length+" threads with "+workAmountPerThread+" steps of work per thread... ");
        t0 = System.nanoTime();
        for (int i = 0; i < tt.length; i++) { 
            tt[i].start();
        }
        System.out.println(" Done in "+(System.nanoTime()-t0)*1E-6+" ms.");
        System.out.print("Joining "+tt.length+" threads... ");
        t0 = System.nanoTime();
        for (int i = 0; i < tt.length; i++) { 
            tt[i].join();
        }
        System.out.println(" Done in "+(System.nanoTime()-t0)*1E-6+" ms.");
        long totalTime = System.nanoTime()-t00;
        // Display a checksum so the JVM has no chance to optimize away the contents of run(), or possibly even thread creation.
        int checkSum = 0;
        for (int a : aa) {
            checkSum += a;
        }
        System.out.println("Checksum: "+checkSum);
        System.out.println("Total time: "+totalTime*1E-6+" ms");
        System.out.println();
        return totalTime;
    }
    
    public static void main(String[] kr) throws InterruptedException {
        int workAmount = 100000000;
        int[] threadCount = new int[]{1, 2, 10, 100, 1000, 10000, 100000};
        int trialCount = 2;
        long[][] time = new long[threadCount.length][trialCount];
        for (int j = 0; j < trialCount; j++) {
            for (int i = 0; i < threadCount.length; i++) {
                time[i][j] = test(threadCount[i], workAmount/threadCount[i]); 
            }
        }
        System.out.print("Number of threads ");
        for (long t : threadCount) {
            System.out.print("\t"+t);
        }
        System.out.println();
        for (int j = 0; j < trialCount; j++) {
            System.out.print((j+1)+". trial time (ms)");
            for (int i = 0; i < threadCount.length; i++) {
                System.out.print("\t"+Math.round(time[i][j]*1E-6));
            }
            System.out.println();
        }
    }
    }
    

    The results on 64-bit Windows 7 with Sun's 32-bit Java 1.6.0_21 Client VM on an Intel Core2 Duo E6400 @ 2.13 GHz are as follows:

    Number of threads  1    2    10   100  1000 10000 100000
    1. trial time (ms) 346  181  179  191  286  1229  11308
    2. trial time (ms) 346  181  187  189  281  1224  10651
    

    Conclusions: Two threads do the work almost twice as fast as one, as expected since my computer has two cores. My computer can spawn nearly 10000 threads per second, i.e. the thread creation overhead is about 0.1 milliseconds per thread. Hence, on such a machine, a couple of hundred new threads per second pose a negligible overhead (as can also be seen by comparing the numbers in the columns for 2 and 100 threads).
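The arithmetic behind that estimate can be written out as a small sketch. The figures are copied from the first trial in the table above; subtracting the 2-thread run as a "pure work" baseline is an assumption made here for illustration, not something the benchmark itself does, and the class and method names are hypothetical.

```java
class SpawnOverheadEstimate {
    /** Extra cost per thread in ms, relative to a baseline run that did the same total work. */
    static double perThreadOverheadMs(long totalMs, long baselineMs, int threadCount) {
        return (double) (totalMs - baselineMs) / threadCount;
    }

    public static void main(String[] args) {
        // Numbers taken from the first trial in the results table.
        long baselineMs = 181;        // 2 threads: both cores busy, negligible spawning cost (assumption)
        long manyThreadsMs = 11308;   // 100000 threads doing the same total amount of work
        double overheadMs = perThreadOverheadMs(manyThreadsMs, baselineMs, 100000);
        System.out.printf("Overhead per thread: %.3f ms%n", overheadMs);
        System.out.printf("Threads per second:  %.0f%n", 1000.0 / overheadMs);
    }
}
```

This gives roughly 0.11 ms per thread, i.e. on the order of 9000 threads per second, which is consistent with the "nearly 10000" figure quoted above.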

  • 2020-11-29 01:39

    First of all, this will of course depend very much on which JVM you use. The OS will also play an important role. Assuming the Sun JVM (Hm, do we still call it that?):

    One major factor is the stack memory allocated to each thread, which you can tune using the -Xss JVM parameter (for example -Xss256k) - you'll want to use the lowest value you can get away with.
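Besides the JVM-wide option, the stack size can also be requested per thread through the four-argument `Thread` constructor. The following is a minimal sketch of that alternative; the class name, thread name and the 256 KB figure are illustrative assumptions, and the JVM is free to treat the requested size as a hint (it may round it up or ignore it on some platforms).

```java
class SmallStackThreadDemo {
    /** Runs the task on a new thread that requests the given stack size, then waits for it. */
    static boolean runWithStackSize(Runnable task, long stackSizeBytes) {
        // ThreadGroup null = use the current thread's group; stackSizeBytes is only a suggestion.
        Thread t = new Thread(null, task, "small-stack-worker", stackSizeBytes);
        t.start();
        try {
            t.join();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        final int[] result = new int[1];
        boolean finished = runWithStackSize(() -> result[0] = 6 * 7, 256 * 1024);
        System.out.println("finished=" + finished + ", result=" + result[0]);
    }
}
```

A smaller stack only helps if the worker code does not recurse deeply; too small a value shows up as StackOverflowError in the worker.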

    And this is just a guess, but I think "a couple of hundred new threads every second" is definitely beyond what the JVM is designed to handle comfortably. I suspect that a simple benchmark will quickly reveal quite unsubtle problems.

  • 2020-11-29 01:40
    • For your benchmark you can use JMeter plus a profiler, which should give you a direct overview of the behaviour in such a heavily loaded environment. Just let it run for an hour and monitor memory, CPU, etc. If nothing breaks and the CPU doesn't overheat, it's OK :)

    • Perhaps you can get a thread pool, or customize (extend) the one you are using, by adding some code so that the appropriate InheritableThreadLocals are set each time a Thread is acquired from the thread pool. Each Thread has these package-private properties:

      /* ThreadLocal values pertaining to this thread. This map is maintained
       * by the ThreadLocal class. */
      ThreadLocal.ThreadLocalMap threadLocals = null;
      
      /*
       * InheritableThreadLocal values pertaining to this thread. This map is
       * maintained by the InheritableThreadLocal class.  
       */ 
      ThreadLocal.ThreadLocalMap inheritableThreadLocals = null;
      

      You can use these (well, with reflection) in combination with Thread.currentThread() to get the desired behaviour. However, this is a bit ad hoc, and furthermore, I can't tell whether it (with the reflection) would introduce even more overhead than just creating the threads.
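A different way to reach the same goal without reflection is to wrap each task so that it captures the submitting thread's value and installs it in the pooled worker for the duration of the task. This is a sketch of that swapped-in technique, not the reflection approach described above; `REQUEST_ID`, `withCallerContext` and `valueSeenByWorker` are hypothetical names standing in for whatever context the application keeps in its thread locals.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ContextPropagationDemo {
    // Hypothetical per-request context; stands in for the application's (Inheritable)ThreadLocals.
    static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    /** Captures the caller's value now and installs it in the worker while the task runs. */
    static Runnable withCallerContext(Runnable task) {
        final String captured = REQUEST_ID.get(); // read on the submitting thread
        return () -> {
            String previous = REQUEST_ID.get();   // whatever the pooled thread held before
            REQUEST_ID.set(captured);
            try {
                task.run();
            } finally {
                // Restore the worker's state so values do not leak between tasks.
                if (previous == null) {
                    REQUEST_ID.remove();
                } else {
                    REQUEST_ID.set(previous);
                }
            }
        };
    }

    /** Submits one wrapped task to a small pool and returns the value the worker observed. */
    static String valueSeenByWorker(String requestId) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            REQUEST_ID.set(requestId);
            final String[] seen = new String[1];
            Future<?> done = pool.submit(withCallerContext(() -> seen[0] = REQUEST_ID.get()));
            done.get(); // wait for the task; also makes the write to seen[0] visible here
            return seen[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            REQUEST_ID.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Worker saw: " + valueSeenByWorker("req-42"));
    }
}
```

The wrapper costs a couple of ThreadLocal reads and writes per task, which is likely cheaper than either reflection or creating a fresh thread, though that is worth measuring in the target environment.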

  • 2020-11-29 01:54

    I am wondering whether it is necessary to spawn new threads on each user request if their typical life cycle is as short as a second. Could you use some kind of notify/wait queue, where you spawn a given number of (daemon) threads and they all wait until there's a task to solve? If the task queue gets long, you spawn additional threads, but not at a 1:1 ratio. It will most likely perform better than spawning hundreds of new threads whose life cycles are so short.
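The standard `ThreadPoolExecutor` behaves much like this when given a bounded queue: core threads wait for tasks, and extra threads are only started once the queue is full. Below is a minimal sketch under assumed sizes (4 core threads, 16 maximum, a queue of 100, 30 seconds idle timeout); the class and method names are hypothetical, and the numbers would need tuning for a real workload.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ElasticWorkerPoolDemo {
    /** Builds a pool of daemon workers that grows past the core size only when the queue is full. */
    static ThreadPoolExecutor newPool(int coreThreads, int maxThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
                coreThreads, maxThreads,
                30L, TimeUnit.SECONDS,                       // idle extra threads are retired after 30 s
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                r -> {
                    Thread t = new Thread(r);
                    t.setDaemon(true);                       // daemon workers, as suggested above
                    return t;
                },
                // If both queue and pool are saturated, the submitting thread runs the task itself.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Pushes taskCount trivial tasks through the pool and returns how many completed. */
    static int runTasks(int taskCount) {
        ThreadPoolExecutor pool = newPool(4, 16, 100);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown();                                     // accept no new tasks, finish queued ones
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed tasks: " + runTasks(1000));
    }
}
```

With this setup at most 16 threads ever exist, regardless of how many requests arrive per second, and the caller-runs policy applies back-pressure instead of dropping work.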
