Array allocation and access on the Java Virtual Machine and memory contention

刺人心 2021-01-31 11:07

Observe the following definition of a thread subclass (the entire runnable Java source file is included at the end of the question for your convenience):

final c         


        
2 answers
  •  慢半拍i (original poster)
    2021-01-31 11:34

    I believe you need to reduce your code so that it's not doing lots of incidental things which could be confusing matters. After reducing the code, it is clear to me that you are only accessing the same array location every time, i.e. position 512.

    If you minimise your code and reuse your threads, so that you are not stopping and starting them, you get much more reproducible results.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    
    public class MultiStackJavaExperiment {
        static final int size = Integer.getInteger("size", 500000000);
    
        public static void main(String... args) throws ExecutionException, InterruptedException {
            int par = 8;
            for (int s = 64; s <= 64 * 1024; s *= 2) {
                int times = args.length == 0 ? 1 : Integer.parseInt(args[0]);
                long[] measurements = new long[times];
    
                ExecutorService es = Executors.newFixedThreadPool(par);
                List<Future<?>> futures = new ArrayList<Future<?>>(times);
                for (int i = 0; i < times; i++) {
                    long start = System.currentTimeMillis();
                    final int sz = size / par;
                    futures.clear();
                    for (int j = 0; j < par; j++) {
                        final Object[] arr = new Object[s];
                        futures.add(es.submit(new Runnable() {
                            @Override
                            public void run() {
                                final int bits = 7, arraySize = 1 << bits;
                                int i = 0;
                                int pos = 32;
                                Object v = new Object();
                                while (i < sz) {
                                    if (i % 2 == 0) {
                                        arr[pos] = v;
                                        pos += 1;
                                    } else {
                                        pos -= 1;
                                        v = arr[pos];
                                    }
                                    i++;
                                }
                            }
                        }));
                    }
                    for (Future<?> future : futures)
                        future.get();
    
                    long time = System.currentTimeMillis() - start;
    //                System.out.println(i + ") Running time: " + time + " ms");
                    measurements[i] = time;
                }
                es.shutdown();
                System.out.println("par = " + par + " arr.length= "+ s  + " >>> All running times: " + Arrays.toString(measurements));
            }
        }
    }
    

    This shows that the distance between the accessed values matters. By allocating an array in each thread, you use different TLABs (which space out the data in blocks).

    par = 8 arr.length= 64 >>> All running times: [539, 413, 444, 444, 457, 444, 456]
    par = 8 arr.length= 256 >>> All running times: [398, 527, 514, 529, 445, 441, 445]
    par = 8 arr.length= 1024 >>> All running times: [419, 507, 477, 422, 412, 452, 396]
    par = 8 arr.length= 4096 >>> All running times: [316, 282, 250, 232, 242, 229, 238]
    par = 8 arr.length= 16384 >>> All running times: [316, 207, 209, 212, 208, 208, 208]
    par = 8 arr.length= 65536 >>> All running times: [211, 211, 208, 208, 208, 291, 206]
    par = 8 arr.length= 262144 >>> All running times: [366, 210, 210, 210, 210, 209, 211]
    par = 8 arr.length= 1048576 >>> All running times: [296, 211, 215, 216, 213, 211, 211]
    

    If you move the array inside the thread, you get:

    par = 8 arr.length= 64 >>> All running times: [225, 151, 151, 150, 152, 153, 152]
    par = 8 arr.length= 256 >>> All running times: [155, 151, 151, 151, 151, 151, 155]
    par = 8 arr.length= 1024 >>> All running times: [153, 152, 151, 151, 151, 155, 152]
    par = 8 arr.length= 4096 >>> All running times: [155, 156, 151, 152, 151, 155, 155]
    par = 8 arr.length= 16384 >>> All running times: [154, 157, 152, 152, 158, 153, 153]
    par = 8 arr.length= 65536 >>> All running times: [155, 157, 152, 184, 181, 154, 153]
    par = 8 arr.length= 262144 >>> All running times: [240, 159, 166, 151, 172, 154, 160]
    par = 8 arr.length= 1048576 >>> All running times: [165, 162, 163, 162, 163, 162, 163]
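
    "Moving the array inside the thread" presumably means allocating it at the top of run() rather than in main. A minimal sketch of that variant follows; the class name PerThreadArrayDemo, the helper runOnce and its parameters are hypothetical and not part of the original benchmark, and the timings it prints will differ by machine and JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadArrayDemo {
    // Hypothetical helper: submits `par` tasks to an existing pool, waits for
    // all of them, and returns the elapsed wall-clock time in milliseconds.
    static long runOnce(ExecutorService es, int par, final int length, final int iterations) {
        List<Future<?>> futures = new ArrayList<Future<?>>(par);
        long start = System.nanoTime();
        try {
            for (int j = 0; j < par; j++) {
                futures.add(es.submit(new Runnable() {
                    @Override
                    public void run() {
                        // The array is allocated by the worker thread itself,
                        // not by the main thread as in the listing above.
                        final Object[] arr = new Object[length];
                        Object v = new Object();
                        int pos = 32; // length must be greater than 33
                        for (int i = 0; i < iterations; i++) {
                            if (i % 2 == 0) {
                                arr[pos] = v;   // write slot 32
                                pos += 1;
                            } else {
                                pos -= 1;
                                v = arr[pos];   // read slot 32 back
                            }
                        }
                    }
                }));
            }
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return (System.nanoTime() - start) / 1000000L;
    }

    public static void main(String[] args) {
        int par = 8;
        // The pool is created once and reused for every array length.
        ExecutorService es = Executors.newFixedThreadPool(par);
        try {
            for (int s = 64; s <= 64 * 1024; s *= 2) {
                long ms = runOnce(es, par, s, 10000000);
                System.out.println("par = " + par + " arr.length= " + s + " >>> " + ms + " ms");
            }
        } finally {
            es.shutdown();
        }
    }
}
```

    The only change from the main-allocated version is where new Object[length] executes; the access pattern is unchanged.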
    

    Turn off the TLAB with -XX:-UseTLAB and the same code gives you:

    par = 8 arr.length= 64 >>> All running times: [608, 467, 467, 457, 468, 461, 428]
    par = 8 arr.length= 256 >>> All running times: [437, 437, 522, 512, 522, 369, 535]
    par = 8 arr.length= 1024 >>> All running times: [394, 395, 475, 525, 470, 440, 478]
    par = 8 arr.length= 4096 >>> All running times: [347, 215, 238, 226, 236, 204, 271]
    par = 8 arr.length= 16384 >>> All running times: [291, 157, 178, 151, 150, 151, 152]
    par = 8 arr.length= 65536 >>> All running times: [163, 152, 162, 151, 159, 159, 154]
    par = 8 arr.length= 262144 >>> All running times: [164, 172, 152, 169, 160, 161, 160]
    par = 8 arr.length= 1048576 >>> All running times: [295, 153, 164, 153, 166, 154, 163]
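
    When comparing runs with and without -XX:-UseTLAB, it can help to print whether the flag actually reached the JVM. The sketch below is one way to do that with the standard RuntimeMXBean; the class TlabFlagCheck and the method tlabDisabled are hypothetical names. It assumes the TLAB is enabled unless the flag is passed, and it only looks at the arguments the runtime reports, so it does not read the JVM's effective setting directly.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class TlabFlagCheck {
    // Hypothetical helper: returns true if the last UseTLAB flag in the
    // argument list turns the TLAB off. Assumes the TLAB is on by default.
    static boolean tlabDisabled(List<String> jvmArgs) {
        boolean disabled = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:-UseTLAB")) {
                disabled = true;
            } else if (arg.equals("-XX:+UseTLAB")) {
                disabled = false; // a later flag overrides an earlier one
            }
        }
        return disabled;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("TLAB disabled by flag: " + tlabDisabled(jvmArgs));
    }
}
```

    Printing this line next to the timings makes it harder to mix up the two sets of results.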
    
