Efficiency of Java “Double Brace Initialization”?

清歌不尽 2020-11-21 15:35

In Hidden Features of Java the top answer mentions Double Brace Initialization, with a very enticing syntax:

Set<String> flavors = new HashSet<String>() {{
    add("vanilla");
    add("strawberry");
    add("chocolate");
    add("butter pecan");
}};

15 Answers
  • 2020-11-21 16:22

    leak prone

    I've decided to chime in. The performance impact includes disk access plus unzipping (for a jar), class verification, and perm-gen space (on Sun's HotSpot JVM). Worst of all, though, it is leak prone: you can't simply return the set.

    Set<String> getFlavors(){
      return Collections.unmodifiableSet(flavors);
    }
    

    So if the set escapes to any other part loaded by a different classloader and a reference is kept there, the entire tree of classes plus classloader will be leaked. To avoid that, a copy into a plain collection is necessary: new LinkedHashSet(new ArrayList(){{add("xxx");add("yyy");}}). Not so cute any more. I don't use the idiom myself; instead I write new LinkedHashSet(Arrays.asList("xxx","YYY"));
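
    A minimal sketch of the copy described above. The class and method names (FlavorHolder, getFlavorsWrapped, getFlavorsCopied, plainFlavors) are hypothetical and only serve the illustration. The wrapped getter still hands out the anonymous subclass, whereas the copying getter exposes only JDK classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical holder class, for illustration only.
class FlavorHolder {
    // Double brace initialization: the runtime type is an anonymous subclass of HashSet.
    private final Set<String> flavors = new HashSet<String>() {{
        add("vanilla");
        add("chocolate");
    }};

    // Wraps the anonymous instance, so the anonymous class still escapes to callers.
    Set<String> getFlavorsWrapped() {
        return Collections.unmodifiableSet(flavors);
    }

    // Copies the elements into a plain LinkedHashSet first; only JDK classes escape.
    Set<String> getFlavorsCopied() {
        return Collections.unmodifiableSet(new LinkedHashSet<>(flavors));
    }

    // The idiom-free alternative mentioned in the answer.
    static Set<String> plainFlavors() {
        return new LinkedHashSet<>(Arrays.asList("vanilla", "chocolate"));
    }

    public static void main(String[] args) {
        FlavorHolder holder = new FlavorHolder();
        System.out.println(holder.getFlavorsCopied());
        System.out.println(plainFlavors().getClass().getName());
    }
}
```

    Both getters return sets with the same contents; the difference is only in which class ends up referenced by the caller.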

  • 2020-11-21 16:23

    Mario Gleichman describes how to use Java 1.5 generic functions to simulate Scala List literals, though sadly you wind up with fixed-size Lists (Arrays.asList returns a list that rejects add and remove).

    He defines this class:

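    package literal;

    import java.util.Arrays;
    import java.util.List;

    public class collection {
        // Generic varargs factory so call sites read like a list literal
        public static <T> List<T> List(T... elems) {
            return Arrays.asList(elems);
        }
    }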
    

    and uses it thusly:

    import java.util.List;

    import static literal.collection.List;
    
    public class CollectionDemo {
        public void demoList(){
            List<String> slist = List( "a", "b", "c" );
            List<Integer> iList = List( 1, 2, 3 );
            for( String elem : List( "a", "java", "list" ) )
                System.out.println( elem );
        }
    }
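
    The list that Arrays.asList returns is fixed-size rather than fully immutable: replacing an element works, but adding or removing one throws. A small sketch of that behaviour follows; the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FixedSizeListDemo {
    // Returns true if the list refuses to grow.
    static boolean addIsRejected(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Copies into an ArrayList when a growable list is needed.
    static List<String> growableCopy(List<String> list) {
        return new ArrayList<>(list);
    }

    public static void main(String[] args) {
        List<String> fixed = Arrays.asList("a", "b", "c");
        fixed.set(0, "z"); // allowed: replaces an element in the backing array
        System.out.println("add rejected: " + addIsRejected(fixed));

        List<String> growable = growableCopy(fixed);
        growable.add("d"); // fine on the copy
        System.out.println(growable);
    }
}
```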
    

    Google Collections, now part of Guava, supports a similar idea for list construction. In this interview, Jared Levy says:

    [...] the most heavily-used features, which appear in almost every Java class I write, are static methods that reduce the number of repetitive keystrokes in your Java code. It's so convenient being able to enter commands like the following:

    Map<OneClassWithALongName, AnotherClassWithALongName> map = Maps.newHashMap();

    List<String> animals = Lists.immutableList("cat", "dog", "horse");

    7/10/2014: If only it could be as simple as Python's:

    animals = ['cat', 'dog', 'horse']

    2/21/2020: Since Java 9 you can say:

    List<String> animals = List.of("cat", "dog", "horse");

  • 2020-11-21 16:24

    Every time someone uses double brace initialisation, a kitten gets killed.

    Apart from the syntax being rather unusual and not really idiomatic (taste is debatable, of course), you are unnecessarily creating two significant problems in your application, which I've just recently blogged about in more detail here.

    1. You're creating way too many anonymous classes

    Each time you use double brace initialisation a new class is made. E.g. this example:

    Map source = new HashMap(){{
        put("firstName", "John");
        put("lastName", "Smith");
        put("organizations", new HashMap(){{
            put("0", new HashMap(){{
                put("id", "1234");
            }});
            put("abc", new HashMap(){{
                put("id", "5678");
            }});
        }});
    }};
    

    ... will produce these classes:

    Test$1$1$1.class
    Test$1$1$2.class
    Test$1$1.class
    Test$1.class
    Test.class
    

    That's quite a bit of overhead for your classloader - for nothing! Of course it won't take much initialisation time if you do it once. But if you do this 20'000 times throughout your enterprise application... all that heap memory just for a bit of "syntax sugar"?
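
    To make the "one class per use" point concrete, here is a small sketch (class and method names are hypothetical). Two textually identical double brace expressions produce two different runtime classes, even though the resulting maps are equal.

```java
import java.util.HashMap;
import java.util.Map;

class AnonymousClassCount {
    static Map<String, String> first() {
        return new HashMap<String, String>() {{
            put("id", "1234");
        }};
    }

    // Same source text as first(), but a separate anonymous class is compiled for it.
    static Map<String, String> second() {
        return new HashMap<String, String>() {{
            put("id", "1234");
        }};
    }

    public static void main(String[] args) {
        Map<String, String> a = first();
        Map<String, String> b = second();
        System.out.println(a.getClass().getName());
        System.out.println(b.getClass().getName());
        System.out.println(a.getClass() == b.getClass()); // false: two distinct classes
        System.out.println(a.equals(b));                  // true: same contents
    }
}
```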

    2. You're potentially creating a memory leak!

    If you take the above code and return that map from a method, callers of that method might be unsuspectingly holding on to very heavy resources that cannot be garbage collected. Consider the following example:

    public class ReallyHeavyObject {
    
        // Just to illustrate...
        private int[] tonsOfValues;
        private Resource[] tonsOfResources;
    
        // This method almost does nothing
        public Map quickHarmlessMethod() {
            Map source = new HashMap(){{
                put("firstName", "John");
                put("lastName", "Smith");
                put("organizations", new HashMap(){{
                    put("0", new HashMap(){{
                        put("id", "1234");
                    }});
                    put("abc", new HashMap(){{
                        put("id", "5678");
                    }});
                }});
            }};
    
            return source;
        }
    }
    

    The returned Map will now contain a reference to the enclosing instance of ReallyHeavyObject. You probably don't want to risk that:

    [Image: "Memory Leak Right Here", from http://blog.jooq.org/2014/12/08/dont-be-clever-the-double-curly-braces-anti-pattern/]
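
    One rough way to check for the hidden reference is to look for the compiler-generated this$N field via reflection. The sketch below is only a diagnostic under stated assumptions: HeavyOwner and holdsOuterReference are hypothetical names, and the initializer deliberately reads a field of the enclosing instance so that the reference has to be kept. Some newer compilers may drop the field when the initializer never touches the enclosing instance, so the result can depend on the compiler version.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the ReallyHeavyObject above.
class HeavyOwner {
    private final int[] tonsOfValues = new int[1_000_000];

    // Double brace version: the initializer reads a field of the enclosing
    // instance, so the anonymous class must keep a reference to it.
    Map<String, Object> leaky() {
        return new HashMap<String, Object>() {{
            put("valueCount", tonsOfValues.length);
        }};
    }

    // Plain version: a regular HashMap has no link back to HeavyOwner.
    Map<String, Object> plain() {
        Map<String, Object> result = new HashMap<>();
        result.put("valueCount", tonsOfValues.length);
        return result;
    }

    // True if the object's class declares a compiler-generated this$N field.
    static boolean holdsOuterReference(Object o) {
        for (Field f : o.getClass().getDeclaredFields()) {
            if (f.isSynthetic() && f.getName().startsWith("this$")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        HeavyOwner owner = new HeavyOwner();
        System.out.println("double brace map: " + holdsOuterReference(owner.leaky()));
        System.out.println("plain map:        " + holdsOuterReference(owner.plain()));
    }
}
```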

    3. You can pretend that Java has map literals

    To answer your actual question, people have been using this syntax to pretend that Java has something like map literals, similar to the existing array literals:

    String[] array = { "John", "Doe" };
    Map map = new HashMap() {{ put("John", "Doe"); }};
    

    Some people may find this syntactically stimulating.

  • 2020-11-21 16:28

    Taking the following test class:

    public class Test {
      public void test() {
        Set<String> flavors = new HashSet<String>() {{
            add("vanilla");
            add("strawberry");
            add("chocolate");
            add("butter pecan");
        }};
      }
    }
    

    and then decompiling the class file, I see:

    public class Test {
      public void test() {
        java.util.Set flavors = new HashSet() {
    
          final Test this$0;
    
          {
            this$0 = Test.this;
            super();
            add("vanilla");
            add("strawberry");
            add("chocolate");
            add("butter pecan");
          }
        };
      }
    }
    

    This doesn't look terribly inefficient to me. If I were worried about performance for something like this, I'd profile it. And your question #2 is answered by the above code: You're inside an implicit constructor (and instance initializer) for your inner class, so "this" refers to this inner class.

    Yes, this syntax is obscure, but a comment can clarify obscure syntax usage. To clarify the syntax, most people are familiar with a static initializer block (JLS 8.7 Static Initializers):

    public class Sample1 {
        private static final String someVar;
        static {
            String temp = null;
            // ... block of code setting temp
            someVar = temp;
        }
    }
    

    You can also use a similar syntax (without the word "static") for constructor usage (JLS 8.6 Instance Initializers), although I have never seen this used in production code. This is much less commonly known.

    public class Sample2 {
        private final String someVar;
    
        // This is an instance initializer
        {
            String temp = null;
            // ... block of code setting temp
            someVar = temp;
        }
    }
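
    To see when an instance initializer runs relative to the constructors, here is a small sketch; the class names are hypothetical. It records the order of the superclass constructor, the initializer block, and the constructor body.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderBase {
    static final List<String> LOG = new ArrayList<>();

    InitOrderBase() {
        LOG.add("super constructor");
    }
}

class InitOrderDemo extends InitOrderBase {
    // Instance initializer: runs after super() returns, before the constructor body.
    {
        LOG.add("instance initializer");
    }

    InitOrderDemo() {
        LOG.add("constructor body");
    }

    public static void main(String[] args) {
        new InitOrderDemo();
        // prints [super constructor, instance initializer, constructor body]
        System.out.println(LOG);
    }
}
```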
    

    The compiler copies an instance initializer block into every constructor of the class, right after the call to super(); if you declare no constructor at all, it ends up in the default constructor that the compiler generates. With this in mind, unravel the double brace code:

    public void test() {
      Set<String> flavors = new HashSet<String>() {
          {
            add("vanilla");
            add("strawberry");
            add("chocolate");
            add("butter pecan");
          }
      };
    }
    

    The block of code between the inner-most braces is turned into a constructor by the compiler. The outer-most braces delimit the anonymous inner class. To take this the final step of making everything non-anonymous:

    public void test() {
      Set<String> flavors = new MyHashSet();
    }
    
    class MyHashSet extends HashSet<String> {
        public MyHashSet() {
            add("vanilla");
            add("strawberry");
            add("chocolate");
            add("butter pecan");
        }
    }
    

    For initialization purposes, I'd say there is no overhead whatsoever (or so small that it can be neglected). However, every use of flavors will go not against HashSet but instead against MyHashSet. There is probably a small (and quite possibly negligible) overhead to this. But again, before I worried about it, I would profile it.

    Again, to your question #2, the above code is the logical and explicit equivalent of double brace initialization, and it makes it obvious where "this" refers: To the inner class that extends HashSet.
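
    A short sketch of what "this" means inside the initializer (ThisDemo and describe are hypothetical names). Unqualified calls and a bare "this" go to the anonymous HashSet subclass, while the enclosing object is reachable only through the qualified form.

```java
import java.util.HashSet;
import java.util.Set;

class ThisDemo {
    Set<String> describe() {
        return new HashSet<String>() {{
            // "this" is the anonymous subclass instance, whose superclass is HashSet.
            add("this is a " + this.getClass().getSuperclass().getSimpleName() + " subclass");
            // The enclosing object needs the qualified form.
            add("outer is a " + ThisDemo.this.getClass().getSimpleName());
        }};
    }

    public static void main(String[] args) {
        for (String line : new ThisDemo().describe()) {
            System.out.println(line);
        }
    }
}
```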

    If you have questions about the details of instance initializers, check out the details in the JLS documentation.

  • 2020-11-21 16:28

    There's generally nothing particularly inefficient about it. It doesn't generally matter to the JVM that you've made a subclass and added a constructor to it: that's a normal, everyday thing to do in an object-oriented language. I can think of quite contrived cases where you could cause an inefficiency by doing this (e.g. you have a repeatedly-called method that ends up taking a mixture of different classes because of this subclass, whereas ordinarily the class passed in would be totally predictable; in the latter case, the JIT compiler could make optimisations that are not feasible in the first). But really, I think the cases where it'll matter are very contrived.

    I'd see the issue more from the point of view of whether you want to "clutter things up" with lots of anonymous classes. As a rough guide, consider using the idiom no more than you'd use, say, anonymous classes for event handlers.

    In (2), you're inside the constructor of an object, so "this" refers to the object you're constructing. That's no different to any other constructor.

    As for (3), that really depends on who's maintaining your code, I guess. If you don't know this in advance, then a benchmark that I would suggest using is "do you see this in the source code to the JDK?" (in this case, I don't recall seeing many anonymous initialisers, and certainly not in cases where that's the only content of the anonymous class). In most moderately sized projects, I'd argue you're really going to need your programmers to understand the JDK source at some point or other, so any syntax or idiom used there is "fair game". Beyond that, I'd say, train people on that syntax if you have control of who's maintaining the code, else comment or avoid.

  • 2020-11-21 16:29

    Here's the problem when I get too carried away with anonymous inner classes:

    2009/05/27  16:35             1,602 DemoApp2$1.class
    2009/05/27  16:35             1,976 DemoApp2$10.class
    2009/05/27  16:35             1,919 DemoApp2$11.class
    2009/05/27  16:35             2,404 DemoApp2$12.class
    2009/05/27  16:35             1,197 DemoApp2$13.class
    
    /* snip */
    
    2009/05/27  16:35             1,953 DemoApp2$30.class
    2009/05/27  16:35             1,910 DemoApp2$31.class
    2009/05/27  16:35             2,007 DemoApp2$32.class
    2009/05/27  16:35               926 DemoApp2$33$1$1.class
    2009/05/27  16:35             4,104 DemoApp2$33$1.class
    2009/05/27  16:35             2,849 DemoApp2$33.class
    2009/05/27  16:35               926 DemoApp2$34$1$1.class
    2009/05/27  16:35             4,234 DemoApp2$34$1.class
    2009/05/27  16:35             2,849 DemoApp2$34.class
    
    /* snip */
    
    2009/05/27  16:35               614 DemoApp2$40.class
    2009/05/27  16:35             2,344 DemoApp2$5.class
    2009/05/27  16:35             1,551 DemoApp2$6.class
    2009/05/27  16:35             1,604 DemoApp2$7.class
    2009/05/27  16:35             1,809 DemoApp2$8.class
    2009/05/27  16:35             2,022 DemoApp2$9.class
    

    These are all classes which were generated when I was making a simple application, and used copious amounts of anonymous inner classes -- each class will be compiled into a separate class file.

    The "double brace initialization", as already mentioned, is an anonymous inner class with an instance initialization block, which means that a new class is created for each "initialization", all for the purpose of usually making a single object.

    Considering that the Java Virtual Machine will need to read all those classes when using them, that can lead to some time spent in the bytecode verification process and such. Not to mention the increase in the disk space needed to store all those class files.

    It seems as if there is a bit of overhead when utilizing double-brace initialization, so it's probably not such a good idea to go too overboard with it. But as Eddie has noted in the comments, it's not possible to be absolutely sure of the impact.


    Just for reference, double brace initialization is the following:

    List<String> list = new ArrayList<String>() {{
        add("Hello");
        add("World!");
    }};
    

    It looks like a "hidden" feature of Java, but it is just a rewrite of:

    List<String> list = new ArrayList<String>() {
    
        // Instance initialization block
        {
            add("Hello");
            add("World!");
        }
    };
    

    So it's basically an instance initialization block that is part of an anonymous inner class.


    Joshua Bloch's Collection Literals proposal for Project Coin was along the lines of:

    List<Integer> intList = [1, 2, 3, 4];
    
    Set<String> strSet = {"Apple", "Banana", "Cactus"};
    
    Map<String, Integer> truthMap = { "answer" : 42 };
    

    Sadly, it didn't make its way into either Java 7 or 8 and was shelved indefinitely.
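
    Later Java releases (9 and up) added static factory methods that cover much of the same ground as the proposed literals. The sketch below maps the three examples onto List.of, Set.of and Map.of; the class and constant names are hypothetical. Note that these collections are unmodifiable, which the proposal's literals need not have been.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Requires Java 9 or later.
class FactoryMethodsDemo {
    static final List<Integer> INT_LIST = List.of(1, 2, 3, 4);
    static final Set<String> STR_SET = Set.of("Apple", "Banana", "Cactus");
    static final Map<String, Integer> TRUTH_MAP = Map.of("answer", 42);

    // Returns true if the list refuses modification.
    static boolean isUnmodifiable(List<Integer> list) {
        try {
            list.add(5);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(INT_LIST);
        System.out.println(STR_SET.contains("Banana"));
        System.out.println(TRUTH_MAP.get("answer"));
        System.out.println("unmodifiable: " + isUnmodifiable(INT_LIST));
    }
}
```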


    Experiment

    Here's the simple experiment I've tested -- make 1000 ArrayLists with the elements "Hello" and "World!" added to them via the add method, using the two methods:

    Method 1: Double Brace Initialization

    List<String> l = new ArrayList<String>() {{
      add("Hello");
      add("World!");
    }};
    

    Method 2: Instantiate an ArrayList and add

    List<String> l = new ArrayList<String>();
    l.add("Hello");
    l.add("World!");
    

    I created a simple program to write out a Java source file to perform 1000 initializations using the two methods:

    Test 1:

    class Test1 {
      public static void main(String[] s) {
        long st = System.currentTimeMillis();
    
        List<String> l0 = new ArrayList<String>() {{
          add("Hello");
          add("World!");
        }};
    
        List<String> l1 = new ArrayList<String>() {{
          add("Hello");
          add("World!");
        }};
    
        /* snip */
    
        List<String> l999 = new ArrayList<String>() {{
          add("Hello");
          add("World!");
        }};
    
        System.out.println(System.currentTimeMillis() - st);
      }
    }
    

    Test 2:

    class Test2 {
      public static void main(String[] s) {
        long st = System.currentTimeMillis();
    
        List<String> l0 = new ArrayList<String>();
        l0.add("Hello");
        l0.add("World!");
    
        List<String> l1 = new ArrayList<String>();
        l1.add("Hello");
        l1.add("World!");
    
        /* snip */
    
        List<String> l999 = new ArrayList<String>();
        l999.add("Hello");
        l999.add("World!");
    
        System.out.println(System.currentTimeMillis() - st);
      }
    }
    

    Please note that the elapsed time to initialize the 1000 ArrayLists and the 1000 anonymous inner classes extending ArrayList is measured with System.currentTimeMillis, so the timer does not have a very high resolution. On my Windows system, the resolution is around 15-16 milliseconds.

    The results for 10 runs of the two tests were the following:

    Test1 Times (ms)           Test2 Times (ms)
    ----------------           ----------------
               187                          0
               203                          0
               203                          0
               188                          0
               188                          0
               187                          0
               203                          0
               188                          0
               188                          0
               203                          0
    

    As can be seen, the double brace initialization has a noticeable execution time of around 190 ms.

    Meanwhile, the ArrayList initialization execution time came out to be 0 ms. Of course, the timer resolution should be taken into account, but it is likely to be under 15 ms.

    So, there seems to be a noticeable difference in the execution time of the two methods. It does appear that double brace initialization indeed carries some overhead compared with plain initialization.

    And yes, there were 1000 .class files generated by compiling the Test1 double brace initialization test program.
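
    For a finer-grained look than System.currentTimeMillis allows, a rough sketch using System.nanoTime is shown below. It is not the generated 1000-class test above: it uses a single double brace site and hypothetical names (FirstUseTiming, timeNanos), and it only aims to show that the first call, which includes loading the anonymous class, can be timed separately from later calls. The numbers vary between machines and runs, and a proper benchmark harness would be needed for anything rigorous.

```java
import java.util.ArrayList;
import java.util.List;

class FirstUseTiming {
    static List<String> doubleBrace() {
        return new ArrayList<String>() {{
            add("Hello");
            add("World!");
        }};
    }

    static List<String> plain() {
        List<String> l = new ArrayList<>();
        l.add("Hello");
        l.add("World!");
        return l;
    }

    // Elapsed nanoseconds for one call; crude, not a real benchmark harness.
    static long timeNanos(Runnable r) {
        long start = System.nanoTime();
        r.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long first = timeNanos(FirstUseTiming::doubleBrace);  // includes class loading
        long second = timeNanos(FirstUseTiming::doubleBrace); // class already loaded
        long plain = timeNanos(FirstUseTiming::plain);
        System.out.println("double brace, first call:  " + first + " ns");
        System.out.println("double brace, second call: " + second + " ns");
        System.out.println("plain ArrayList:           " + plain + " ns");
    }
}
```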
