Difference between JVM's LookupSwitch and TableSwitch?

花落未央 2020-11-27 11:05

I have some difficulty understanding lookupswitch and tableswitch in Java bytecode.

If I understand well, both LookUpSwitch and TableSwitch correspond to the s

4 answers
  • 2020-11-27 11:47

    The difference is that

    • lookupswitch uses a table with keys and labels
    • tableswitch uses a table with labels only.

    When performing a tableswitch, the int value on top of stack is directly used as an index into the table to grab the jump destination and perform the jump immediately. The whole lookup+jump process is an O(1) operation, that means it's blazing fast.

    When performing a lookupswitch, the int value on top of the stack is compared against the keys in the table until a match is found, and the jump destination stored next to that key is then used to perform the jump. Since a lookupswitch table must always be sorted so that keyX < keyY for every X < Y, the key can be found with a binary search, which makes the whole lookup+jump process an O(log n) operation (it's not necessary to compare the int value against every key to find a match or to determine that none of the keys matches). O(log n) is somewhat slower than O(1), yet it is still fine: many well-known algorithms are O(log n) and are usually considered fast, and even O(n) or O(n * log n) is still considered pretty good (slow algorithms are O(n^2), O(n^3), or worse).
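
    The two lookup strategies described above can be sketched in plain Java. This is only an illustration, not JVM source code: the class name, the method names, and the use of plain int values as stand-ins for branch targets are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical simulation of the two dispatch strategies; not taken from any JVM.
final class SwitchDispatchSketch {

    // tableswitch: the key becomes an index into the table, so nothing is searched.
    static int tableSwitch(int key, int low, int[] targets, int defaultTarget) {
        int high = low + targets.length - 1;
        if (key < low || key > high) {
            return defaultTarget;          // out of range: take the default branch
        }
        return targets[key - low];         // O(1) indexed access
    }

    // lookupswitch: keys are sorted ascending, so a binary search finds the match.
    static int lookupSwitch(int key, int[] sortedKeys, int[] targets, int defaultTarget) {
        int index = Arrays.binarySearch(sortedKeys, key);   // O(log n)
        return index >= 0 ? targets[index] : defaultTarget;
    }

    public static void main(String[] args) {
        int[] denseTargets = {10, 20, 30};                   // cases 1, 2, 3
        System.out.println(tableSwitch(2, 1, denseTargets, -1));      // prints 20

        int[] keys = {1, 10, 100, 1000};
        int[] sparseTargets = {10, 20, 30, 40};
        System.out.println(lookupSwitch(100, keys, sparseTargets, -1)); // prints 30
        System.out.println(lookupSwitch(7, keys, sparseTargets, -1));   // prints -1
    }
}
```

    Both methods return the default target when the key has no entry; the only difference is how the entry is located.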

    The compiler decides which instruction to use based on how compact the switch statement is, e.g.

    switch (inputValue) {
      case 1:  // ...
      case 2:  // ...
      case 3:  // ...
      default: // ...
    }
    

    The switch above is perfectly compact: it has no numeric "holes". The compiler will create a tableswitch like this:

     tableswitch 1 3
        OneLabel
        TwoLabel
        ThreeLabel
      default: DefaultLabel
    

    The pseudo code from the Jasmin page explains this pretty well:

    int val = pop();                // pop an int from the stack
    if (val < low || val > high) {  // if it's less than <low> or greater than <high>,
        pc += default;              // branch to default 
    } else {                        // otherwise
        pc += table[val - low];     // branch to entry in table
    }
    

    This code is pretty clear on how such a tableswitch works. val is inputValue, low would be 1 (the lowest case value in the switch) and high would be 3 (the highest case value in the switch).

    Even with some holes a switch can be compact, e.g.

    switch (inputValue) {
      case 1:  // ...
      case 3:  // ...
      case 4:  // ...
      case 5:  // ...
      default: // ...
    }
    

    The switch above is "almost compact": it has only a single hole. A compiler could generate the following instruction:

     tableswitch 1 5
        OneLabel
        FakeTwoLabel
        ThreeLabel
        FourLabel
        FiveLabel
      default: DefaultLabel
    
      ; <...code left out...>
    
      FakeTwoLabel:
      DefaultLabel:
        ; default code
    

    As you can see, the compiler has to add a fake case for 2, FakeTwoLabel. Since 2 is not a real case value of the switch, FakeTwoLabel is just a second label for the location of the default case: a value of 2 should execute the default code.
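
    How the holes get filled can be sketched in a few lines of Java. This is a simplified illustration rather than what any particular compiler does internally: the class and method names are made up, and plain int values stand in for branch targets.

```java
import java.util.Arrays;

// Hypothetical sketch: build the dense table a tableswitch needs from sparse cases.
final class DenseTableSketch {

    // sortedKeys must be non-empty and ascending; targets[i] belongs to sortedKeys[i].
    static int[] buildTable(int[] sortedKeys, int[] targets, int defaultTarget) {
        int low = sortedKeys[0];
        int high = sortedKeys[sortedKeys.length - 1];
        int[] table = new int[high - low + 1];
        Arrays.fill(table, defaultTarget);             // every hole jumps to default
        for (int i = 0; i < sortedKeys.length; i++) {
            table[sortedKeys[i] - low] = targets[i];   // real cases overwrite their slot
        }
        return table;
    }

    public static void main(String[] args) {
        int[] table = buildTable(new int[]{1, 3, 4, 5}, new int[]{100, 300, 400, 500}, -1);
        System.out.println(Arrays.toString(table));    // prints [100, -1, 300, 400, 500]
    }
}
```

    The slot for the missing case 2 holds the default target, which is the role FakeTwoLabel plays in the listing above.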

    So a switch doesn't have to be perfectly compact for the compiler to create a tableswitch, yet it should at least be pretty close to compactness. Now consider the following switch:

    switch (inputValue) {
      case 1:    // ...
      case 10:   // ...
      case 100:  // ...
      case 1000: // ...
      default:   // ...
    }
    

    This switch is nowhere near compact: it has more than a hundred times as many holes as values (996 holes for 4 values). One would call this a sparse switch. The compiler would have to generate almost a thousand fake cases to express this switch as a tableswitch. The result would be a huge table that blows up the size of the class file dramatically. This is not practical. Instead it will generate a lookupswitch:

    lookupswitch
        1       : Label1
        10      : Label10
        100     : Label100
        1000    : Label1000
        default : DefaultLabel
    

    This table has only 5 entries instead of a thousand. The table has 4 real values, and log2(4) is 2 (the logarithm here is to base 2, not base 10, because a binary search halves the remaining keys at every step). That means it takes the VM only two or three comparisons to find the label for the inputValue or to conclude that the value is not in the table and thus the default code must be executed. Even if the table had 100 entries, it would take the VM at most 7 comparisons to find the correct label or decide to jump to the default label (and 7 comparisons is a lot less than 100 comparisons, don't you think?).
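
    The comparison counts can be checked with a small Java sketch. It assumes a textbook binary search in which each probe of a middle key counts as one comparison; the class and method names are hypothetical, and a real VM may count or implement the search differently.

```java
// Hypothetical sketch: worst-case probes of a textbook binary search over n sorted keys.
final class ProbeCountSketch {

    // Each probe halves the remaining range, giving floor(log2(n)) + 1 probes for n >= 1.
    static int worstCaseProbes(int n) {
        int probes = 0;
        while (n > 0) {
            probes++;
            n /= 2;
        }
        return probes;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseProbes(4));    // prints 3
        System.out.println(worstCaseProbes(100));  // prints 7
        System.out.println(worstCaseProbes(1000)); // prints 10
    }
}
```

    Under this way of counting, the 4-key table needs at most 3 probes and a 100-key table at most 7, still far fewer than a linear scan.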

    So it's nonsense that these two instructions are interchangeable or that having two instructions is merely historical. There are two instructions for two different kinds of situations: one for switches with compact values (for maximum speed) and one for switches with sparse values (not maximum speed, yet still good speed and a very compact table representation regardless of the numeric holes).

  • 2020-11-27 11:58

    How does javac 1.8.0_45 decide what to compile a switch to?

    To decide when to use which, you could use the javac choice algorithm as a basis.

    We know that the source of javac is in the langtools repo.

    Then we grep:

    hg grep -i tableswitch
    

    and the first result is langtools/src/share/classes/com/sun/tools/javac/jvm/Gen.java:

    // Determine whether to issue a tableswitch or a lookupswitch
    // instruction.
    long table_space_cost = 4 + ((long) hi - lo + 1); // words
    long table_time_cost = 3; // comparisons
    long lookup_space_cost = 3 + 2 * (long) nlabels;
    long lookup_time_cost = nlabels;
    int opcode =
        nlabels > 0 &&
        table_space_cost + 3 * table_time_cost <=
        lookup_space_cost + 3 * lookup_time_cost
        ?
        tableswitch : lookupswitch;
    

    Where:

    • hi: maximum case value
    • lo: minimum case value

    So we conclude that it takes into consideration both the time and space complexity, with a weight of 3 for the time complexity.
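
    To see what the formula yields, here is a standalone Java sketch that applies the same arithmetic to a few sets of case values. The class name, the method name, and returning the opcode as a string are assumptions made for the illustration; only the cost formula is taken from the Gen.java excerpt above.

```java
// Hypothetical wrapper around the cost formula quoted from javac's Gen.java.
final class SwitchOpcodeChoiceSketch {

    // sortedKeys: the case values in ascending order.
    static String chooseOpcode(int[] sortedKeys) {
        int nlabels = sortedKeys.length;
        if (nlabels == 0) {
            return "lookupswitch";                     // mirrors the nlabels > 0 check
        }
        long lo = sortedKeys[0];
        long hi = sortedKeys[nlabels - 1];
        long tableSpaceCost = 4 + (hi - lo + 1);       // words
        long tableTimeCost = 3;                        // comparisons
        long lookupSpaceCost = 3 + 2 * (long) nlabels;
        long lookupTimeCost = nlabels;
        return tableSpaceCost + 3 * tableTimeCost <= lookupSpaceCost + 3 * lookupTimeCost
                ? "tableswitch"
                : "lookupswitch";
    }

    public static void main(String[] args) {
        System.out.println(chooseOpcode(new int[]{1, 2, 3}));          // prints tableswitch
        System.out.println(chooseOpcode(new int[]{1, 3, 4, 5}));       // prints tableswitch
        System.out.println(chooseOpcode(new int[]{1, 10, 100, 1000})); // prints lookupswitch
    }
}
```

    For the cases 1, 3, 4, 5 the table side costs 9 + 3 * 3 = 18 and the lookup side costs 11 + 3 * 4 = 23, so the table wins; for 1, 10, 100, 1000 the table side costs 1004 + 9 = 1013 against 23, so the lookup wins.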

    TODO I don't understand why lookup_time_cost = nlabels and not log(nlabels), since a lookupswitch could be done in O(log(n)) with binary search.

    Bonus fact: C++ compilers also make an analogous choice between an O(1) jump table and an O(log(n)) binary search: Advantage of switch over if-else statement

  • 2020-11-27 12:08

    I suspect it is mostly historical, due to some specific binding of Java bytecode to underlying machine code (e.g. Sun's own CPU).

    The tableswitch is essentially a computed jump, where the destination is taken from a lookup table. By contrast, lookupswitch requires comparing against each value, basically an iteration through the table elements until a matching value is found.

    Obviously those opcodes are interchangeable, but based on values, one or the other could be faster or more compact (e.g. compare set of keys with large gaps in between and a sequential set of keys).

  • 2020-11-27 12:10

    The Java Virtual Machine Specification describes the difference: "The tableswitch instruction is used when the cases of the switch can be efficiently represented as indices into a table of target offsets." The specification gives more details.
