Why do Tclers suggest bracing your `expr`essions?

我寻月下人不归 2020-11-28 12:03

We can evaluate an expression in two possible ways:

   set a 1
   set b 1
   puts [expr $a + $b ]
   puts [expr {$a + $b } ]

But why h

3 Answers
  • 2020-11-28 12:52
    • Without the braces, the parameters of expr are converted first to string, and then back again to numbers.
    • Without braces, they are prone to injection attacks very similar to SQL injection attacks.
    • You can get rounding errors you wouldn't want if you don't use braces.
    • With the braces, the expressions can be compiled.

    This is based on Johannes Kuhn's answer, which was posted a while back. The wiki has timing figures showing how much more efficient braced expressions are, along with other interesting material on the differences and on the cases where you can omit the braces to get the results you actually want.
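    The efficiency claim can be checked locally with the built-in time command. This is a minimal sketch, assuming two hypothetical helper procs (addBraced and addUnbraced are illustrative names, not taken from the wiki). The absolute figures depend on the machine and the Tcl version, so none are shown here:

```tcl
# Hypothetical helpers for illustration only.
proc addBraced {a b} {
    # Braced: expr receives one argument and can be bytecode-compiled.
    expr {$a + $b}
}
proc addUnbraced {a b} {
    # Unbraced: Tcl substitutes first, then expr re-parses the joined string.
    expr $a + $b
}

# time reports the average number of microseconds per iteration.
puts "braced:   [time {addBraced 1 2} 100000]"
puts "unbraced: [time {addUnbraced 1 2} 100000]"
```

    Both procs return the same value for plain numbers; only the cost of getting there differs.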

  • 2020-11-28 12:58

    The "problem" with expr is that it implements its own "mini language", which includes, among other things, variable substitution (replacing those $a-s with their values) and command substitution (replacing those [command ...] things with the results of running commands), so basically the process of evaluating expr $a + $b goes like this:

    1. The Tcl interpreter parses four words (expr, $a, + and $b) out of the source string. Since two of these words begin with $, variable substitution takes place, so with a set to 1 and b set to 2 the words really are expr, 1, +, and 2.
    2. As usual, the first word is taken to be the name of a command and the others are its arguments, so the Tcl interpreter looks up a command named expr and executes it, passing it the three arguments 1, +, and 2.
    3. The implementation of expr then concatenates all the arguments passed to it, interpreting them as strings, and obtains the string 1 + 2.
    4. This string is then parsed again — this time by the expr machinery, according to its own rules which include variable- and command substitutions, as already mentioned.
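    The first three steps can be made visible with a stand-in command. This is only a sketch: showArgs is a hypothetical proc that reports the arguments it receives instead of evaluating them, and a and b are assumed to be 1 and 2 as in the walkthrough:

```tcl
set a 1
set b 2

# Hypothetical stand-in for expr: returns the argument count and the arguments.
proc showArgs {args} {
    return [list [llength $args] $args]
}

puts [showArgs $a + $b]     ;# 3 {1 + 2}
puts [showArgs {$a + $b}]   ;# 1 {{$a + $b}}
```

    The unbraced call hands over three already-substituted words; the braced call hands over one untouched string.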

    What follows:

    • If you brace your expressions, as in expr {$a + $b}, the grouping provided by the curly braces stops the Tcl interpreter [1] from interpreting the script that is meant to be parsed by expr itself. In our toy example this means the expr command sees exactly one argument, $a + $b, and performs the substitutions itself.
    • "Double parsing" explained above might lead to security problems.

      For example, in the following code

      set a {[exec echo rm -rf $::env(HOME)]}
      set b 2
      expr $a + $b
      

      The expr command will itself parse a string [exec echo rm -rf $::env(HOME)] + 2. Its evaluation will fail, but by that time, the contents of your home directory will be supposedly gone. (Note that a kind Tcler placed echo in front of rm in a later edit to my answer in an attempt to save the necks of random copypasters, so the command as written won't call rm but if you remove echo from it, it will.)

    • Double parsing inhibits certain optimisations the Tcl engine can do when dealing with calls to expr.

    [1] Well, almost: "backslash+newline" sequences are still processed even inside {...} blocks.
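    The double-parsing hazard can be reproduced without touching the file system. This is a harmless sketch, assuming a hypothetical proc sideEffect that only bumps a counter in place of the destructive command:

```tcl
set hits 0

# Hypothetical stand-in for a dangerous command: it only counts its calls.
proc sideEffect {} {
    incr ::hits
    return 40
}

set a {[sideEffect]}   ;# a holds the literal text [sideEffect]
set b 2

# Unbraced: expr re-parses "[sideEffect] + 2" and runs the command.
set unbraced [expr $a + $b]

# Braced: the value of a is used as an operand; it is not numeric, so expr fails.
set failed [catch {expr {$a + $b}} msg]

puts "unbraced=$unbraced hits=$hits failed=$failed"
```

    The unbraced form runs sideEffect once and yields 42, while the braced form raises an error without running it.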

  • 2020-11-28 13:00

    It most certainly has security issues. In particular, it will treat the variables' contents as expression fragments rather than values, and this lets all sorts of problems occur. If that's not enough, the same problems also totally slay performance because there is no way to generate reasonably optimal code for it: the bytecode generated will be far less efficient since all it can do is assemble the expression string and send it for a second round of parsing.

    Let's drill down to the details

    % tcl::unsupported::disassemble lambda {{} {
        set a 1; set b 2
        puts [expr {$a + $b}]
        puts [expr $a + $b]
    }}
    ByteCode 0x0x50910, refCt 1, epoch 3, interp 0x0x31c10 (epoch 3)
      Source "\n    set a 1; set b 2\n    puts [expr {$a + $b}]\n    put"
      Cmds 6, src 72, inst 65, litObjs 5, aux 0, stkDepth 6, code/src 0.00
      Proc 0x0x6d750, refCt 1, args 0, compiled locals 2
          slot 0, scalar, "a"
          slot 1, scalar, "b"
      Commands 6:
          1: pc 0-4, src 5-11          2: pc 5-18, src 14-20
          3: pc 19-37, src 26-46       4: pc 21-34, src 32-45
          5: pc 38-63, src 52-70       6: pc 40-61, src 58-69
      Command 1: "set a 1"
        (0) push1 0     # "1"
        (2) storeScalar1 %v0    # var "a"
        (4) pop 
      Command 2: "set b 2"
        (5) startCommand +13 1  # next cmd at pc 18
        (14) push1 1    # "2"
        (16) storeScalar1 %v1   # var "b"
        (18) pop 
      Command 3: "puts [expr {$a + $b}]"
        (19) push1 2    # "puts"
      Command 4: "expr {$a + $b}"
        (21) startCommand +14 1     # next cmd at pc 35
        (30) loadScalar1 %v0    # var "a"
        (32) loadScalar1 %v1    # var "b"
        (34) add 
        (35) invokeStk1 2 
        (37) pop 
      Command 5: "puts [expr $a + $b]"
        (38) push1 2    # "puts"
      Command 6: "expr $a + $b"
        (40) startCommand +22 1     # next cmd at pc 62
        (49) loadScalar1 %v0    # var "a"
        (51) push1 3    # " "
        (53) push1 4    # "+"
        (55) push1 3    # " "
        (57) loadScalar1 %v1    # var "b"
        (59) concat1 5 
        (61) exprStk 
        (62) invokeStk1 2 
        (64) done 
    

    In particular, look at the addresses 30–34 (the compilation of expr {$a + $b}) and compare with addresses 49–61 (the compilation of expr $a + $b). The optimal code reads the values out of the two variables and just adds them; the unbraced code has to read the variables and concatenate with the literal parts of the expression, and then fires the result into exprStk which is the “evaluate an expression string” operation. (The relative number of bytecodes isn't the problem; the problem is the runtime evaluation.)

    For how fundamental these differences could be, consider setting a to 1 || 0 and b to [exit 1]. In the case of the precompiled version, Tcl will just try to treat both sides as numbers to add (neither of which is actually numeric; you'll get an error). In the case of the dynamic version… well, can you predict it by inspection?
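    The puzzle can be tried safely by swapping exit for a counter. This is a sketch, assuming a hypothetical proc boom in place of [exit 1]; it relies on two documented expr rules, namely that + binds tighter than || and that || evaluates its right operand lazily:

```tcl
set calls 0

# Hypothetical stand-in for [exit 1]: records that it was invoked.
proc boom {} {
    incr ::calls
    return 1
}

set a {1 || 0}
set b {[boom]}

# Dynamic version: expr parses the string "1 || 0 + [boom]".
set dynamic [expr $a + $b]

# Precompiled version: both values are non-numeric operands of +, so it fails.
set failed [catch {expr {$a + $b}}]

puts "dynamic=$dynamic calls=$calls failed=$failed"
```

    The string is read as 1 || (0 + [boom]), so the left operand settles the result and boom is never called. A different value of a could just as easily have triggered it.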

    So what do you do?

    Optimal Tcl code should always limit the amount of runtime evaluation of expressions it performs; you can usually get it down to nothing at all unless you're doing something like evaluating an expression supplied by the user. Where you have to have it, try to build a single expression string in a variable and then just use expr $thatVar rather than anything more complex. If you want to add up a list of numbers (or, more generally, combine them with any operator), consider using this:

    set sum [tcl::mathop::+ {*}$theList]
    

    instead of:

    set sum [expr [join $theList "+"]]
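    A small sketch of the tcl::mathop::+ form, with a made-up theList for illustration. It also covers the empty list, where + called with no arguments returns 0:

```tcl
set theList {1 2 3.5 4}
set sum [tcl::mathop::+ {*}$theList]
puts $sum        ;# 10.5

# With no arguments, + returns the additive identity.
set emptySum [tcl::mathop::+ {*}{}]
puts $emptySum   ;# 0
```

    Each list element is passed as a value, so nothing in the list is ever parsed as an expression.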
    

    (Also, never use a dynamic expression with if, for or while as that will suppress a lot of compilation.)
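    Beyond the lost compilation, an unbraced loop condition is substituted once, before while ever sees it. This is a sketch with made-up variable names and a guard counter added so that the demonstration terminates:

```tcl
# Quoted condition: while receives the fixed string "0 < 3".
set i 0
set guard 0
while "$i < 3" {
    incr i
    # The guard is only here to stop the otherwise endless loop.
    if {[incr guard] >= 10} { break }
}
puts "quoted condition: i=$i"   ;# i=10

# Braced condition: re-evaluated on every iteration.
set j 0
while {$j < 3} {
    incr j
}
puts "braced condition: j=$j"   ;# j=3
```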

    Remember, with Tcl it's (usually) the case that safe code is fast code. You want fast and safe code, right?
