Efficient Algorithm to compose valid expressions with specific target

天涯浪人 2021-01-02 11:47

The problem is stated as: Given a string that contains only digits 0-9 and a target value, return all expressions that are created by adding some binary operators (+, -, or

1 answer
  • 2021-01-02 12:45

    With this kind of programming challenge, I start by trying to answer the questions:

    • How should the expressions be represented?
    • Can we reduce the number of possible expressions?
    • Can we do less work for each expression?

    Representing expressions

    Problems that look like small programming languages tend to make me think Lisp. The problem is asking us to generate the series:

    123
    (* 12 3)
    (+ 12 3)
    ...
    (- (- 1 2) 3)
    

    A binary expression is basically a 3-tuple of (operator, left, right), where left and right can also be expressions. The order of the components doesn't actually matter. Python has tuples, and its operator module has functions for the various binary ops. So, I'd plan to build expressions in the following form:

    (operator.sub, (operator.sub, 1, 2), 3)
    

    Which can then be evaluated with a (mostly) simple recursive function:

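    def compute(expr):
        # A tuple is an (operator, left, right) expression; evaluate both sides first.
        if isinstance(expr, tuple):
            op, left, right = expr
            return op(compute(left), compute(right))
        # Anything else is a plain number (a leaf).
        return expr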
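
    As a quick check of the representation, here is a minimal sketch that evaluates the last expression from the series above. The show() helper and the SYMBOLS table are hypothetical names added for illustration; they are not part of the implementation described in this answer.

```python
import operator

def compute(expr):
    """Recursively evaluate an (operator, left, right) tuple or a bare number."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return op(compute(left), compute(right))
    return expr

# Hypothetical lookup table, used only to print expressions in Lisp form.
SYMBOLS = {operator.mul: "*", operator.add: "+", operator.sub: "-"}

def show(expr):
    """Render an expression tuple in the prefix notation used in this answer."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return "({} {} {})".format(SYMBOLS[op], show(left), show(right))
    return str(expr)

nested = (operator.sub, (operator.sub, 1, 2), 3)
print(show(nested), "=", compute(nested))
```

    A bare integer such as 123 is already a valid expression, which is why compute() simply returns anything that is not a tuple.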
    

    Reducing possibilities

    From the problem description, it seems the number of possible expressions grows exponentially with the number of digits given. Can we eliminate some of these partway through creating all the permutations?

    For example, take a six digit input and the target result 5. During the process of creating the permutations, imagine the following expression has been created from the first four digits, and there are two left to be handled:

    (* 42 81) '??'
    

    3402 is a big number. Are any of the expressions from this point even capable of getting a result of just 5? Can we skip creating them altogether?

    Unfortunately, digits near the end can still make major changes:

    (+ (* (* 42 81) 0) 5)
    

    There may be some branches we could avoid, but we're going to have to consider most expressions.
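
    To put a number on "most expressions", here is a rough sketch that enumerates the candidates as plain strings. It assumes each gap between adjacent digits holds one of four choices (nothing, +, - or *) and ignores any rule about leading zeros, since the problem statement above is cut short. The name all_strings() is made up for this example.

```python
from itertools import product

def all_strings(digits):
    """Yield every way of placing '', '+', '-' or '*' between adjacent digits."""
    if not digits:
        return
    gaps = len(digits) - 1
    for seps in product(("", "+", "-", "*"), repeat=gaps):
        # Interleave each separator with the digit that follows it.
        yield digits[0] + "".join(s + d for s, d in zip(seps, digits[1:]))

for n in (3, 6, 10):
    count = sum(1 for _ in all_strings("1" * n))
    print(n, "digits ->", count, "expressions")
```

    Under those assumptions there are 4 ** (n - 1) candidates for n digits, so every extra digit multiplies the work by four.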

    Doing less work

    Okay, given we'll have to actually get the result of a very large number of expressions, is there some other way to save effort?

    Let's imagine we're part way through generating a sequence, with these three final expressions generated one after the other:

    ...
    (* (- 8 (* 3 6)) 1)
    (+ (- 8 (* 3 6)) 1)
    (- (- 8 (* 3 6)) 1)
    ...
    

    They all give different results, [-10, -9, -11], but that inner part (- 8 (* 3 6)) is the same, and will always be -10. Our solution should look to take advantage of this.
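
    One way to reuse the shared inner part is to cache results by expression. The sketch below uses functools.lru_cache, which works here because tuples of functions and integers are hashable. It is only an illustration of the idea, not necessarily how the memoised branch mentioned in this answer does it, and the calls counter exists purely to show the saving.

```python
import operator
from functools import lru_cache

calls = 0  # counts how many times the body of compute() actually runs

@lru_cache(maxsize=None)
def compute(expr):
    """Evaluate an expression tuple, caching the result of each distinct subtree."""
    global calls
    calls += 1
    if isinstance(expr, tuple):
        op, left, right = expr
        return op(compute(left), compute(right))
    return expr

inner = (operator.sub, 8, (operator.mul, 3, 6))
results = [compute((op, inner, 1))
           for op in (operator.mul, operator.add, operator.sub)]
print(results, "computed with", calls, "uncached calls")
```

    Without the cache each of the three expressions costs seven calls. With it, the second and third expressions only pay for their own top-level node.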

    An answer

    For anyone in need of spoilers, I've put up branches for an initial implementation that calculates every expression from the top, a minor change that memoises the calculation, and a final one that precomputes results as the expressions are being generated plus some minor tweaks.

    • 17.40s elapsed 6180k max mem original from question
    • 20.60s elapsed 6284k max mem without eval from question
    • 4.65s elapsed 5356k max mem my initial
    • 2.71s elapsed 5316k max mem my memoised
    • 1.50s elapsed 5356k max mem my precomputed

    Some notes on my implementation. The generate() function creates the candidate expressions by considering each point in the string and creating the possible next states. For example, at the start there are two options: move the marker along, or split off the first number:

    '3|456237490' ->
        '34|56237490' -> ...
        3 '4|56237490' ->
    

    Each pending state is pushed to a list, and the current one to consider is popped off each time through the loop. Continuing from the state at the end, the next possibilities are moving the marker along again, and splitting a number to make one of the three expressions.

            3 '45|6237490' -> ...
            (* 3 4) '5|6237490' -> ...
            (+ 3 4) '5|6237490' -> ...
            (- 3 4) '5|6237490' -> ...
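
    The loop described above might look roughly like the following sketch. The state layout (expression so far, start of the pending number, marker position) is an assumption made for this example, and new numbers are simply nested to the left, so multiplication is not yet given the right precedence here.

```python
import operator

OPS = (operator.mul, operator.add, operator.sub)

def generate(text):
    """Yield every expression tuple for the digit string, naively left-nested."""
    if not text:
        return
    # Each state: (expression so far, start of pending number, marker position).
    pending = [(None, 0, 1)]
    while pending:
        expr, start, end = pending.pop()
        if end < len(text):
            # Option 1: move the marker along, making the pending number longer.
            pending.append((expr, start, end + 1))
        # Option 2: split off the pending number and attach it to the expression.
        number = int(text[start:end])
        if expr is None:
            candidates = [number]
        else:
            candidates = [(op, expr, number) for op in OPS]
        for candidate in candidates:
            if end == len(text):
                yield candidate  # nothing left to consume
            else:
                pending.append((candidate, end, end + 1))

print(sum(1 for _ in generate("123")), "expressions for '123'")
```

    Using a list as a stack keeps the pending states explicit, which makes it easy to attach extra data to each state later, such as a precomputed value.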
    

    I have glossed over one wrinkle with operator precedence so far. When handling multiplication, we may need to rewrite an existing expression. Consider:

    (+ 1 2) '3|' ->
        (* (+ 1 2) 3) '' # ???
        (+ (+ 1 2) 3) ''
        (- (+ 1 2) 3) ''
    

    For addition and subtraction this is fine, because they share the same precedence and are evaluated left to right, which is exactly how the expression is nested. However, 2 * 3 has to happen before 1 + ..., so we need to push the multiplication inside:

    (+ 1 2) 3 -> (+ 1 (* 2 3))
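
    A small sketch of that rewrite, assuming expressions are built left to right so the right operand of the top-level + or - is always a number or a chain of multiplications. The attach() name is hypothetical, and this is one possible way to do it rather than the approach used in the branches mentioned above.

```python
import operator

def compute(expr):
    """Recursively evaluate an (operator, left, right) tuple or a bare number."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return op(compute(left), compute(right))
    return expr

def attach(op, expr, number):
    """Add `number` to `expr` with `op`, pushing a multiplication under + or -."""
    pushes_inside = (
        op is operator.mul
        and isinstance(expr, tuple)
        and expr[0] in (operator.add, operator.sub)
    )
    if pushes_inside:
        top, left, right = expr
        # (+ 1 2) * 3 becomes (+ 1 (* 2 3)): only the right operand is multiplied.
        return (top, left, (operator.mul, right, number))
    return (op, expr, number)

rewritten = attach(operator.mul, (operator.add, 1, 2), 3)
print(compute(rewritten))  # evaluates 1 + 2 * 3
```

    Because a rewritten expression still has + or - at the top, a further multiplication is pushed inside in the same way, so chains like 1 - 2 * 3 * 4 come out correctly.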
    

    There are neat ways to handle this by storing a bit more information about your operations beyond just the function to execute them. For this problem that's not really required, nor are other possible transformations like combining multiple expressions or factoring out irrelevant parts.

    A final implementation note: just to be difficult, I made both the direction of iteration and (initially) the layout of the expressions backwards.
