The problem is stated as: Given a string that contains only digits 0-9 and a target value, return all expressions that are created by adding some binary operators (+, -, or
With this kind of programming challenge, I start by trying to answer the questions:
Problems that look like small programming languages tend to make me think Lisp. The problem is asking us to generate the series:
123
(* 12 3)
(+ 12 3)
...
(- (- 1 2) 3)
A binary expression is basically a 3-tuple of (operator, left, right), where left and right can also be expressions. The order of the components doesn't actually matter. Python has tuples, and its operator module has functions for the various binary ops. So, I'd plan to build expressions in the following form:
(operator.sub, (operator.sub, 1, 2), 3)
Which can then be evaluated with a (mostly) simple recursive function:
def compute(expr):
    if isinstance(expr, tuple):
        op, left, right = expr
        return op(compute(left), compute(right))
    return expr
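To make the tuple form concrete, here is a minimal sketch that evaluates the example expression and prints it back in the Lisp-style notation used above. The SYMBOLS table and the to_lisp() helper are names made up for this illustration; they are not part of the operator module.

```python
import operator


def compute(expr):
    """Evaluate a nested (op, left, right) tuple; bare ints evaluate to themselves."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return op(compute(left), compute(right))
    return expr


# Hypothetical lookup table: maps operator functions back to display symbols.
SYMBOLS = {operator.add: "+", operator.sub: "-", operator.mul: "*"}


def to_lisp(expr):
    """Render an expression tuple in prefix notation (illustrative helper)."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({SYMBOLS[op]} {to_lisp(left)} {to_lisp(right)})"
    return str(expr)


expr = (operator.sub, (operator.sub, 1, 2), 3)
print(to_lisp(expr), "=", compute(expr))  # (- (- 1 2) 3) = -4
```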
From the problem description, it seems the number of possible expressions grows exponentially with the number of digits given. Can we eliminate some of these partway through creating all the permutations?
For example, take a six-digit input and the target result 5. During the process of creating the permutations, imagine the following expression has been created from the first four digits, and there are two left to be handled:
(* 42 81) '??'
3402 is a big number. Are any of the expressions from this point even capable of getting a result of just 5? Can we skip creating them altogether?
Unfortunately, digits near the end can still make major changes:
(+ (* (* 42 81) 0) 5)
There may be some branches we could avoid, but we're going to have to consider most expressions.
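To get a feel for the size of the search space, here is a rough sketch that counts candidates by filling every gap between digits with either nothing or one of the three operators. The count_candidates() name is made up for this illustration, and it assumes leading-zero restrictions are ignored, so the figure is an upper bound.

```python
from itertools import product


def count_candidates(digits):
    """Count the ways to fill each gap between digits with '', '+', '-' or '*'.

    Assumption: numbers with leading zeros are not excluded, so this is an
    upper bound on the number of expressions to consider.
    """
    gaps = len(digits) - 1
    return sum(1 for _ in product(("", "+", "-", "*"), repeat=gaps))


for n in (3, 6, 10):
    print(n, count_candidates("1" * n))  # 3 16, then 6 1024, then 10 262144

# The late zero from the example above: a large partial result still hits 5.
print(42 * 81 * 0 + 5)  # 5
```

Four choices per gap gives 4 ** (n - 1) candidates for n digits, which is why skipping whole branches would be so attractive if it were safe.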
Okay, given we'll have to actually get the result of a very large number of expressions, is there some other way to save effort?
Let's imagine we're partway through generating a sequence, with these three final expressions generated one after the other:
...
(* (- 8 (* 3 6)) 1)
(+ (- 8 (* 3 6)) 1)
(- (- 8 (* 3 6)) 1)
...
They all give different results, [-10, -9, -11], but that inner part (- 8 (* 3 6)) is the same, and will always be -10. Our solution should look to take advantage of this.
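One way to reuse the shared inner part is to cache results keyed on the expression tuple itself. This is only a sketch of the idea using functools.lru_cache, not necessarily how the memoised branch does it; the calls counter is there purely to show how much work is saved.

```python
import operator
from functools import lru_cache

calls = 0  # counts how many tuple nodes are actually evaluated


@lru_cache(maxsize=None)
def compute(expr):
    """Evaluate an expression tuple, caching the result of every sub-expression."""
    global calls
    if isinstance(expr, tuple):
        calls += 1
        op, left, right = expr
        return op(compute(left), compute(right))
    return expr


inner = (operator.sub, 8, (operator.mul, 3, 6))
results = [compute((op, inner, 1)) for op in (operator.mul, operator.add, operator.sub)]
print(results)  # [-10, -9, -11]
print(calls)    # 5: three outer nodes plus the two inner nodes, each evaluated once
```

Without the cache the same three expressions would evaluate nine tuple nodes. This works because tuples of functions and ints are hashable.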
For anyone in need of spoilers, I've put up branches for an initial implementation that calculates every expression from the top, a minor change that memoises the calculation, and a final one that precomputes results as the expressions are being generated plus some minor tweaks.
17.40s elapsed 6180k max mem    original from question
20.60s elapsed 6284k max mem    without eval from question
 4.65s elapsed 5356k max mem    my initial
 2.71s elapsed 5316k max mem    my memoised
 1.50s elapsed 5356k max mem    my precomputed
Some notes on my implementation. The generate() function creates the candidate expressions by considering each point in the string and creating the possible next states. For example, at the start, the two options are to move the marker along or to split off the first number:
'3|456237490' ->
'34|56237490' -> ...
3 '4|56237490' ->
Each pending state is pushed to a list, and the current one to consider is popped off each time through the loop. Continuing from the last state above, the next possibilities are moving the marker along again, or splitting off a number to make one of the three expressions:
3 '45|6237490' -> ...
(* 3 4) '5|6237490' -> ...
(+ 3 4) '5|6237490' -> ...
(- 3 4) '5|6237490' -> ...
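The state transitions above can be sketched roughly as follows. This is an illustration, not the code from the branches: the (expr, pending, rest) state layout and the next_states() helper are assumptions, it iterates forwards, it skips numbers with a leading zero (my reading of the usual constraints), and it ignores the operator precedence wrinkle entirely.

```python
import operator

OPS = (operator.mul, operator.add, operator.sub)


def next_states(state):
    """Yield the successors of a state (expr, pending, rest).

    expr is the expression built so far (None at the start), pending holds
    the digits left of the marker, and rest the digits right of it.
    """
    expr, pending, rest = state
    # Option 1: move the marker along, growing the pending number.
    # Assumption: a number may not start with '0' unless it is just "0".
    if rest and pending != "0":
        yield expr, pending + rest[0], rest[1:]
    # Option 2: split off the pending number and fold it into the expression.
    number = int(pending)
    new_exprs = [number] if expr is None else [(op, expr, number) for op in OPS]
    for new_expr in new_exprs:
        yield new_expr, rest[:1], rest[1:]


def generate(digits):
    """Yield every complete expression, using a list as a stack of pending states."""
    if not digits:
        return
    stack = [(None, digits[:1], digits[1:])]
    while stack:
        state = stack.pop()
        expr, pending, rest = state
        if not pending:  # no digits left to place: the expression is complete
            yield expr
            continue
        stack.extend(next_states(state))


print(len(list(generate("123"))))  # 16
```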
I have glossed over one wrinkle with operator precedence so far. When handling multiplication, we may need to rewrite an existing expression. Consider:
(+ 1 2) '3|' ->
(* (+ 1 2) 3) '' # ???
(+ (+ 1 2) 3) ''
(- (+ 1 2) 3) ''
For addition and subtraction this is fine, order won't matter. However, 2 * 3 has to happen before 1 + ... In short, we need to push the multiplication inside:
(+ 1 2) 3 -> (+ 1 (* 2 3))
There are neat ways to handle this by storing a bit more information about your operations beyond just the function to execute them. For this problem that's not really required, nor are other possible transformations like combining multiple expressions or factoring out irrelevant parts.
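As a minimal sketch of the rewrite, assuming expressions are built left to right as (op, left, right) tuples, a multiplication can be pushed into the right-hand operand of an existing addition or subtraction. The attach() name is made up for this illustration and is not taken from the branches.

```python
import operator


def compute(expr):
    """Evaluate a nested (op, left, right) tuple; bare ints evaluate to themselves."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return op(compute(left), compute(right))
    return expr


def attach(op, expr, number):
    """Combine expr with number, giving * precedence over + and - (hypothetical helper)."""
    if op is operator.mul and isinstance(expr, tuple) and expr[0] is not operator.mul:
        outer_op, left, right = expr
        # Push the multiplication inside: (+ a b) * n becomes (+ a (* b n)).
        return (outer_op, left, attach(operator.mul, right, number))
    return (op, expr, number)


rewritten = attach(operator.mul, (operator.add, 1, 2), 3)
print(rewritten == (operator.add, 1, (operator.mul, 2, 3)))  # True
print(compute(rewritten))  # 7, matching 1 + 2 * 3
```

Because only the right-hand operand of a + or - can ever be a multiplication chain when building this way, the recursion never goes more than one level deep.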
Final implementation note: just to be difficult, I made both the direction of iteration and (initially) the layout of the expressions backwards.