问题
I would like your opinion on the time and space complexity of this algorithm I implemented (in Python) to calculate the complexity of algorithm to print all valid (i.e., properly opened and closed) combinations of n-pairs of parentheses (see all valid combinations of n-pair of parenthesis)
def find_par_n(n):
s = set(["()"])
for i in range(2, n + 1):
set_to_add = set()
for str_ in s:
set_temp = set()
ana = set()
for j in range(len(str_) + 1):
str_c = str_[0:j] + '(' + str_[j:]
if str_c in ana:
continue
ana.add(str_c)
for k in range(j + 1, len(str_) + 1):
str_a = str_c
str_a = str_a[0:k] + ')' + str_a[k:]
set_temp.add(str_a)
set_to_add.update(set_temp)
s = set_to_add
return s
Most likely the algorithm can be improved in terms of time and space. Please share your thoughts.
回答1:
Let's try with a simpler version instead.
Some theorems:
- Properly paired string are of even length. Don't ask me to prove this one, please.
- Given a properly paired string of length
2k
, it is possible to construct all strings of length2(k + 1)
by inserting pairs of parenthesis'()'
at every possible location within the string. There are2k + 1
such positions. - To generate all
n
pairs, we need to recursive into the insertion stepn
times (and get strings of length2n
.
Note that this is not enough to generate all unique properly paired strings, because e.g. inserting ()
into ()
yields the same string twice (()()
). However as an upper bound it should suffice.
def add_parens(s, nparens):
if not s:
add_parens('()', nparens)
max_length = 2 * nparens
if len(s) == max_length:
print(s)
else:
for i in range(0, len(s)):
add_parens(s[0:i] + '()' + s[i:], nparens)
n = 5
add_parens('', n)
Time complexity:
- There is
1
insertion point for the empty string. - There are
3
insertion points for()
. ...
So all in all:
T(n) = 1 * 3 * ... * 2n + 1 ~ O(n!)
Space complexity for the recursive version is O(n(2n + 1))
, however I'm pretty sure it is possible to bring this down to linear.
回答2:
For optimal results, avoid sets. If you generate each possibility exactly once, you can just put them into a vector.
Here's a very simple recursion which is slightly slower than @bcdan's answer:
def rget(n):
if n == 0:
return ['']
else:
return [fst + '(' + snd + ')' for i in range(n)
for fst in rget(i)
for snd in rget(n-i-1)]
If you wanted a generator rather than a complete list of results, a variant of the above is probably ideal, but for the case of a complete list, it's quite a bit faster to compute the results of each j from 0 to n, using the previously computed lists instead of a recursive call:
def iget(n):
res = [['']]
for j in range(1, n+1):
res.append([fst + '(' + snd + ')' for i in range(j)
for fst in rget(i)
for snd in rget(j-i-1)])
return res[n]
This turns out to be an order of magnitude faster (using Python 3.3.2).
Why it works
You can decompose any balanced string into two balanced substrings. This is not a unique decomposition, but it can be made unique by choosing the shortest possible non-empty balanced suffix. The shortest nonempty balanced suffix has the property that the starting (
matches the closing )
; if that were not the case, it could be decomposed into two shorter non-empty balanced sequences. So the recursive consists of finding all sequences of the form Fst(Snd)
where the sum of the sizes of Fst and Snd is n-1.
Time complexity:
The number of balanced sequences with n pairs of parentheses is the nth Catalan number Cn, which is O(4nn2/3). The above code generates each sequence in O(n) (because the string concatenation takes O(n) time) for a total complexity of O(4nn5/3)
回答3:
Recursive version seems to be very simple and it spends time only on valid branches:
def gen(sofar, l, r, n):
"""
sofar: string generated so far
l: used left parentheses
r: used right parentheses
n: total required pairs
"""
result = []
if l == n and r == n:
# solution found
result.append(sofar)
else:
if l > r:
# can close
result.extend(gen(sofar + ")", l, r + 1, n))
if l < n:
# can open
result.extend(gen(sofar + "(", l + 1, r, n))
return result
def generate(n):
return gen("", 0, 0, n)
print generate(4)
回答4:
The number of possible combinations is the Catalan Number. So the complexity is at least the one specified by that number. https://en.wikipedia.org/wiki/Catalan_number
回答5:
I got curious and wrote my own version. Here are some specs (this is all in Python 2.7):
n=8
Mine: 21 ms
Yours: 89 ms
n=10
Mine: 294 ms
Yours: 1564 ms
def get(n):
def find(current):
out = []
last_index = 0
for j in range(current.count('(')):
last_index = current.index('(',last_index) + 1
out.append(current[:last_index] + '()' + current[last_index:])
return out
seed = '()'
current = '()'
temp = set(['()'])
for i in range(n):
new = set()
for thing in temp:
new.update(find(thing))
temp = new
return [a[1:-1] for a in temp]
I think the biggest difference is that mine only has 3 for loops, and yours has 4. This happens at the str.count
function. Using this built-in probably contributes significantly to speed. Also, after each stage I identified duplicates, and this cuts down exponentially on time.
Actually, the biggest difference I see is that I only insert after parenthesis, and then strip the outer two when it is finished. It creates way fewer duplicates. So ((())())
is created, and reduced to its correct form of (())()
after removing the enclosing ()
.
来源:https://stackoverflow.com/questions/31479647/calculating-the-complexity-of-algorithm-to-print-all-valid-i-e-properly-opene