问题
I have a script that loops through a series of four (or less) characters strings. For example:
aaaa
aaab
aaac
aaad
If have been able to implement it with nested for loops like so:
chars = string.digits + string.uppercase + string.lowercase
for a in chars:
print '%s' % a
for b in chars:
print '%s%s' % (a, b)
for c in chars:
print '%s%s%s' % (a, b, c)
for d in chars:
print '%s%s%s%s' % (a, b, c, d)
Is this sort of loop nesting a bad thing, and if so, what would be a better way of accomplishing what I am doing?
回答1:
import string
import itertools
chars = string.digits + string.letters
MAX_CHARS = 4
for nletters in range(MAX_CHARS):
for word in itertools.product(chars, repeat=nletters + 1):
print (''.join(word))
That'll print all 15018570
words you're looking for. If you want more/less words just change the MAX_CHARS
variable. It will still have just two for
s for any number of chars, and you don't have to repeat yourself. And is pretty readable. .
回答2:
I'm going to submit my answer as the most readable and least scalable :)
import string
chars = [''] + list(string.lowercase)
strings = (a+b+c+d for a in chars
for b in chars
for c in chars
for d in chars)
for string in strings:
print string
EDIT: Actually, this is incorrect, as it will produce duplicates of all strings of length<4. Removing the empty string from the chars
array would just produce 4-char strings.
Normally I'd delete this answer, but I still kinda like it if you need to generate strings of the same length.
回答3:
Write for the programmer first - the computer second.
If it's clear and obvious to understand then its correct.
If speed matters AND the compiler doesn't optimise it anyway AND if you measure it AND it is the problem - then think of a faster cleverer way!
回答4:
I don't think it's a bad thing, provided you understand (and document :-) it. I don't doubt there may be a more pythonic way or clever solution (with lambdas or whatnot) but I've always favored readability over cleverness.
Since you have to generate all possibilities of 1-, 2-, 3- and 4-character "words", this method is as good as any. I'm not sure how long it would take since you're effectively generating (very roughly) 14 million lines of output (but probably every solution would have that problem).
Pre-calculating the common prefixes may provide a speed boost but you'd be better off measuring it to check (always check, never assume):
chars = string.digits + string.uppercase + string.lowercase
for a in chars:
print a
for b in chars:
ab = '%s%s' % (a, b)
print ab
for c in chars:
abc = '%s%s' % (ab, c)
print abc
for d in chars:
print '%s%s' % (abc, d)
EDIT: I actually did some benchmarks (with Windows-Python 2.6.1) - this version takes about 2.25 time units compared to the original 2.84 so it's 26% faster. I think that might warrant its use (again, as long as it's documented clearly what it's trying to achieve).
回答5:
@nosklo's and @Triptych's solutions produce different results:
>>> list(map(''.join, itertools.chain.from_iterable(itertools.product("ab",
... repeat=r) for r in range(4)))) # @nosklo's
['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba', 'bbb']
>>> ab = ['']+list("ab")
>>> list(map(''.join, (a+b+c for a in ab for b in ab for c in ab)))
['', 'a', 'b', 'a', 'aa', 'ab', 'b', 'ba', 'bb', 'a', 'aa', 'ab', 'aa', 'aaa', 'aab', 'ab', 'aba', 'abb', 'b', 'ba', 'bb', 'ba', 'baa', 'bab', 'bb', 'bba', 'bbb']
Here's modified @Triptych's solution that produce the same output as the @nosklo's one:
>>> ab = "ab"
>>> list(map(''.join, itertools.chain([''], ab, (a+b for a in ab for b in ab),
... (a+b+c for a in ab for b in ab for c in ab))))
['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba', 'bbb']
回答6:
There are many algorithms for generating every permutation of a set. What you want here is a related problem, but not directly analagous. Suggested Reading
回答7:
It doesn't exactly answer the question, but this would return the n
th combination for the given maximum length and characters in the alphabet to use:
#!/usr/bin/python
def nth_combination(n, maxlen=4, alphabet='abc'):
"""
>>> print ','.join(nth_combination(n, 1, 'abc') for n in range(3))
a,b,c
>>> print ','.join(nth_combination(n, 2, 'abc') for n in range(12))
a,aa,ab,ac,b,ba,bb,bc,c,ca,cb,cc
>>> import string ; alphabet = string.ascii_letters + string.digits
>>> print ','.join(nth_combination(n, 4, alphabet) for n in range(16))
a,aa,aaa,aaaa,aaab,aaac,aaad,aaae,aaaf,aaag,aaah,aaai,aaaj,aaak,aaal,aaam
>>> print ','.join(nth_combination(n, 4, alphabet)
... for n in range(0, 14000000, 10**6))
a,emiL,iyro,mKz2,qWIF,u8Ri,zk0U,Dxav,HJi9,LVrM,P7Ap,UjJ1,YvSE,2H1h
"""
if maxlen == 1:
return alphabet[n]
offset, next_n = divmod(n, 1 + len(alphabet)**(maxlen-1))
if next_n == 0:
return alphabet[offset]
return alphabet[offset] + nth_combination(next_n-1, maxlen-1, alphabet)
if __name__ == '__main__':
from doctest import testmod
testmod()
This of course makes sense only if you need random access to the set of combinations instead of always iterating through them all.
If maxlen
is high, some speed optimization could be achieved e.g. by getting rid of string concatenation and re-calculating the length of alphabet
and maxlen-1
at each level of the recursion. A non-recursive approach might make sense, too.
来源:https://stackoverflow.com/questions/482146/replace-nested-for-loops-or-not