UPDATE: Combinatorics and unranking were eventually what I needed. The links below helped a lot:
http://msdn.microsoft.com/en-us/library/aa289166(v=vs.71).aspx
ht
There's a very efficient algorithm for this problem, which is also contained in the recently published:
Knuth, The Art of Computer Programming, Volume 4A (section 7.2.1.3).
Since you don't care about the order in which the combinations are generated, let's use the lexicographic order of the combinations where each combination is listed in descending order. Thus for r=3, the first 11 combinations of 3 symbols would be: 210, 310, 320, 321, 410, 420, 421, 430, 431, 432, 510. The advantage of this ordering is that the enumeration is independent of n; indeed it is an enumeration over all combinations of 3 symbols from {0, 1, 2, …}.
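To make this ordering concrete, here is a minimal brute-force sketch. The helper name `descending_lex_order` is made up for illustration and is not part of the efficient method. It writes each r-subset of {0, ..., n-1} largest symbol first and sorts the results lexicographically, which is only practical for small cases.

```python
from itertools import combinations

def descending_lex_order(n, r):
    """All r-subsets of range(n), each written largest-first, sorted lexicographically."""
    # Drawing from a descending range makes every tuple come out largest-first.
    combos = list(combinations(range(n - 1, -1, -1), r))
    return sorted(combos)

first_11 = descending_lex_order(6, 3)[:11]
print(["".join(map(str, c)) for c in first_11])
# ['210', '310', '320', '321', '410', '420', '421', '430', '431', '432', '510']

# Growing n only appends combinations at the end; the existing prefix is unchanged.
assert descending_lex_order(7, 3)[:20] == descending_lex_order(6, 3)
```

The final assertion is the "independent of n" property: all 20 combinations drawn from 6 symbols appear, in the same positions, at the start of the list for 7 symbols.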
There is a standard method to directly generate the ith combination given i, so to test whether a symbol s is part of the ith combination, you can simply generate it and check.
How many combinations of r symbols start with a particular symbol s? Well, the remaining r-1 positions must come from the s symbols 0, 1, 2, …, s-1, so it's (s choose r-1), where (s choose r-1) or C(s,r-1) is the binomial coefficient denoting the number of ways of choosing r-1 objects from s objects. As this is true for all s, the first symbol of the ith combination is the smallest s such that
$$\sum_{k=0}^{s} \binom{k}{r-1} \ge i.$$
Once you know the first symbol, the problem reduces to finding the $\left(i - \sum_{k=0}^{s-1} \binom{k}{r-1}\right)$-th combination of r-1 symbols, where we've subtracted the combinations that start with a symbol less than s.
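The two steps above can also be sketched iteratively. This version is an illustration under a few assumptions: it relies on `math.comb` (Python 3.8+), the function name `unrank` is invented here, and it uses the hockey-stick identity, sum over k from 0 to s of C(k, r-1) = C(s+1, r), to replace each running sum with a single binomial coefficient. It assumes i is at least 1 and does no input validation.

```python
from math import comb  # available in Python 3.8+

def unrank(r, i):
    """Return the i-th (1-based) combination of r symbols, largest symbol first."""
    result = []
    while r > 0:
        # Hockey-stick identity: sum_{k=0}^{s} C(k, r-1) == C(s+1, r),
        # so the first symbol is the smallest s with C(s+1, r) >= i.
        s = r - 1  # no r-combination can start with a symbol below r-1
        while comb(s + 1, r) < i:
            s += 1
        result.append(s)
        # Drop the C(s, r) combinations that start with a symbol less than s.
        i -= comb(s, r)
        r -= 1
    return result

print(unrank(3, 1))   # [2, 1, 0]
print(unrank(3, 11))  # [5, 1, 0]
```

The inner loop scans s linearly, which keeps the sketch short; a binary search over s would be one way to speed it up for very large i.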
Python code (you can write C(n,r) more efficiently, but this is fast enough for us):
#!/usr/bin/env python3
tC = {}  # memo table for binomial coefficients, keyed by (n, r)

def C(n, r):
    '''Binomial coefficient "n choose r", computed with Pascal's rule and memoized.'''
    if r > n - r: r = n - r  # symmetry: C(n, r) == C(n, n-r)
    if r < 0: return 0
    if r == 0: return 1
    if (n, r) not in tC:
        tC[(n, r)] = C(n-1, r) + C(n-1, r-1)
    return tC[(n, r)]

def combination(r, k):
    '''Finds the kth combination of r letters.'''
    if r == 0: return []
    total = 0  # number of combinations whose first symbol is less than s
    s = 0
    while True:
        if total + C(s, r-1) < k:
            total += C(s, r-1)
            s += 1
        else:
            return [s] + combination(r-1, k-total)

def Func(N, r, i, s): return s in combination(r, i)

for i in range(1, 20): print(combination(3, i))
print(combination(500, 10000000000000000000000000000000000000000000000000000000000000000))
Note how fast this is: it finds the 10000000000000000000000000000000000000000000000000000000000000000th combination of 500 letters (it starts with 542) in less than 0.5 seconds.