I\'m doing some word segmentation experiments like the followings.
lst
is a sequence of characters, and output
is all the possible words.>
There are 8 options, each mirroring the binary numbers 0 through 7:
000
001
010
011
100
101
110
111
Each 0 and 1 represents whether or not the 2 letters at that index are "glued" together. 0 for no, 1 for yes.
>>> lst = ['a', 'b', 'c', 'd']
... output = []
... formatstr = "{{:0{}.0f}}".format(len(lst)-1)
... for i in range(2**(len(lst)-1)):
... output.append([])
... s = "{:b}".format(i)
... s = str(formatstr.format(float(s)))
... lstcopy = lst[:]
... for j, c in enumerate(s):
... if c == "1":
... lstcopy[j+1] = lstcopy[j] + lstcopy[j+1]
... else:
... output[-1].append(lstcopy[j])
... output[-1].append(lstcopy[-1])
... output
[['a', 'b', 'c', 'd'],
['a', 'b', 'cd'],
['a', 'bc', 'd'],
['a', 'bcd'],
['ab', 'c', 'd'],
['ab', 'cd'],
['abc', 'd'],
['abcd']]
>>>