Python: find all possible word combinations with a sequence of characters (word segmentation)


被撕碎了的回忆 2021-01-14 05:39

I'm doing some word segmentation experiments like the following.

lst is a sequence of characters, and output is every possible way of grouping adjacent characters into words.

4 answers

一生所求 (original poster)

2021-01-14 06:36

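A list of 4 characters has 3 gaps between adjacent characters, and each gap is either split or joined, so there are 2**3 = 8 options, one for each binary number from 0 through 7 (000 to 111):

Each 0 or 1 says whether the two letters on either side of that gap are "glued" together: 0 for no, 1 for yes.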
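
To make the encoding concrete, here is a minimal sketch that decodes a single bit pattern. The helper name split_by_bits is hypothetical and not part of the original answer; it assumes the pattern has exactly one character per gap.

```python
def split_by_bits(chars, bits):
    """Split chars according to bits: '1' glues neighbours, '0' separates them.

    Hypothetical helper for illustration; expects len(bits) == len(chars) - 1.
    """
    if len(bits) != len(chars) - 1:
        raise ValueError("need exactly one bit per gap between characters")
    words = [chars[0]]
    for bit, ch in zip(bits, chars[1:]):
        if bit == "1":
            words[-1] += ch      # glue this character onto the current word
        else:
            words.append(ch)     # start a new word
    return words


print(split_by_bits(['a', 'b', 'c', 'd'], "101"))  # ['ab', 'cd']
```

The pattern "101" is the binary form of 5: the first and last gaps are glued and the middle gap is a split.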

>>> lst = ['a', 'b', 'c', 'd']
>>> output = []
>>> gaps = len(lst) - 1  # one bit per gap between adjacent characters
>>> for i in range(2 ** gaps):
...     output.append([])
...     # Zero-padded binary string, e.g. 5 -> "101"
...     s = "{:0{}b}".format(i, gaps)
...     lstcopy = lst[:]
...     for j, c in enumerate(s):
...         if c == "1":
...             # Glue: carry the accumulated piece into the next slot
...             lstcopy[j+1] = lstcopy[j] + lstcopy[j+1]
...         else:
...             # Split: the accumulated piece is a finished word
...             output[-1].append(lstcopy[j])
...     output[-1].append(lstcopy[-1])
...
>>> output
[['a', 'b', 'c', 'd'],
 ['a', 'b', 'cd'],
 ['a', 'bc', 'd'],
 ['a', 'bcd'],
 ['ab', 'c', 'd'],
 ['ab', 'cd'],
 ['abc', 'd'],
 ['abcd']]
>>>
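
The same idea can also be written as a reusable generator. This is only a sketch: the function name segmentations is an assumption made for illustration, and it tests the bits of the integer directly instead of formatting a binary string. It also handles a single-character input, which has no gaps.

```python
def segmentations(chars):
    """Yield every way to split chars into contiguous words.

    Hypothetical generalisation of the loop above; not from the original answer.
    """
    chars = list(chars)
    if not chars:
        return
    gaps = len(chars) - 1
    for mask in range(2 ** gaps):
        words = [chars[0]]
        for j in range(gaps):
            # Read the bit for gap j, most significant bit first,
            # so the results come out in the same order as above.
            glued = (mask >> (gaps - 1 - j)) & 1
            if glued:
                words[-1] += chars[j + 1]
            else:
                words.append(chars[j + 1])
        yield words


for words in segmentations("abc"):
    print(words)
```

Because it is a generator, the 2**(n-1) results are produced one at a time, which matters once the input grows beyond a handful of characters.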
