Python: find all possible word combinations with a sequence of characters (word segmentation)

被撕碎了的回忆 2021-01-14 05:39

I'm doing some word segmentation experiments like the following.

lst is a sequence of characters, and output should contain every possible way of grouping those characters into words.

4 answers
  •  一生所求
    2021-01-14 06:36

    With 4 characters there are 3 gaps between adjacent letters, so there are 2^3 = 8 options, each corresponding to one of the binary numbers 0 through 7:

    000
    001
    010
    011
    100
    101
    110
    111
    

    Each 0 or 1 says whether the letter at that index is "glued" to the letter after it: 0 for no, 1 for yes.
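
    As a small illustration of that encoding (a sketch only; the names letters, n_gaps and describe are made up here and are not part of the code that follows), each bit pattern can be printed next to the letter pairs it glues:

```python
letters = ['a', 'b', 'c', 'd']
n_gaps = len(letters) - 1  # one binary digit per gap between adjacent letters


def describe(mask):
    """Return the zero-padded bit string for mask and the letter pairs it glues."""
    bits = format(mask, "0{}b".format(n_gaps))
    glued = [letters[j] + letters[j + 1] for j, bit in enumerate(bits) if bit == "1"]
    return bits, glued


for mask in range(2 ** n_gaps):
    bits, glued = describe(mask)
    print(bits, "glues", glued if glued else "nothing")
```

    For example, the pattern 101 glues a to b and c to d, and leaves the gap between b and c as a split.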

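    >>> lst = ['a', 'b', 'c', 'd']
    ... output = []
    ... n_gaps = len(lst) - 1  # one binary digit per gap between adjacent letters
    ... for i in range(2**n_gaps):
    ...     output.append([])
    ...     # zero-padded binary form of i, e.g. 3 -> "011"
    ...     # (formatting the int directly avoids the precision loss of going through float)
    ...     s = "{:0{}b}".format(i, n_gaps)
    ...     lstcopy = lst[:]
    ...     for j, c in enumerate(s):
    ...         if c == "1":
    ...             # glue letter j onto letter j+1 and carry the merged word forward
    ...             lstcopy[j+1] = lstcopy[j] + lstcopy[j+1]
    ...         else:
    ...             # gap j is a split, so the word ending at j is complete
    ...             output[-1].append(lstcopy[j])
    ...     output[-1].append(lstcopy[-1])  # the last word is always emitted
    ... output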
    [['a', 'b', 'c', 'd'],
     ['a', 'b', 'cd'],
     ['a', 'bc', 'd'],
     ['a', 'bcd'],
     ['ab', 'c', 'd'],
     ['ab', 'cd'],
     ['abc', 'd'],
     ['abcd']]
    >>> 
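
    The same idea can be wrapped in a reusable generator. This is a minimal sketch rather than the code above: the function name segmentations is hypothetical, and it reads the bits with shifts instead of a formatted string, which also covers a one-character or empty input.

```python
def segmentations(chars):
    """Yield every way to split chars into consecutive, non-empty words."""
    chars = list(chars)
    if not chars:
        return  # nothing to segment
    n_gaps = len(chars) - 1
    for mask in range(2 ** n_gaps):
        words = [chars[0]]
        for j in range(n_gaps):
            # Read the bit for gap j starting from the most significant bit,
            # so results come out in the same order as the 000..111 listing.
            if (mask >> (n_gaps - 1 - j)) & 1:
                words[-1] += chars[j + 1]   # glued: extend the current word
            else:
                words.append(chars[j + 1])  # split: start a new word
        yield words


for words in segmentations(['a', 'b', 'c', 'd']):
    print(words)
```

    The number of results doubles with every extra character (2^(n-1) for n characters), so a generator avoids building the whole list in memory when only some segmentations are needed.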
    
