python: padding punctuation with white spaces (keeping punctuation)

青春惊慌失措 2021-02-05 19:16

What is an efficient way to pad punctuation with whitespace?

input:

s = 'bla. bla? bla.bla! bla...'

desired output:

          


        
3 answers
  • 2021-02-05 19:56

    You can use a regular expression to match the punctuation characters you are interested in and surround them with spaces, then use a second step to collapse multiple spaces anywhere in the document:

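    import re

    s = 'bla. bla? bla.bla! bla...'
    # Surround each listed punctuation character with spaces.
    s = re.sub(r'([.,!?()])', r' \1 ', s)
    # Collapse runs of two or more whitespace characters into one space.
    s = re.sub(r'\s{2,}', ' ', s)
    print(s)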
    

    Result:

    bla . bla ? bla . bla ! bla . . .
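The two substitutions above leave a single trailing space when the string ends in punctuation. A minimal sketch that wraps them in a helper and trims the ends; the function name `pad_punctuation` is hypothetical and not part of the answer:

```python
import re

def pad_punctuation(s):
    # Surround each listed punctuation character with spaces.
    s = re.sub(r'([.,!?()])', r' \1 ', s)
    # Collapse runs of whitespace into one space, then trim both ends.
    return re.sub(r'\s{2,}', ' ', s).strip()

print(repr(pad_punctuation('bla. bla? bla.bla! bla...')))
# 'bla . bla ? bla . bla ! bla . . .'
```

Note that the collapse step also rewrites runs of tabs or newlines as a single space, which may or may not be what you want.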
    
  • 2021-02-05 20:09

    If you use Python 3, you can build a translation table with str.maketrans() and apply it with str.translate():

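    import string
    text = 'bla. bla? bla.bla! bla...'  # example input from the question
    # Map every ASCII punctuation character to itself padded with spaces.
    text = text.translate(str.maketrans({key: " {0} ".format(key) for key in string.punctuation}))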
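The translation table pads every character in `string.punctuation`, so adjacent punctuation and existing spaces end up with doubled spaces. A rough sketch of one way to tidy that up, assuming it is acceptable to normalize all whitespace; the helper name `pad_all_punctuation` is made up for illustration:

```python
import string

def pad_all_punctuation(text):
    # Map every ASCII punctuation character to itself wrapped in spaces.
    table = str.maketrans({ch: f' {ch} ' for ch in string.punctuation})
    # split() with no arguments splits on any run of whitespace, so joining
    # with one space collapses doubled spaces and trims the ends.
    return ' '.join(text.translate(table).split())

print(repr(pad_all_punctuation('bla. bla? bla.bla! bla...')))
# 'bla . bla ? bla . bla ! bla . . .'
print(repr(pad_all_punctuation("don't")))
# "don ' t"
```

As the second call shows, `string.punctuation` includes apostrophes and hyphens, so this approach is broader than the `[.,!?()]` character class used in the regex answers.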
    
  • 2021-02-05 20:13

    This will add exactly one space if one is not present, and will not ruin existing spaces or other white-space characters:

    import re
    s = re.sub(r'(?<! )(?=[.,!?()])|(?<=[.,!?()])(?! )', ' ', s)
    

    This works by finding a zero-width position between a punctuation character and a non-space, and inserting a space there.
    Note that it does add a space at the beginning or end of the string when the string starts or ends with punctuation; this is easily avoided by changing the negative look-arounds to the positive forms (?<=[^ ]) and (?=[^ ]).

    See it in action: http://ideone.com/BRx7w
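The effect of swapping the look-arounds can be seen side by side. This is a small sketch, not the answerer's own code; the names `PUNCT`, `EDGES`, and `NO_EDGES` are chosen here for illustration:

```python
import re

PUNCT = '[.,!?()]'
# Negative look-arounds: also match at the very start and end of the string.
EDGES = re.compile(rf'(?<! )(?={PUNCT})|(?<={PUNCT})(?! )')
# Positive look-arounds: require an actual non-space neighbour, so the
# string edges are left alone.
NO_EDGES = re.compile(rf'(?<=[^ ])(?={PUNCT})|(?<={PUNCT})(?=[^ ])')

s = '(bla.bla)'
print(repr(EDGES.sub(' ', s)))     # ' ( bla . bla ) '
print(repr(NO_EDGES.sub(' ', s)))  # '( bla . bla )'
```

Both patterns only insert spaces at zero-width positions, so existing tabs, newlines, and single spaces are left untouched.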
