A pythonic way to insert a space before capital letters

二次信任 提交于 2019-12-31 12:06:27

问题


I've got a file whose format I'm altering via a python script. I have several camel cased strings in this file where I just want to insert a single space before the capital letter - so "WordWordWord" becomes "Word Word Word".

My limited regex experience just stalled out on me - can someone think of a decent regex to do this, or (better yet) is there a more pythonic way to do this that I'm missing?


回答1:


You could try:

>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWord")
'Word Word Word'



回答2:


If there are consecutive capitals, then Gregs result could not be what you look for, since the \w consumes the caracter in front of the captial letter to be replaced.

>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWWWWWWWord")
'Word Word WW WW WW Word'

A look-behind would solve this:

>>> re.sub(r"(?<=\w)([A-Z])", r" \1", "WordWordWWWWWWWord")
'Word Word W W W W W W Word'



回答3:


Perhaps shorter:

>>> re.sub(r"\B([A-Z])", r" \1", "DoIThinkThisIsABetterAnswer?")



回答4:


Have a look at my answer on .NET - How can you split a “caps” delimited string into an array?

Edit: Maybe better to include it here.

re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', text)

For example:

"SimpleHTTPServer" => ["Simple", "HTTP", "Server"]



回答5:


Maybe you would be interested in one-liner implementation without using regexp:

''.join(' ' + char if char.isupper() else char.strip() for char in text).strip()



回答6:


With regexes you can do this:

re.sub('([A-Z])', r' \1', str)

Of course, that will only work for ASCII characters, if you want to do Unicode it's a whole new can of worms :-)




回答7:


If you have acronyms, you probably do not want spaces between them. This two-stage regex will keep acronyms intact (and also treat punctuation and other non-uppercase letters as something to add a space on):

re_outer = re.compile(r'([^A-Z ])([A-Z])')
re_inner = re.compile(r'(?<!^)([A-Z])([^A-Z])')
re_outer.sub(r'\1 \2', re_inner.sub(r' \1\2', 'DaveIsAFKRightNow!Cool'))

The output will be: 'Dave Is AFK Right Now! Cool'




回答8:


I think regexes are the way to go here, but just to give a pure python version without (hopefully) any of the problems ΤΖΩΤΖΙΟΥ has pointed out:

def splitCaps(s):
    result = []
    for ch, next in window(s+" ", 2):
        result.append(ch)
        if next.isupper() and not ch.isspace():
            result.append(' ')
    return ''.join(result)

window() is a utility function I use to operate on a sliding window of items, defined as:

import collections, itertools

def window(it, winsize, step=1):
    it=iter(it)  # Ensure we have an iterator
    l=collections.deque(itertools.islice(it, winsize))
    while 1:  # Continue till StopIteration gets raised.
        yield tuple(l)
        for i in range(step):
            l.append(it.next())
            l.popleft()



回答9:


I agree that the regex solution is the easiest, but I wouldn't say it's the most pythonic.

How about:

text = 'WordWordWord'
new_text = ''

for i, letter in enumerate(text):
    if i and letter.isupper():
        new_text += ' '

    new_text += letter


来源:https://stackoverflow.com/questions/199059/a-pythonic-way-to-insert-a-space-before-capital-letters

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!