I've got a file whose format I'm altering via a Python script. I have several camel-cased strings in this file where I just want to insert a single space before each capital letter.
You might be interested in a one-liner that doesn't use regular expressions:
''.join(' ' + char if char.isupper() else char.strip() for char in text).strip()
With regexes you can do this:
re.sub(r'([A-Z])', r' \1', text)
Of course, that will only work for ASCII characters; if you want to handle Unicode, it's a whole new can of worms :-)
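To illustrate the ASCII caveat, here is a minimal sketch comparing the `[A-Z]` pattern with a version built on `str.isupper()`, which recognises non-ASCII capitals in Python 3. The function names and the sample string are made up for this example.

```python
import re

def space_caps_ascii(text):
    # The pattern from the answer above: only A-Z count as capitals.
    return re.sub(r'([A-Z])', r' \1', text)

def space_caps_unicode(text):
    # Hypothetical helper: str.isupper() also covers non-ASCII capitals.
    return ''.join(' ' + ch if ch.isupper() else ch for ch in text)

# "\u00c9" is an accented capital E, "\u00e8" an accented lowercase e.
sample = '\u00c9cole\u00c9l\u00e8ve'

print(space_caps_ascii(sample) == sample)   # the ASCII pattern leaves it untouched
print(space_caps_unicode(sample).strip())   # the isupper() version splits the words
```

Both versions put a space before a leading capital as well, so strip the result if that matters.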
I agree that the regex solution is the easiest, but I wouldn't say it's the most pythonic.
How about:
text = 'WordWordWord'
new_text = ''
for i, letter in enumerate(text):
    # Skip the first character so the result has no leading space.
    if i and letter.isupper():
        new_text += ' '
    new_text += letter
Have a look at my answer on .NET - How can you split a “caps” delimited string into an array?
Edit: Maybe better to include it here.
re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', text)
For example:
"SimpleHTTPServer" => ["Simple", "HTTP", "Server"]
You could try:
>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWord")
'Word Word Word'
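The two regexes above agree on plain camel case but differ on runs of capitals. A small sketch to compare them, wrapping each pattern in a function (the function names are made up for this example):

```python
import re

def split_lookahead(text):
    # Pattern from the .NET-derived answer: a space goes after a lowercase
    # letter that precedes a capital, or after a capital that precedes a
    # capital followed by a lowercase letter.
    return re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', text)

def split_simple(text):
    # Pattern from the "You could try" answer: a space goes between any
    # word character and a following capital.
    return re.sub(r"(\w)([A-Z])", r"\1 \2", text)

print(split_lookahead("SimpleHTTPServer"))  # Simple HTTP Server
print(split_simple("SimpleHTTPServer"))     # Simple HT TP Server
print(split_simple("WordWordWord"))         # Word Word Word
```

The simpler pattern consumes two characters per match, so inside an acronym it splits every other pair; the lookahead version keeps the acronym intact.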
I think regexes are the way to go here, but just to give a pure-Python version without (hopefully) any of the problems ΤΖΩΤΖΙΟΥ has pointed out:
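def splitCaps(s):
    result = []
    # Pad with a space so the last character still has a successor.
    for ch, nxt in window(s + " ", 2):
        result.append(ch)
        # Insert a space before a capital unless one is already there.
        if nxt.isupper() and not ch.isspace():
            result.append(' ')
    return ''.join(result)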
window() is a utility function I use to operate on a sliding window of items, defined as:
import collections, itertools
def window(it, winsize, step=1):
    it = iter(it)  # Ensure we have an iterator
    l = collections.deque(itertools.islice(it, winsize))
    while True:  # Continue until the input is exhausted.
        yield tuple(l)
        for i in range(step):
            try:
                l.append(next(it))
            except StopIteration:
                return  # End the generator cleanly (required in Python 3.7+)
            l.popleft()
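For the two-item window used here, the same pairing can also be sketched with the built-in `zip`, which avoids the helper entirely. This is a swapped-in alternative, not the answer's own code, and `split_caps_zip` is a name invented for the example.

```python
def split_caps_zip(s):
    """Insert a space before each capital that is not already preceded by one."""
    result = []
    # zip(s, s[1:] + " ") yields (char, next_char) pairs; the trailing
    # space gives the last character a successor.
    for ch, nxt in zip(s, s[1:] + " "):
        result.append(ch)
        if nxt.isupper() and not ch.isspace():
            result.append(" ")
    return "".join(result)

print(split_caps_zip("WordWordWord"))  # Word Word Word
print(split_caps_zip("Word Word"))     # Word Word
```

Like the window-based version, this puts a space before every capital, so an acronym such as "HTTP" comes out as separate letters.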