Using Python I need to insert a newline character into a string every 64 characters. In Perl it\'s easy:
s/(.{64})/$1\\n/
How could this be
without regexp:
def insert_newlines(string, every=64):
lines = []
for i in xrange(0, len(string), every):
lines.append(string[i:i+every])
return '\n'.join(lines)
shorter but less readable (imo):
def insert_newlines(string, every=64):
return '\n'.join(string[i:i+every] for i in xrange(0, len(string), every))
The code above is for Python 2.x. For Python 3.x, you want to use range
and not xrange
:
def insert_newlines(string, every=64):
lines = []
for i in range(0, len(string), every):
lines.append(string[i:i+every])
return '\n'.join(lines)
def insert_newlines(string, every=64):
return '\n'.join(string[i:i+every] for i in range(0, len(string), every))
I suggest the following method:
"\n".join(re.findall("(?s).{,64}", s))[:-1]
This is, more-or-less, the non-RE method taking advantage of the RE engine for the loop.
On a very slow computer I have as a home server, this gives:
$ python -m timeit -s 's="0123456789"*100; import re' '"\n".join(re.findall("(?s).{,64}", s))[:-1]'
10000 loops, best of 3: 130 usec per loop
AndiDog's method:
$ python -m timeit -s "s='0123456789'*100; import re" 're.sub("(?s)(.{64})", r"\1\n", s)'
1000 loops, best of 3: 800 usec per loop
gurney alex's 2nd/Michael's method:
$ python -m timeit -s "s='0123456789'*100" '"\n".join(s[i:i+64] for i in xrange(0, len(s), 64))'
10000 loops, best of 3: 148 usec per loop
I don't consider the textwrap
method to be correct for the specification of the question, so I won't time it.
Changed answer because it was incorrect (shame on me!)
Just for the fun of it, the RE-free method using itertools
. It rates third in speed, and it's not Pythonic (too lispy):
"\n".join(
it.imap(
s.__getitem__,
it.imap(
slice,
xrange(0, len(s), 64),
xrange(64, len(s)+1, 64)
)
)
)
$ python -m timeit -s 's="0123456789"*100; import itertools as it' '"\n".join(it.imap(s.__getitem__, it.imap(slice, xrange(0, len(s), 64), xrange(64, len(s)+1, 64))))'
10000 loops, best of 3: 182 usec per loop
I'd go with:
import textwrap
s = "0123456789"*100
print '\n'.join(textwrap.wrap(s, 64))
taking @J.F. Sebastian's solution one step further, and this is nearly criminal :-)
import textwrap
s = "0123456789"*100
print textwrap.fill(s, 64)
look ma... no regexes! because as you know... http://regex.info/blog/2006-09-15/247
thanks for introducing us to textwrap module... although it's been in Python since 2.3, i've never been aware of it until now (yes, i'll admit that publically)!!
Tiny, not nice:
"".join(s[i:i+64] + "\n" for i in xrange(0,len(s),64))
itertools has a nice recipe for a function grouper
that is good for this, particularly if your final slice is less than 64 chars and you don't want a slice error:
def grouper(iterable, n, fillvalue=None):
"Collect data into fixed-length chunks or blocks"
# grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
args = [iter(iterable)] * n
return izip_longest(fillvalue=fillvalue, *args)
Use like this:
big_string = <YOUR BIG STRING>
output = '\n'.join(''.join(chunk) for chunk in grouper(big_string, 64))