I am trying to map the str.split function to an array of strings. Namely, I would like to split all the strings in a string array that follow the same format. Any ideas?
Use map in conjunction with a function. A neat way is to use a lambda function:
>>> a=['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']
>>> map(lambda s: s.split(), a)
[['2011-12-22', '46:31:11'], ['2011-12-20', '20:19:17'],
['2011-12-20', '01:09:21']]
map(lambda x: x.split(), a) works, but a list comprehension, [x.split() for x in a], is much clearer in this case.
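For anyone reading this on Python 3 (an assumption; the transcripts in this thread are Python 2), map returns a lazy iterator rather than a list, so the two forms look roughly like this. The variable names are only illustrative:

```python
# Assumes Python 3, where map() returns an iterator instead of a list.
a = ['2011-12-22 46:31:11', '2011-12-20 20:19:17', '2011-12-20 01:09:21']

# map with a lambda: wrap in list() to materialise the results
via_map = list(map(lambda s: s.split(), a))

# list comprehension: builds the list directly, no wrapper needed
via_listcomp = [s.split() for s in a]

# Both approaches produce the same list of [date, time] pairs
print(via_map == via_listcomp)
```

The list comprehension needs no list() call, which is one more reason it tends to read more clearly here.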
Though it isn't well known, there is a function designed just for this purpose, operator.methodcaller:
>>> from operator import methodcaller
>>> a = ['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']
>>> map(methodcaller("split", " "), a)
[['2011-12-22', '46:31:11'], ['2011-12-20', '20:19:17'], ['2011-12-20', '01:09:21']]
This technique is faster than equivalent approaches using lambda expressions.
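As a rough sketch of how methodcaller forwards its extra arguments to the method, here is a Python 3 version (an assumption; the transcript above is Python 2). The names split_on_space and split_first_dash are made up for illustration:

```python
# Assumes Python 3; list() is needed because map() is lazy there.
from operator import methodcaller

a = ['2011-12-22 46:31:11', '2011-12-20 20:19:17', '2011-12-20 01:09:21']

# methodcaller('split', ' ') builds a callable f where f(s) == s.split(' ')
split_on_space = methodcaller('split', ' ')
pairs = list(map(split_on_space, a))

# Further positional arguments are passed through too:
# here f(s) == s.split('-', 1), i.e. split on the first '-' only
split_first_dash = methodcaller('split', '-', 1)
year_and_rest = [split_first_dash(s) for s in a]

print(pairs[0])
print(year_and_rest[0])
```

Note that split(' ') and split() are not quite the same call: the former splits on single space characters only, the latter on any run of whitespace.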
Community wiki answer comparing the timings of the other answers given:
>>> from timeit import Timer
>>> t = {}
>>> t['methodcaller'] = Timer("map(methodcaller('split', ' '), a)", "from operator import methodcaller; a=['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']")
>>> t['lambda'] = Timer("map(lambda s: s.split(), a)", "a = ['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']")
>>> t['listcomp'] = Timer("[s.split() for s in a]", "a = ['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']")
>>> for name, timer in t.items():
...     print '%s: %.2f usec/pass' % (name, 1000000 * timer.timeit(number=100000)/100000)
...
listcomp: 2.08 usec/pass
methodcaller: 2.87 usec/pass
lambda: 3.10 usec/pass
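The comparison above is Python 2. A possible Python 3 adaptation is sketched below, assuming Python 3.5+ for the globals argument of timeit. The map calls are wrapped in list() because map is lazy on Python 3 and would otherwise do no splitting at all. No timings are shown since they depend on the machine and interpreter; the names statements and results are illustrative:

```python
# Assumes Python 3.5+ (timeit's globals parameter); not the original benchmark.
from operator import methodcaller
from timeit import timeit

a = ['2011-12-22 46:31:11', '2011-12-20 20:19:17', '2011-12-20 01:09:21']

# Statements to compare; list() forces the lazy map to actually run
statements = {
    'methodcaller': "list(map(methodcaller('split', ' '), a))",
    'lambda': "list(map(lambda s: s.split(), a))",
    'listcomp': "[s.split() for s in a]",
}

number = 100000
namespace = {'a': a, 'methodcaller': methodcaller}

results = {}
for name, stmt in statements.items():
    total_seconds = timeit(stmt, globals=namespace, number=number)
    # Convert total seconds to microseconds per pass
    results[name] = 1000000 * total_seconds / number
    print('%s: %.2f usec/pass' % (name, results[name]))
```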
This is how I do it:
>>> a=['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']
>>> map(str.split, a)
[['2011-12-22', '46:31:11'], ['2011-12-20', '20:19:17'], ['2011-12-20', '01:09:21']]
This only works when you know you have a list of str (i.e. not just a list of things that implement the split method in a way compatible with str). It also relies on using the default behaviour of split(), which splits on any whitespace, rather than using x.split(' '), which splits on space characters only (i.e. not tabs, newlines, or other whitespace), because you can't pass another argument using this method. For calling behaviour more complex than this, I would use a list comprehension.
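To make those two caveats concrete, here is a small sketch assuming Python 3. The input strings are hypothetical (a tab and a double space are added on purpose), and a bytes object stands in for "something with a split method that is not a str":

```python
# Assumes Python 3; the data below is invented to show the edge cases.
a = ['2011-12-22\t46:31:11', '2011-12-20  20:19:17']

# Default split(): any run of whitespace (tab, multiple spaces) is a separator
default_split = list(map(str.split, a))

# split(' '): only single space characters separate, so the tab is kept
# and the double space yields an empty string
space_split = [s.split(' ') for s in a]

# str.split is bound to str, so other types with their own split() are rejected
try:
    list(map(str.split, [b'2011-12-22 46:31:11']))
    accepts_bytes = True
except TypeError:
    accepts_bytes = False

print(default_split)
print(space_split)
print(accepts_bytes)
```

A list comprehension such as [s.split(' ') for s in a] sidesteps both limits, since it calls each object's own split method and takes whatever arguments you like.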