Does map() iterate through the list like \"for\" would? Is there a value in using map vs for?
If so, right now my code looks like this:
for item in item
The main advantage of map
is when you want to get the result of some calculation on every element in a list. For example, this snippet doubles every value in a list:
map(lambda x: x * 2, [1,2,3,4]) #=> [2, 4, 6, 8]
It is important to note that map
returns a new list with the results. It does not modify the original list in place.
To do the same thing with for
, you would have to create an empty list and add an extra line to the for
body to add the result of each calculation to the new list. The map
version is more concise and functional.
You could use map instead of the for
loop you've shown, but since you do not appear to use the result of item.my_func()
, this is not recommended. map
should be used if you want to apply a function without side-effects to all elements of a list. In all other situations, use an explicit for-loop.
Also, as of Python 3.0 map
returns a generator, so in that case map
will not behave the same (unless you explicitly evaluate all elements returned by the generator, e.g. by calling list on it).
Edit: kibibu asks in the comments for a clarification on why map
's first argument should not be a function with side effects. I'll give answering that question a shot:
map
is meant to be passed a function f
in the mathematical sense. Under such circumstances it does not matter in which order f
is applied to the elements of the second argument (as long as they are returned in their original order, of course). More importantly, under those circumstances map(g, map(f, l))
is semantically equivalent to map(lambda x: g(f(x)), l)
, regardless of the order in which f
and g
are applied to their respective inputs.
E.g., it doesn't matter whether map
returns and iterator or a full list at once. However, if f
and/or g
cause side effects, then this equivalence is only guaranteed if the semantics of map(g, map(f, l))
are such that at any stage g
is applied to the first n elements returned by map(f, l)
before map(f, l)
applies f
to the (n + 1)st element of l
. (Meaning that map
must perform the laziest possible iteration---which it does in Python 3, but not in Python 2!)
Going one step further: even if we assume the Python 3 implementation of map
, the semantic equivalence may easily break down if the output of map(f, l)
is e.g. passed through itertools.tee before being supplied to the outer map
call.
The above discussion may seem of a theoretic nature, but as programs become more complex, they become more difficult to reason about and therefore harder to debug. Ensuring that some things are invariant alleviates that problem somewhat, and may in fact prevent a whole class of bugs.
Lastly, map
reminds many people of its truly functional counterpart in various (purely) functional languages. Passing it a "function" with side effects will confuse those people. Therefore, seeing as the alternative (i.e., using an explicit loop) is not harder to implement than a call to map
, it is highly recommended that one restricts use of map
to those cases in which the function to be applied does not cause side effects.
Use an explicit for-loop when you don't need a list of results back (eg. functions with side-effects).
Use a list comprehension when you do need a list of results back (eg. functions that return a value based directly on the input).
Use map() when you're trying to convince Lisp users that Python is worth using. ;)
You can switch Your map
to some cool threaded OR multiprocessing OR distributed computing framework if You need to. Disco is an example of distributed, resistant to failures erlang-and-python based framework. I configured it on 2 boxes of 8 cores and now my program runs 16 times faster, thanks to the Disco cluster, however I had to rewrite my program from list comprehensions and for loops to map/reduce.
It's the same deal to write a program using for loops and list comprehensions and map/reduce, but when You need it to run on a cluster, You can do it almost for free if You used map/reduce. If You didn't, well, You will have to rewrite.
Beware: as far as I know, python 2.x returns a list instead of an iterator from map. I've heard this can be bypassed by using iter.imap()
(never used it though).
map(lambda item: item.my_func(), items)
There is a slight semantic difference, which is probably closed in python language spec. The map is explicitly parallelizable, while for only in special situations. Code can break out from for, but only escape with exception from map.
In my opinion map shouldn't also guarantee order of function application while for must. AFAIK no python implementation is currently able to do this auto-parallelization.