Can you tell me when to use these vectorization methods with basic examples?
I see that map is a Series method whereas the rest are DataFrame methods.
FOMO:
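The following example shows apply and applymap applied to a DataFrame.
map is a Series method: it works element by element on a single Series. Before pandas 2.1 you could not call map on a DataFrame; since 2.1, applymap has been renamed to DataFrame.map and the old name is deprecated.
The thing to remember is that apply can do anything applymap can, but apply has eXtra options. Note that apply passes a whole row or column to the function, while applymap passes one element at a time.
The X factor options are axis and result_type, where result_type only works when axis=1 (the function is applied to each row, across the columns).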
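import numpy as np
from pandas import DataFrame

df = DataFrame(1, columns=list('abc'),
               index=list('1234'))
print(df)
f = lambda x: np.log(x)
# note: on pandas >= 2.1, applymap is deprecated in favor of DataFrame.map
print(df.applymap(f))  # f is called on every element of the DataFrame
print(np.log(df))  # same result: NumPy ufuncs work on the whole DataFrame directly
print(df.applymap(np.sum))  # no reduction: np.sum sees one scalar at a time, so values are unchanged
# apply takes extra options that applymap does not
print(df.apply(f))  # same result as applymap, computed column by column
print(df.apply(sum, axis=1))  # reducing example: one sum per row
print(df.apply(np.log, axis=1))  # np.log is elementwise, so nothing is reduced and the shape is kept
print(df.apply(lambda x: [1, 2, 3], axis=1, result_type='expand'))  # expand each returned list into columns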
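Because every cell in the example above is 1, it is hard to see what axis and result_type actually change. Here is a minimal sketch with distinct values; the column names, index labels and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical data: distinct values make the axis direction visible
df = pd.DataFrame({"a": [1, 2], "b": [10, 20], "c": [100, 200]},
                  index=["r1", "r2"])

# axis=0 (the default): the function receives each column as a Series
col_totals = df.apply(lambda col: col.sum())

# axis=1: the function receives each row as a Series
row_totals = df.apply(lambda row: row.sum(), axis=1)

# Without result_type, a list returned per row stays packed:
# the result is a Series whose values are lists
packed = df.apply(lambda row: [row["a"], row["b"] + row["c"]], axis=1)

# result_type='expand' spreads each returned list across columns 0, 1, ...
expanded = df.apply(lambda row: [row["a"], row["b"] + row["c"]],
                    axis=1, result_type="expand")

print(col_totals)
print(row_totals)
print(packed)
print(expanded)
```

col_totals is indexed by column name and row_totals by row label, which is a quick way to check which direction a given axis value works in.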
As a side note, the Series map method should not be confused with Python's built-in map function.
The first is called on a Series to map each of its values; the second applies a function to every item of any iterable.
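To make the difference concrete, here is a small sketch comparing the two. The sample Series and the lookup dict are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series(["cat", "dog", "bird"], index=["x", "y", "z"])

# Series.map with a dict: values missing from the dict become NaN
legs = s.map({"cat": 4, "dog": 4})

# Series.map with a function: returns a new Series and keeps the index
lengths = s.map(len)

# Built-in map: works on any iterable and returns a lazy iterator
# with no index; it has to be consumed, for example with list()
lazy = map(len, ["cat", "dog", "bird"])
as_list = list(lazy)

print(legs)
print(lengths)
print(as_list)
```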
Lastly, don't confuse the DataFrame apply method with the groupby apply method.
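A rough sketch of that last distinction, using an invented team/score table: DataFrame.apply hands the function a whole column (or row), while groupby apply hands it one group at a time.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "team": ["red", "red", "blue", "blue"],
    "score": [1, 5, 10, 30],
})

# DataFrame.apply: the function sees the entire "score" column at once
overall_range = df[["score"]].apply(lambda col: col.max() - col.min())

# groupby(...).apply: the function sees the scores of one team at a time
per_team_range = df.groupby("team")["score"].apply(lambda s: s.max() - s.min())

print(overall_range)
print(per_team_range)
```

The first result has one entry per column, the second has one entry per group.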