I'm having a problem trying to get a character count column of the string values in another column, and haven't figured out how to do it efficiently.
for i
Here's one way to do it.
In [3]: df
Out[3]:
  string
0   abcd
1  abcde
In [4]: df['len'] = df['string'].str.len()
In [5]: df
Out[5]:
  string  len
0   abcd    4
1  abcde    5
Pandas has a vectorised string method for this: str.len(). To create the new column you can write:
df['char_length'] = df['string'].str.len()
For example:
>>> df
  string
0   abcd
1  abcde
>>> df['char_length'] = df['string'].str.len()
>>> df
  string  char_length
0   abcd            4
1  abcde            5
This should be considerably faster than looping over the DataFrame with a Python for loop.
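To make the comparison concrete, here is a minimal sketch that computes the lengths both ways on the two-row frame from the example and checks that the results agree. The name loop_lengths is made up for illustration, and the actual speed difference depends on the size of your data and your pandas version, so it is worth timing on your own DataFrame.

```python
import pandas as pd

# Same two-row frame as in the example above.
df = pd.DataFrame({"string": ["abcd", "abcde"]})

# Explicit Python-level loop: one len() call per value.
# (loop_lengths is a hypothetical name used only for this comparison.)
loop_lengths = [len(s) for s in df["string"]]

# Vectorised accessor: a single call over the whole column.
df["char_length"] = df["string"].str.len()

# Both approaches should produce the same lengths.
print(loop_lengths == df["char_length"].tolist())
```

In IPython you could wrap each approach in %timeit to measure the difference on a larger column.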
Many other familiar string methods from Python have been introduced to Pandas and are available through the same .str accessor. For example, lower (for converting to lowercase letters), count (for counting occurrences of a particular substring), and replace (for swapping one substring with another).
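As a rough illustration of those three methods, the sketch below applies them to a small Series. The sample values and variable names are invented for this example. Note that str.count treats its argument as a regular expression, and that regex=False is passed to str.replace here so the pattern is taken literally.

```python
import pandas as pd

# Hypothetical sample data, not from the question.
s = pd.Series(["Abcd", "abcde ab"])

# Convert every value to lowercase.
lowered = s.str.lower()

# Count case-sensitive occurrences of the pattern "ab" in each value.
ab_counts = s.str.count("ab")

# Swap the literal substring "ab" for "XY" in each value.
replaced = s.str.replace("ab", "XY", regex=False)

print(lowered.tolist())
print(ab_counts.tolist())
print(replaced.tolist())
```

Each call returns a new Series, so the results can be assigned to new columns in the same way as df['char_length'] above.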