Question
I have a pandas DataFrame column and I need to modify any entry of that column that starts with a 2. Right now I'm using this, which works but is very, very slow:
for i, row in df.iterrows():
    if df['IDnumber'][i].startswith('2') == True:
        '''Do some stuff'''
I feel (read: know) there's a more efficient way to do this without using a for loop, but I can't seem to find it.
Other things I've tried:
if df[df['IDnumber'].str[0]] == '2':
    '''Do some stuff'''
if df[df['IDnumber'].str.startswith('2')] == True:
    '''Do some stuff'''
Which respectively give the errors:
KeyError: "['2' '2' '2' ..., '1' '1' '1'] not in index"
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Answer 1:
Do you mean you want to filter rows where the value from a string column starts with some character?
>>> df
    foobar
0     0foo
1     1foo
2     2foo
3     3foo
4     4foo
5     5foo
6     0bar
7     1bar
8     2bar
9     3bar
10    4bar
11    5bar
>>> df.loc[(df.foobar.str.startswith('2'))]
   foobar
2    2foo
8    2bar
If so, select the matching rows first and then loop over only those:
>>> beginning_with_2 = df.loc[(df.foobar.str.startswith('2'))]
>>> for i, row in beginning_with_2.iterrows():
...     print(row.foobar)
2foo
2bar
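If the goal is to modify the matching entries rather than just read them, the loop can usually be dropped entirely by assigning through the boolean mask with .loc. The following is only a sketch: the sample data and the modification itself (prefixing the value with 'X') are made up for illustration, and only the column name 'IDnumber' comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'IDnumber' is from the question.
df = pd.DataFrame({"IDnumber": ["2001", "1002", "2003", "3004"]})

# Boolean mask: True for rows whose string starts with '2'.
# na=False makes missing values count as non-matching instead of propagating NaN.
mask = df["IDnumber"].str.startswith("2", na=False)

# Illustrative modification (an assumption): prefix matching IDs with 'X'.
# The assignment touches only the rows selected by the mask, with no Python loop.
df.loc[mask, "IDnumber"] = "X" + df.loc[mask, "IDnumber"]

print(df["IDnumber"].tolist())  # ['X2001', '1002', 'X2003', '3004']
```

This also explains the two errors in the question: df[...] with a Series of characters is treated as a list of column labels (hence the KeyError), and comparing a whole DataFrame inside an if statement has no single truth value (hence the ValueError). The mask is meant to be used for indexing, not as an if condition.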
Source: https://stackoverflow.com/questions/47080315/efficiently-search-for-first-character-of-a-string-in-a-pandas-dataframe