I encountered the error
'>' not supported between instances of 'str' and 'int'
while trying to print the below lines in P
This happens because the values in the 'text' column are of type str, and you are comparing a str with an int. You can quickly check the type of the values in the 'text' column:
print(type(survey_df_clean['text'].iloc[0]))
To make the comparison, convert the column to int first:
survey_df_clean[survey_df_clean['text'].astype(int)>30]
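A minimal sketch of the failure and the fix described above. The DataFrame here is a made-up stand-in for survey_df_clean (the real data are not shown in the question), and it assumes every entry is a string of digits; astype(int) would fail on anything else.

```python
import pandas as pd

# Hypothetical stand-in for survey_df_clean; values are invented for illustration.
# dtype=object keeps the entries as plain Python str objects.
survey_df_clean = pd.DataFrame({"text": ["12", "45", "31"]}, dtype=object)

# Comparing the raw string column with an int raises TypeError.
try:
    survey_df_clean[survey_df_clean["text"] > 30]
    raised = False
except TypeError:
    raised = True

# Converting to int first makes the comparison valid.
filtered = survey_df_clean[survey_df_clean["text"].astype(int) > 30]
print(raised)
print(filtered)
```

Note that the filtered rows still hold the original strings, since only the mask was computed on the converted values.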
This message suggests that you are trying to compare a string object (str) with an integer (int).
The expression survey_df_clean['text'] will probably return strings, so you cannot compare it directly with the number 30. If you want to compare the length of each entry, you can use the pandas.Series.str.len() operation.
If this field should actually contain integers, you can use pandas.to_numeric to convert it from str to int.
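The two options above can be sketched as follows. Both Series are hypothetical examples, since the question does not say whether 'text' holds free text or numbers stored as strings.

```python
import pandas as pd

# Case 1 (assumed): the column holds free text and you want entries longer than 30 characters.
answers = pd.Series(
    ["short", "a much longer free-text answer that exceeds thirty characters"],
    dtype=object,
)
long_entries = answers[answers.str.len() > 30]

# Case 2 (assumed): the column holds numbers stored as strings.
numbers_as_text = pd.Series(["7", "42"], dtype=object)
as_numbers = pd.to_numeric(numbers_as_text)
large = as_numbers[as_numbers > 30]

print(long_entries)
print(large)
```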
First make sure that all values of survey_df_clean['text'] have the same type. If you want to convert them to numeric, do this:
survey_df_clean['text'] = pd.to_numeric(survey_df_clean['text'])
Then do this:
survey_df_clean.loc[survey_df_clean['text']>30].shape
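If some entries cannot be parsed as numbers, pd.to_numeric raises by default. One possible way around that, not part of the answer above, is errors="coerce", which turns unparseable entries into NaN. A sketch with invented data:

```python
import pandas as pd

# Hypothetical data: one entry ("n/a") is not a number.
df = pd.DataFrame({"text": ["10", "35", "n/a", "50"]}, dtype=object)

# errors="coerce" replaces unparseable values with NaN instead of raising.
df["text"] = pd.to_numeric(df["text"], errors="coerce")

# Count how many entries could not be converted.
n_bad = df["text"].isna().sum()

# NaN compares as False, so those rows are simply excluded by the filter.
shape = df.loc[df["text"] > 30].shape
print(n_bad, shape)
```

Whether silently dropping such rows is acceptable depends on the data, so it is worth checking n_bad before relying on the result.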
I had the same error message when trying to use that conditional. What intrigued me was that the same command had run correctly in another notebook.
The difference was in how I read the CSV file. This was the troublesome one:
df=pd.read_csv('data.csv')
When I added the decimal argument, it worked:
df=pd.read_csv('data.csv', decimal=',')
Obviously, it'll depend on how your data is organized. ;)
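A small sketch of why the decimal argument matters. The CSV content and the ";" separator are assumptions made for illustration, since the actual data.csv is not shown.

```python
import io
import pandas as pd

# Hypothetical file content: semicolon-separated, with a comma as the decimal mark.
csv_text = "name;value\na;12,5\nb;40,25\n"

# Without decimal=",", values such as "12,5" are not parsed as numbers.
default = pd.read_csv(io.StringIO(csv_text), sep=";")

# With decimal=",", the column is parsed as float and can be compared with 30.
with_decimal = pd.read_csv(io.StringIO(csv_text), sep=";", decimal=",")
over_30 = with_decimal[with_decimal["value"] > 30]

print(default["value"].dtype)
print(with_decimal["value"].dtype)
print(over_30)
```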
survey_df_clean['text'] might contain NaN or str values somewhere. To find out:
survey_df_clean['text'].isnull().sum()
If there are any, take care of them first, then apply:
print (survey_df_clean[survey_df_clean['text']>30].shape)
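A sketch of that check on a made-up mixed column. isnull().sum() only counts missing values, so the isinstance count is added here as one possible way to spot stray strings. How to "take care of them" depends on the data; converting and dropping missing rows is only one assumed option.

```python
import pandas as pd

# Hypothetical column mixing ints, a string, and a missing value.
df = pd.DataFrame({"text": [12, "45", None, 31]})

# Number of missing values.
n_missing = df["text"].isnull().sum()

# Number of entries that are Python strings.
n_str = df["text"].map(lambda v: isinstance(v, str)).sum()

# One possible cleanup: convert to numeric, then drop rows that are still missing.
cleaned = df.assign(text=pd.to_numeric(df["text"])).dropna(subset=["text"])
shape = cleaned[cleaned["text"] > 30].shape
print(n_missing, n_str, shape)
```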