I have the following DataFrame:
daysago line_race rating rw wrating
line_date
2007
I ran this code and it works; you can try it yourself.
import pandas as pd
data = pd.read_excel('file.xlsx')
If a column name contains a space or a special character, access it as a quoted string in brackets,
as in this code:
data = data[data['expire/t'].notnull()]
print(data)
If the column name is a single word with no spaces or special characters, you can access it directly as an attribute.
data = data[data.expire != 0]
print(data)
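To make the difference between the two access styles concrete, here is a minimal sketch. The DataFrame contents are made up for illustration; only the column names `expire/t` and `expire` come from the answer above.

```python
import pandas as pd

# Hypothetical data: 'expire/t' contains a character that rules out attribute access.
data = pd.DataFrame({
    "expire/t": [1.0, None, 3.0],
    "expire": [0, 5, 7],
})

# Bracket access works for any column name.
not_null = data[data["expire/t"].notnull()]

# Attribute access works only when the name is a valid Python identifier.
non_zero = data[data.expire != 0]

print(not_null)
print(non_zero)
```

Bracket access is the safer default, since attribute access also fails when a column name collides with an existing DataFrame method or attribute.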
Another way, applying the same filter across every column of the DataFrame:
for column in df.columns:
    df = df[df[column] != 0]
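The loop over columns can probably be replaced by a single vectorized mask. This is a sketch on made-up data, not part of the original answer; it assumes every column is numeric.

```python
import pandas as pd

# Hypothetical data with a zero in each column.
df = pd.DataFrame({"a": [1, 0, 3, 4], "b": [5, 6, 0, 8]})

# Loop version: filter one column at a time.
looped = df
for column in looped.columns:
    looped = looped[looped[column] != 0]

# Vectorized version: keep rows where every column is non-zero.
vectorized = df[(df != 0).all(axis=1)]

print(vectorized)
```

Both versions keep the same rows here; the vectorized form builds one boolean mask instead of re-slicing the DataFrame once per column.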
Example:
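import numpy as np

def z_score(data, count):
    # Drop rows whose value lies more than `threshold` standard deviations from the column mean.
    threshold = 3
    for column in data.columns:
        mean = np.mean(data[column])
        std = np.std(data[column])
        for i in data[column]:
            zscore = (i - mean) / std
            if np.abs(zscore) > threshold:
                count = count + 1  # running total of outlier values found
                data = data[data[column] != i]
    return data, count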
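A vectorized variant of the same idea, as a rough sketch: the function name `drop_outliers` and the sample data are hypothetical. Unlike the loop above, it computes each column's mean and standard deviation once, on the unfiltered data, so results can differ when outliers in one column affect the statistics of another.

```python
import pandas as pd

def drop_outliers(data, threshold=3):
    # Population z-score per column (ddof=0 matches np.std's default).
    zscores = (data - data.mean()) / data.std(ddof=0)
    # A row is an outlier if any of its columns exceeds the threshold.
    is_outlier = (zscores.abs() > threshold).any(axis=1)
    removed = int(is_outlier.sum())
    return data[~is_outlier], removed

# Hypothetical data: one extreme value in column 'x'.
frame = pd.DataFrame({
    "x": [10.0] * 19 + [1000.0],
    "y": [float(i) for i in range(20)],
})
cleaned, removed = drop_outliers(frame)
print(removed, len(cleaned))
```

Note that `removed` counts rows here, whereas `count` in the loop version is incremented once per outlying value.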
Just to add another solution, particularly useful if you are using the new pandas accessors: the other solutions replace the original DataFrame and lose the accessors, whereas this one drops the rows in place.
df.drop(df.loc[df['line_race']==0].index, inplace=True)
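A small sketch of what "in place" means here, on made-up data. It only shows that the in-place drop keeps the same DataFrame object, so every existing reference sees the change; `alias` is a hypothetical second name for the same object.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({"line_race": [0, 3, 0, 7], "rating": [10, 20, 30, 40]})
alias = df  # second reference to the same object

# Drop the matching rows without creating a new DataFrame.
df.drop(df.loc[df["line_race"] == 0].index, inplace=True)

same_object = alias is df
remaining = list(alias["line_race"])
print(same_object, remaining)
```

By contrast, `df = df[df.line_race != 0]` binds the name `df` to a new object and leaves any other reference pointing at the unfiltered data.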
If I'm understanding correctly, it should be as simple as:
df = df[df.line_race != 0]
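For completeness, the same filter can also be written with `DataFrame.query`. This is a minimal sketch; the values are invented, and only the column names come from the question.

```python
import pandas as pd

# Hypothetical values for two of the question's columns.
df = pd.DataFrame({
    "line_race": [11, 0, 9, 0],
    "rating": [56.0, 71.0, 59.3, 45.0],
})

# Boolean mask, as in the answer above.
masked = df[df.line_race != 0]

# Equivalent string expression.
queried = df.query("line_race != 0")

print(masked)
```

Both forms return a new DataFrame and keep the original index labels; call `reset_index(drop=True)` afterwards if you want consecutive row numbers.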