Suppose I have a pandas data frame surveyData:
I want to normalize the data in each column by performing:
surveyData_norm = (surveyData - surveyData.mean()) / (surveyData.max() - surveyData.min())
A simple and more efficient way:
Pre-calculate the mean, maximum, and minimum once, using dropna() to skip missing data:
mean_age = surveyData.Age.dropna().mean()
max_age = surveyData.Age.dropna().max()
min_age = surveyData.Age.dropna().min()
surveyData['Age'] = surveyData['Age'].apply(lambda x: (x - mean_age) / (max_age - min_age))
This approach works.
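The same formula can also be applied to every column at once without apply, since pandas arithmetic between a data frame and a per-column Series aligns on column names. The following is only a sketch under assumptions: the sample values, the Income column, and the helper name mean_normalize are made up for illustration, and all columns are assumed to be numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; values and the Income column are illustrative only.
surveyData = pd.DataFrame({
    "Age": [20.0, 30.0, np.nan, 40.0],
    "Income": [1000.0, 2000.0, 3000.0, 5000.0],
})

def mean_normalize(df):
    """Return (df - mean) / (max - min), computed column by column."""
    # mean(), max() and min() skip NaN by default (skipna=True),
    # so missing values do not distort the statistics.
    return (df - df.mean()) / (df.max() - df.min())

surveyData_norm = mean_normalize(surveyData)
print(surveyData_norm)
```

Missing entries stay NaN in the result. A column whose values are all identical has max - min equal to zero, so it would need separate handling.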