I need a simple example of calculating RMSE with a pandas DataFrame. Suppose there is a function that, on each iteration of a loop, returns a true value and a predicted value:
def fun (data):
Question 1
This depends on the format of data. I'd expect you already have your true values, so this function is just a pass-through for them.
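As a rough sketch of the pass-through case, assuming the true and predicted values are already available as pairs (the `pairs` list below is made-up data, not taken from the question), the DataFrame could be built in one step:

```python
import pandas as pd

# Made-up (true, predicted) pairs standing in for whatever fun(data) returns.
pairs = [(1.0, 2.0), (4.0, 4.0), (8.0, 6.0)]

# Column 'x' holds the true values and 'p' holds the predictions.
df = pd.DataFrame(pairs, columns=['x', 'p'])
```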
Question 2
With pandas
((df.p - df.x) ** 2).mean() ** .5
With numpy (this assumes df contains only the two columns p and x)
(np.diff(df.values) ** 2).mean() ** .5
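To see that the two expressions agree, here is a small self-contained check. The sample values are invented for illustration, and the numpy variant selects the two columns explicitly so that extra columns in df would not affect the result:

```python
import numpy as np
import pandas as pd

# Invented data: 'x' holds true values, 'p' holds predictions.
df = pd.DataFrame({'p': [2.0, 4.0, 6.0], 'x': [1.0, 4.0, 8.0]})

# pandas: square the per-row errors, average them, take the square root.
rmse_pandas = ((df.p - df.x) ** 2).mean() ** .5

# numpy: np.diff subtracts adjacent columns row by row (x - p here).
rmse_numpy = (np.diff(df[['p', 'x']].values) ** 2).mean() ** .5
```

The per-row errors are 1, 0 and -2, so both expressions evaluate to the square root of 5/3.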
Question 1
I understand you already have a dataframe df. To add the new values in new rows do the following:
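# dataset is the iterable of inputs passed to fun
for data in dataset:
    trueVal, predVal = fun(data)
    auxDf = pd.DataFrame([[predVal, trueVal]], columns=['p', 'x'])
    # DataFrame.append returned a new frame and was removed in pandas 2.0,
    # so concatenate and assign the result back to df
    df = pd.concat([df, auxDf], ignore_index=True)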
Question 2
To calculate RMSE from df, I recommend using scikit-learn's mean_squared_error function.
from sklearn.metrics import mean_squared_error
realVals = df.x
predictedVals = df.p
mse = mean_squared_error(realVals, predictedVals)
# If you want the root mean squared error, take the square root of the MSE
# (the squared=False argument was deprecated in scikit-learn 1.4 and later removed)
rmse = mse ** 0.5
Make sure neither column contains null values: mean_squared_error raises a ValueError if its input contains NaN.
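One way to handle nulls, sketched here with invented data, is to drop incomplete rows before computing the error. Whether dropping (rather than filling) is appropriate depends on your data, so treat this as an assumption:

```python
import numpy as np
import pandas as pd

# Invented data with one missing prediction and one missing true value.
df = pd.DataFrame({'p': [2.0, np.nan, 6.0, 5.0], 'x': [1.0, 4.0, 8.0, np.nan]})

# Keep only rows where both the prediction and the true value are present.
clean = df.dropna(subset=['p', 'x'])

rmse = ((clean.p - clean.x) ** 2).mean() ** .5
```

Note that the pandas mean() skips NaN by default, so the pure pandas expression would not raise an error here. Dropping the rows explicitly makes it clear which rows the result is based on.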