I would like to apply the Henze-Zirkler multivariate normality test in Python 3.x, ideally in a Jupyter notebook.
I have fitted a VAR model to my data, and I would like to test whether the residuals of the fitted model are multivariate normal.
How can I do this in a Jupyter notebook using Python?
This is a second answer, since I discovered this method later. If you do not want to import the R library into Python, you can instead run R code from Python and retrieve its output, i.e. call R functions through Python as follows:
import rpy2.robjects as robjects
from rpy2.robjects import r
from rpy2.robjects.numpy2ri import numpy2ri
from rpy2.robjects.packages import importr
import numpy as np
import pandas as pd
Suppose that resi is a pandas DataFrame, for example:
# Create data
resi = pd.DataFrame(np.random.random((108, 2)), columns=['Number1','Number2'])
Then the code is as follows:
# Convert the DataFrame from Python to R
# First, take the values of the DataFrame as a NumPy array
resi1=np.array(resi, dtype=float)
# Taking the variable from Python to R
r_resi = numpy2ri(resi1)
# Creating this variable in R (from python)
r.assign("resi", r_resi)
# Calling libraries in R
r('library("MVN")')
# Calling a function in R (from python)
r("res <- hzTest(resi, qqplot = F)")
# Retrieving information from R to Python
r_result = r("res")
# Printing the output in python
print(r_result)
This will generate the output:
Henze-Zirkler's Multivariate Normality Test
---------------------------------------------
data : resi
HZ : 2.841424
p-value : 1.032563e-06
Result : Data are not multivariate normal.
---------------------------------------------
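If R is not available at all, the statistic can also be cross-checked in plain NumPy. The following is only a sketch, assuming the usual formulation of the Henze-Zirkler statistic with a lognormal approximation for the p-value; the function name henze_zirkler is hypothetical, it raises an error on a singular covariance matrix (a choice made for this sketch), and it is not guaranteed to match the MVN output digit for digit.

```python
import math

import numpy as np


def henze_zirkler(data):
    """Return (HZ statistic, p-value) for an (n, p) array of observations."""
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    centered = x - x.mean(axis=0)
    # Covariance estimate that divides by n (not n - 1).
    cov = centered.T @ centered / n
    if np.linalg.matrix_rank(cov) < p:
        raise ValueError("covariance matrix is singular")
    cov_inv = np.linalg.inv(cov)

    # Squared Mahalanobis distances: to the mean (d_j) and between pairs (d_jk).
    gram = centered @ cov_inv @ centered.T
    d_j = np.diag(gram)
    d_jk = d_j[:, None] + d_j[None, :] - 2.0 * gram

    # Smoothing parameter and its square.
    b = (1 / math.sqrt(2)) * ((2 * p + 1) / 4) ** (1 / (p + 4)) * n ** (1 / (p + 4))
    b2 = b * b

    hz = n * (
        np.exp(-b2 / 2 * d_jk).sum() / n**2
        - 2 * (1 + b2) ** (-p / 2) * np.exp(-b2 / (2 * (1 + b2)) * d_j).mean()
        + (1 + 2 * b2) ** (-p / 2)
    )

    # Mean and variance of HZ under the null hypothesis.
    a = 1 + 2 * b2
    wb = (1 + b2) * (1 + 3 * b2)
    mu = 1 - a ** (-p / 2) * (1 + p * b2 / a + p * (p + 2) * b2**2 / (2 * a**2))
    si2 = (
        2 * (1 + 4 * b2) ** (-p / 2)
        + 2 * a ** (-p) * (1 + 2 * p * b2**2 / a**2 + 3 * p * (p + 2) * b2**4 / (4 * a**4))
        - 4 * wb ** (-p / 2) * (1 + 3 * p * b2**2 / (2 * wb) + p * (p + 2) * b2**4 / (2 * wb**2))
    )

    # Lognormal approximation: the p-value is the upper tail probability at HZ.
    log_mean = math.log(math.sqrt(mu**4 / (si2 + mu**2)))
    log_sd = math.sqrt(math.log((si2 + mu**2) / mu**2))
    if hz <= 0:
        return float(hz), 1.0
    z = (math.log(hz) - log_mean) / log_sd
    return float(hz), 0.5 * math.erfc(z / math.sqrt(2))


# Same kind of data as above: 108 uniform draws in two columns.
rng = np.random.default_rng(0)
resi = rng.random((108, 2))
hz, p_value = henze_zirkler(resi)
print("HZ:", hz, "p-value:", p_value)
```

A small p-value is read the same way as in the R output: multivariate normality is rejected.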
There is an R package called MVN that already implements this test.
The first thing you have to do is make MVN available from Python, as described here.
Then go to your Jupyter notebook and fit the VAR(1) model to your data like so:
# Fit VAR(1) Model
results = Model.fit(1)
results.summary()
Store the residuals as resi
resi=results.resid
Then load MVN from R via rpy2:
# Call function from R
import os
os.environ['R_USER'] = r'...\Lib\site-packages\rpy2'  # raw string, so the backslashes are kept literally
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
pandas2ri.activate()
from rpy2.robjects.packages import importr
MVN = importr("MVN", lib_loc = "C:/.../R/win-library/3.3")
After importing MVN, you can run the normality test like so:
MVNresult = MVN.hzTest(resi, qqplot=False)
If you run
type(MVNresult)
you will find that it is an
rpy2.robjects.methods.RS4
In this case, you will find this link very helpful for the details.
Next, list the slot names of the result:
tuple(MVNresult.slotnames())
This shows the slot names:
('HZ', 'p.value', 'dname', 'dataframe')
You can then get the values like so:
np.array(MVNresult.slots[tuple(MVNresult.slotnames())[i]])[0]
where i = 0, 1, 2, 3 indexes the slots 'HZ', 'p.value', 'dname', 'dataframe'.
So if the p-value (i = 1) is less than 0.05, multivariate normality of the residuals (resi) is rejected at the 5% significance level.
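The slot lookup and the 5% decision can be wrapped in two small helpers so that values are read by name rather than by index. This is a minimal sketch: slots_to_dict, rejects_normality and FakeRS4 are hypothetical names, FakeRS4 is only a stand-in that lets the snippet run without R, and it assumes each slot value can be indexed with [0] (wrap it in np.array first, as above, if that is not the case). The numbers are the ones from the printed output shown earlier in this thread.

```python
def slots_to_dict(s4_object):
    """Map every slot name of an RS4-like object to the first element of that slot."""
    return {name: s4_object.slots[name][0] for name in s4_object.slotnames()}


def rejects_normality(p_value, alpha=0.05):
    """Return True when multivariate normality is rejected at significance level alpha."""
    return p_value < alpha


class FakeRS4:
    """Stand-in with the two members used above (illustration only, not part of rpy2)."""

    def __init__(self, slots):
        self.slots = slots

    def slotnames(self):
        return iter(self.slots)


# Stand-in result; with R available, pass MVNresult instead.
fake_result = FakeRS4({"HZ": [2.841424], "p.value": [1.032563e-06], "dname": ["resi"]})
values = slots_to_dict(fake_result)
print(values["HZ"], values["p.value"], rejects_normality(values["p.value"]))
```

With the real object, slots_to_dict(MVNresult)["p.value"] should correspond to the i = 1 lookup.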
Source: https://stackoverflow.com/questions/46266215/how-to-apply-henze-zirklers-multivariate-normality-test-in-jupyter-notebook-wit