How to apply Henze-Zirkler's Multivariate Normality Test in Jupyter notebook with rpy2

I would like to apply Henze-Zirkler's multivariate normality test in Python 3.x, and I was wondering whether I can do so from a Jupyter notebook.

I have fitted a VAR model to my data, and I would now like to test whether the residuals from this fitted VAR model are normally distributed.

How can I do this in a Jupyter notebook using Python?

This is another answer, since I discovered this method later. If you do not want to import the R library into Python, you can instead run R code from Python and pull its output back into Python, as follows:

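import rpy2.robjects as robjects
from rpy2.robjects import r
# numpy2ri is the rpy2 2.x name of this converter; newer rpy2 releases may name it differently
from rpy2.robjects.numpy2ri import numpy2ri
from rpy2.robjects.packages import importr
import numpy as np
import pandas as pd  # needed for the DataFrame below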

Suppose that resi is a pandas DataFrame, for example:

# Create data
resi = pd.DataFrame(np.random.random((108, 2)), columns=['Number1','Number2'])

Then the code is as follows:

# Converting the DataFrame from Python to R

# First take the values of the DataFrame as a NumPy array
resi1 = np.array(resi, dtype=float)

# Taking the variable from Python to R
r_resi = numpy2ri(resi1)

# Creating this variable in R (from python)
r.assign("resi", r_resi)

# Calling libraries in R 
r('library("MVN")')

# Calling a function in R (from Python); FALSE is spelled out because F can be reassigned in R
r("res <- hzTest(resi, qqplot = FALSE)")

# Retrieving information from R to Python
r_result = r("res")

# Printing the output in python
print(r_result)

This will generate the output:

  Henze-Zirkler's Multivariate Normality Test
---------------------------------------------
  data : resi

  HZ      : 2.841424
  p-value : 1.032563e-06

  Result  : Data are not multivariate normal.
---------------------------------------------
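For readers who want a cross-check that does not need R, the statistic can also be sketched directly in NumPy. This is not the MVN code. It follows the published Henze-Zirkler formula together with the usual log-normal approximation for the p-value, and the function name henze_zirkler is hypothetical. Treat it as a sketch under those assumptions; its values may differ slightly from what hzTest prints.

```python
import math
import numpy as np

def henze_zirkler(data):
    """Return (HZ statistic, p-value) for an (n, p) array. Sketch only."""
    X = np.asarray(data, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # Maximum-likelihood covariance estimate (divides by n, not n - 1)
    S = Xc.T @ Xc / n
    S_inv = np.linalg.inv(S)
    # Squared Mahalanobis distances: to the mean (Dj) and pairwise (Djk)
    Y = Xc @ S_inv @ Xc.T
    Dj = np.diag(Y)
    Djk = Dj[:, None] + Dj[None, :] - 2.0 * Y
    # Smoothing parameter beta and its square
    b = (1 / math.sqrt(2)) * ((2 * p + 1) * n / 4) ** (1 / (p + 4))
    b2 = b * b
    hz = n * (
        np.exp(-b2 / 2 * Djk).sum() / n**2
        - 2 * (1 + b2) ** (-p / 2) * np.exp(-b2 / (2 * (1 + b2)) * Dj).sum() / n
        + (1 + 2 * b2) ** (-p / 2)
    )
    # Mean and variance of HZ under the null hypothesis
    a = 1 + 2 * b2
    wb = (1 + b2) * (1 + 3 * b2)
    mu = 1 - a ** (-p / 2) * (1 + p * b2 / a + p * (p + 2) * b2**2 / (2 * a**2))
    si2 = (
        2 * (1 + 4 * b2) ** (-p / 2)
        + 2 * a ** (-p) * (1 + 2 * p * b2**2 / a**2 + 3 * p * (p + 2) * b2**4 / (4 * a**4))
        - 4 * wb ** (-p / 2) * (1 + 3 * p * b2**2 / (2 * wb) + p * (p + 2) * b2**4 / (2 * wb**2))
    )
    # Parameters of the log-normal distribution with that mean and variance
    pmu = math.log(math.sqrt(mu**4 / (si2 + mu**2)))
    psi = math.sqrt(math.log((si2 + mu**2) / mu**2))
    # Upper-tail probability of the log-normal at hz
    p_value = 0.5 * math.erfc((math.log(hz) - pmu) / (psi * math.sqrt(2)))
    return float(hz), p_value
```

With the DataFrame above, henze_zirkler(resi.values) returns the pair (HZ, p-value). Because the statistic is built from Mahalanobis distances, it does not change when the data are shifted or rescaled.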

R already has a package that implements this test: MVN.

The first step is to make MVN importable from Python; the importr call further down does this.

Then go to your Jupyter notebook and fit the VAR(1) model to your data like so:

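# Fit VAR(1) model
# Model is assumed to be an already-constructed VAR model object
# (presumably statsmodels' VAR built from your data)

results = Model.fit(1)
results.summary()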

Store the residuals as resi

resi=results.resid

Then set up rpy2 and import MVN:

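# Call functions from R
import os
# Raw string, so that "\r" in the path is not read as a carriage return
os.environ['R_USER'] = r'...\Lib\site-packages\rpy2'
import numpy as np
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
pandas2ri.activate()  # let rpy2 convert pandas objects automatically

from rpy2.robjects.packages import importr

# lib_loc points at the R library folder where MVN is installed
MVN = importr("MVN", lib_loc = "C:/.../R/win-library/3.3")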

After importing MVN you can run the normality test like so:

MVNresult = MVN.hzTest(resi, qqplot=False)

If you run

type(MVNresult)

you will find that it is an

rpy2.robjects.methods.RS4

The result is therefore an R S4 object, so its values are stored in named slots (the original answer linked to a page explaining the details).

Next, list the slot names:

tuple(MVNresult.slotnames())

This shows the slot names:

('HZ', 'p.value', 'dname', 'dataframe')

You can then read the values like so:

np.array(MVNresult.slots[tuple(MVNresult.slotnames())[i]])[0]

where i = 0, 1, 2, 3 selects 'HZ', 'p.value', 'dname' and 'dataframe' respectively.
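Indexing slots by position is easy to get wrong, so it can help to collect them into a dictionary keyed by slot name. The helper below is only a sketch: s4_slots_to_dict is a hypothetical name, and it relies on nothing beyond the slotnames() method and the slots mapping already used above. The _FakeS4 class is a stand-in so that the example runs without R; with rpy2 you would pass MVNresult instead.

```python
import numpy as np

def s4_slots_to_dict(s4_obj):
    """Map each slot name to its contents as a NumPy array."""
    return {name: np.array(s4_obj.slots[name]) for name in s4_obj.slotnames()}

class _FakeS4:
    """Hypothetical stand-in mimicking the two RS4 members used above."""
    def __init__(self, slots):
        self.slots = dict(slots)

    def slotnames(self):
        return tuple(self.slots)

# Example values taken from the hzTest printout shown in the other answer
demo = _FakeS4({"HZ": [2.841424], "p.value": [1.032563e-06], "dname": ["resi"]})
values = s4_slots_to_dict(demo)
print(values["HZ"][0], values["p.value"][0])
```

Looking values up by name, for example values["p.value"][0], avoids having to remember which index belongs to which slot.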

So if the p-value (i = 1) is less than 0.05, the null hypothesis of multivariate normality is rejected at the 5% significance level, i.e. the residuals (resi) are not multivariate normal.
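The decision rule can be written as a small helper. This is a sketch; the name reject_normality and the default alpha of 0.05 are illustrative choices, not part of MVN or rpy2.

```python
def reject_normality(p_value, alpha=0.05):
    """Return True when the null hypothesis of multivariate normality
    is rejected at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    return p_value < alpha

# p-value from the hzTest printout shown in the other answer
print(reject_normality(1.032563e-06))  # True
```

A p-value at or above alpha means only that the test found no evidence against normality, not that normality has been proven.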

Source: https://stackoverflow.com/questions/46266215/how-to-apply-henze-zirklers-multivariate-normality-test-in-jupyter-notebook-wit

Tags

python-3.x

statistics

jupyter-notebook

multivariate-testing