Question
I have a problem implementing the multivariate Gaussian distribution for anomaly detection.
I took the formula from Andrew Ng's notes:
http://www.holehouse.org/mlclass/15_Anomaly_Detection.html
Here is the problem I face.
Suppose I have a data set with 2 features and m training examples (i.e. n=2), and I want to compute the multivariate Gaussian probability p(x;mu,sigma), which should be an [m*1] matrix because it gives one estimated Gaussian value per training example, taking feature correlation into account.
The problem is that I am unable to use the formula to produce the [m*1] matrix.
I am using Octave to develop the algorithm.
Below is a snapshot showcasing my problem
Consider the multiplication in the part of the equation marked with the red boundary; the factor to the left of the red boundary is just a real number.
Please help me understand where I am going wrong.
Thanks
Answer 1:
I think you got the dimensions wrong.
Let's assume you have 2-dimensional (n=2) data of m instances. We can store this data as an n-by-m matrix in MATLAB (columns are data instances, rows represent features/dimensions). In this case we have:
X is the data matrix of size nxm; each instance x = X(:,i) is a vector of size nx1 (a column vector in our convention).
mu is the mean vector (mu = mean(X,2)). It is also a column vector, of the same size as an instance: nx1.
sigma is the covariance matrix (sigma = cov(X.')). It has size nxn (it describes how each dimension co-varies with every other dimension).
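For reference, here is the standard multivariate Gaussian density written with these conventions. I am assuming this is the formula from the linked notes, with Sigma standing for the covariance matrix sigma above and |Sigma| for its determinant:

```latex
p(x;\mu,\Sigma) =
  \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}
  \exp\!\left(-\tfrac{1}{2}\,(x-\mu)^{T}\,\Sigma^{-1}\,(x-\mu)\right)
```

It is evaluated once per instance x = X(:,i), so m instances give m values.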
So the part that you highlighted in red involves expressions of the following sizes:
= ([nx1] - [nx1])' * [nxn] * ([nx1] - [nx1])
= [1xn] * [nxn] * [nx1]
= 1x1
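To get the [m*1] result asked about, that scalar can be computed once per instance. Below is a minimal Octave sketch under the conventions above (X stored as n-by-m). The example data and the names p, normalizer, D and p_vec are made up for illustration, and the density is assumed to be the standard multivariate Gaussian formula.

```octave
% Sketch: multivariate Gaussian density for every column of X.
% Assumes X is n-by-m (features in rows, instances in columns).
X = [1 2 3 4 5; 2 1 4 3 6];      % made-up example data: n = 2, m = 5
[n, m] = size(X);

mu = mean(X, 2);                 % n-by-1 mean vector
sigma = cov(X.');                % n-by-n covariance matrix

% Constant factor in front of the exponential (a real number).
normalizer = 1 / ((2*pi)^(n/2) * sqrt(det(sigma)));

p = zeros(m, 1);                 % one density value per instance
for i = 1:m
    d = X(:, i) - mu;            % n-by-1 deviation of instance i
    q = d' * (sigma \ d);        % [1xn] * [nxn] * [nx1] = 1x1
    p(i) = normalizer * exp(-0.5 * q);
end

% Vectorized equivalent: subtract mu from every column, then sum per column.
% (Relies on broadcasting; older MATLAB needs bsxfun(@minus, X, mu).)
D = X - mu;
p_vec = normalizer * exp(-0.5 * sum(D .* (sigma \ D), 1)).';
```

Both p and p_vec are m-by-1 column vectors. The backslash solve sigma \ d is used in place of inv(sigma) * d; it gives the same quantity without forming the inverse explicitly.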
Source: https://stackoverflow.com/questions/26611816/multivariate-gaussian-distribution-formula-implementation