Multivariate Gaussian distribution formula implementation

Question

I have a problem implementing the multivariate Gaussian distribution for anomaly detection.

I took the formula from Andrew Ng's notes:

http://www.holehouse.org/mlclass/15_Anomaly_Detection.html

Below is the problem I face.

Suppose I have a data set with 2 features and m training examples, i.e. n=2, and I want to compute the multivariate Gaussian probability p(x;mu;sigma). The result should be an [m*1] matrix, because it gives one estimated Gaussian value per training example, taking feature correlation into account.

The problem is that I am unable to use the formula to produce the [m*1] matrix.

I am using Octave as my IDE to develop the algorithm.

Below is a snapshot showcasing my problem

[Screenshot: the formula, with part of it outlined in red]

I am considering only the multiplication inside the red boundary, because the term to the left of the red boundary is just a real number.

Please help me understand where I am going wrong.

Thanks

Answer 1:

I think you got the dimensions wrong.

Let's assume you have 2-dimensional (n=2) data with m instances. We can store this data as an n-by-m matrix in MATLAB (columns are data instances, rows represent features/dimensions). In this case we have:

X is the data matrix of size nxm; each instance x = X(:,i) is a vector of size nx1 (a column vector in our convention).
mu is the mean vector (mu = mean(X,2)). This is also a column vector, of the same size as an instance: nx1.
sigma is the covariance matrix (sigma = cov(X.')). It has size nxn (it describes how each dimension co-varies with every other dimension).
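
To make these sizes concrete, here is a minimal Octave sketch of the three definitions above. The values in X are made up purely for illustration (they are not from the question); only the shapes matter.

```octave
% Hypothetical data: n = 2 features (rows), m = 5 instances (columns).
X = [1 2 3 4 5;
     2 1 4 3 6];

[n, m] = size(X);      % n = 2, m = 5

mu    = mean(X, 2);    % average across columns -> n-by-1 column vector
sigma = cov(X.');      % cov expects one observation per row -> n-by-n

% Check the shapes described in the text.
assert(isequal(size(X(:, 1)), [n 1]));   % a single instance is n-by-1
assert(isequal(size(mu),      [n 1]));
assert(isequal(size(sigma),   [n n]));
```

One thing to be aware of: by default cov divides by m-1, while the course notes divide by m, so the two estimates differ by a factor of (m-1)/m. For large m the difference is small.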

So the part that you highlighted in red involves expressions of the following sizes:

 = ([nx1] - [nx1])' * [nxn] * ([nx1] - [nx1])
 = [1xn] * [nxn] * [nx1]
 = 1x1
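
Since that product is 1x1 for a single instance, the [m*1] result comes from evaluating it once per column of X. The sketch below is one possible way to do this in Octave, assuming the highlighted part is the quadratic form (x - mu)' * inv(sigma) * (x - mu) from the standard multivariate Gaussian density. The data values and the variable names c, d, q, D, p and p2 are hypothetical, introduced only for this example.

```octave
% Hypothetical data: n = 2 features (rows), m = 5 instances (columns).
X = [1 2 3 4 5;
     2 1 4 3 6];
[n, m] = size(X);

mu    = mean(X, 2);    % n-by-1
sigma = cov(X.');      % n-by-n

% Normalising constant: 1 / ((2*pi)^(n/2) * sqrt(det(sigma))), a scalar.
c = 1 / ((2*pi)^(n/2) * sqrt(det(sigma)));

% Loop version: one 1-by-1 quadratic form per instance.
p = zeros(m, 1);
for i = 1:m
  d = X(:, i) - mu;          % n-by-1
  q = d.' * (sigma \ d);     % [1xn] * [nx1] -> 1x1; sigma \ d equals inv(sigma) * d
  p(i) = c * exp(-0.5 * q);
end

% Vectorised version: the same m quadratic forms without a loop.
D  = X - repmat(mu, 1, m);       % n-by-m, column i is X(:,i) - mu
q  = sum(D .* (sigma \ D), 1);   % 1-by-m, entry i is the quadratic form for instance i
p2 = c * exp(-0.5 * q).';        % m-by-1

assert(isequal(size(p), [m 1]));
assert(max(abs(p - p2)) < 1e-9);
```

Multiplying the full matrices directly, (X - mu)' * inv(sigma) * (X - mu), would give an m-by-m matrix instead; only its diagonal holds the per-instance values, which is why the sketch sums column-wise rather than forming that product.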

Source: https://stackoverflow.com/questions/26611816/multivariate-gaussian-distribution-formula-implementation

Tags

matlab

machine-learning

octave

cluster-analysis

gaussian