问题
I am using Gaussian kernel to estimate a pdf of a data based on the equation
where K(.) is Gaussian kernel, data is a given vector. z is bin from 1 to 256. size of bin is 1.I implemented by matlab code. However, the result show the amplitude of my pdf estimation (blue color) is not similar with real pdf of data. Could you see my code and give me some comment about my code?
MATLAB CODE
function pdf_est=KDE()
close all;
%%Random values of 20 pixels, range=[1 256]
data=randi([1 256],1,20);
%% Estimate histogram%%%%%
pdf_est=zeros(1,256);
z=256;
for i=1:z
for j=1:length(data)
pdf_est(i)=pdf_est(i)+Gaussian(i-data(j));
end
end
%% Plot real histogram 1 to 256; binsize=1;
hold on
plot(imhist(uint8(data))./length(data),'r');
%% Plot histogram estimation
plot(pdf_est./length(data),'b');
hold off
function K=Gaussian(x)
sigma=1;
K=1./(sqrt(2*pi)*sigma)*exp(-x^2./(2*sigma^2));
RESULT BLUE is my result and RED is real pdf
回答1:
You have two issues:
- A 1-unit displacement between blue and red plots.
- The blue spikes are wider and less tall than the red ones.
How to solve each issue:
This is caused by a possible confusion between the data range 0,...,255 and the indexing interval 1,...,256. Since your data represents an 8-bit image, values should be 0,...,255 (not 1,...,256). Your plotted horizontal axis should then be 0,...,255. Same goes for the
i
variable in thefor
line. And then, since Matlab indexing starts at 1, you should usei+1
when indexingpdf_est
.This is normal behaviour. You are assuming unit variance in your kernel. To see taller blue spikes you could reduce
sigma
to make the kernel narrower and taller. But you will never get the exact same height as your data (the necessarysigma
would depend on your data).Actually, you have a tradeoff between height and width, controlled by
sigma
. But the important thing is that the area remains the same for anysigma
. So I suggest plotting the CDF (area) instead of the pdf (area densisty). To do that, plot the accumulated histogram and pdf (using cumsum).
Code modified according to 1:
function pdf_est=KDE()
close all;
%%Random values of 20 pixels, range=[1 256]
data=randi([1 256],1,20)-1; %// changed: "-1"
%% Estimate histogram%%%%%
pdf_est=zeros(1,256);
z=256;
for i=0:z-1 %// changed_ subtracted 1
for j=1:length(data)
pdf_est(i+1)=pdf_est(i+1)+Gaussian(i-data(j)); %// changed: "+1" (twice)
end
end
%% Plot real histogram 1 to 256; binsize=1;
hold on
plot(0:255, imhist(uint8(data))./length(data),'r'); %// changed: explicit x axis
%% Plot histogram estimation
plot(0:255, pdf_est./length(data),'b'); %// changed: explicit x axis
hold off
function K=Gaussian(x)
sigma=1; %// change? Set as desired
K=1./(sqrt(2*pi)*sigma)*exp(-x^2./(2*sigma^2));
Code modified according to 1 and 2:
function pdf_est=KDE()
close all;
%%Random values of 20 pixels, range=[1 256]
data=randi([1 256],1,20)-1; %// changed: "-1"
%% Estimate histogram%%%%%
pdf_est=zeros(1,256);
z=256;
for i=0:z-1 %// changed: subtracted 1
for j=1:length(data)
pdf_est(i+1)=pdf_est(i+1)+Gaussian(i-data(j)); %// changed: "+1" (twice)
end
end
%% Plot real histogram 1 to 256; binsize=1;
hold on
plot(0:255, cumsum(imhist(uint8(data))./length(data)),'r'); %// changed: explicit x axis
%// changed: cumsum
%% Plot histogram estimation
plot(0:255, cumsum(pdf_est./length(data)),'b'); %// changed: explicit x axis
%// changed: cumsum
hold off
function K=Gaussian(x)
sigma=1; %// change? Set as desired
K=1./(sqrt(2*pi)*sigma)*exp(-x^2./(2*sigma^2));
来源:https://stackoverflow.com/questions/28541384/estimate-pdf-of-a-vector-using-gaussian-kernel