Determining probability mass function of random variable

一整个雨季 2021-01-13 04:01

If we have a discrete random variable x and a sample of n observations of it stored in X(n), how can we determine the probability mass function pmf(X) in MATLAB?

6 Answers
  • 2021-01-13 04:33

    The following excerpt from the MATLAB documentation shows how to compute and plot a histogram. For a discrete random variable, the normalized frequency distribution (the histogram counts divided by their total) is the estimate of the pmf.

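    x = -4:0.1:4;          % bin centers
    y = randn(10000,1);    % sample data (continuous here, so the bins discretize it)
    n = hist(y,x);         % count the samples that fall into each bin
    pmf = n/sum(n);        % normalize the counts so that they sum to 1
    plot(x,pmf,'o');       % plot against the bin centers rather than the bin index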
    

    Count the elements in each bin, then divide every bin count by the total number of elements to get your pmf. Test your pmf by adding up all its elements; the result must be one.
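
    For integer-valued data, the same count-and-normalize idea can be sketched with HISTCOUNTS, which normalizes directly. This is only a sketch: it assumes a release that provides HISTCOUNTS, and the sample X and the names edges and vals are made up for illustration.

    X = [1 1 2 3 1 9 12 3 1 2];                  % hypothetical integer sample
    edges = (min(X)-0.5):(max(X)+0.5);           % one unit-width bin centered on each integer
    pmf = histcounts(X, edges, 'Normalization', 'probability');
    vals = min(X):max(X);                        % the value that each bin represents
    disp([vals(pmf > 0); pmf(pmf > 0)])          % observed values and their relative frequencies

    Centering each bin on an integer avoids two neighboring values sharing a bin, which can happen when the number of bins is chosen automatically.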

    Hope I'm right with my statements. It's a long time since ...

  • 2021-01-13 04:35

    You can do this in at least eight different ways (some of them were already mentioned in the other solutions).

    Say we have a sample from a discrete random variable:

    X = randi([-9 9], [100 1]);
    

    Consider these equivalent solutions (note that I don't assume anything about the range of possible values, just that they are integers):

    [V,~,labels] = grp2idx(X);
    mx = max(V);
    
    %# TABULATE (internally uses HIST)
    t = tabulate(V);
    pmf1 = t(:, 3) ./ 100;
    
    %# HIST (internally uses HISTC)
    pmf2 = hist(V, mx)' ./ numel(V);
    
    %# HISTC
    pmf3 = histc(V, 1:mx) ./ numel(V);
    
    %# ACCUMARRAY
    pmf4 = accumarray(V, 1) ./ numel(V);
    
    %# SORT/FIND/DIFF
    pmf5 = diff( find( [diff([0;sort(V)]) ; 1] ) ) ./ numel(V);
    
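    %# SORT/UNIQUE/DIFF ('last' asks for the last index of each value;
    %# newer MATLAB releases return the first index by default)
    [~,idx] = unique( sort(V), 'last' );
    pmf6 = diff([0;idx]) ./ numel(V);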
    
    %# ARRAYFUN
    pmf7 = arrayfun(@(x) sum(V==x), 1:mx)' ./ numel(V);
    
    %# BSXFUN
    pmf8 = sum( bsxfun(@eq, V, 1:mx) )' ./ numel(V);
    

    Note that GRP2IDX was used to get indices starting at 1 that correspond to the entries of each pmf (the mapping back to the original values is given by labels). For one random sample, the result looks like this (shown for pmf1):

    >> [labels pmf1]
    ans =
               -9         0.03
               -8         0.07
               -7         0.04
               -6         0.07
               -5         0.03
               -4         0.06
               -3         0.05
               -2         0.05
               -1         0.06
                0         0.05
                1         0.04
                2         0.07
                3         0.03
                4         0.09
                5         0.08
                6         0.02
                7         0.03
                8         0.08
                9         0.05
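
    GRP2IDX and TABULATE come from the Statistics Toolbox. If that is not available, the same value-to-index mapping can be sketched with the third output of UNIQUE. This is an assumption-laden sketch rather than one of the eight variants above; vals and idx are illustrative names.

    X = randi([-9 9], [100 1]);                  % same kind of sample as above
    [vals, ~, idx] = unique(X(:));               % vals: sorted distinct values, idx: index of each sample into vals
    pmf = accumarray(idx, 1) ./ numel(X);        % count per distinct value, normalized
    disp([vals pmf])                             % value / relative frequency pairs

    Values that never occur in the sample do not appear in vals, so this reports only the observed support.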
    
  • 2021-01-13 04:37

    If I understood correctly, what you need is to estimate the distribution as you would a pdf, except that the variable takes discrete rather than continuous values.

    Calculate the occurrences of different values in X(n) and divide by n. To illustrate what I am saying, please allow me to give an example. Assume that you have 10 observations:

    X = [1 1 2 3 1 9 12 3 1 2]
    

    then your pmf over the values 1 to 12 would look like this (entry k is the estimated probability of the value k):

    pmf(X) = [0.4 0.2 0.2 0 0 0 0 0 0.1 0 0 0.1]
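
    Spelled out as a plain loop, the counting could look like the sketch below. It assumes the observations are positive integers, so that a value can double as an index; n and k are illustrative names.

    X = [1 1 2 3 1 9 12 3 1 2];
    n = numel(X);                                % number of observations
    pmf = zeros(1, max(X));                      % one entry per value from 1 to max(X)
    for k = 1:max(X)
        pmf(k) = sum(X == k) / n;                % occurrences of k divided by n
    end
    disp(pmf)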
    

    edit: this is in principle a frequency histogram, as @zellus has also pointed out

  • 2021-01-13 04:40

    Maybe try making just a function handle so you don't need to store another array:

    pmf = @(x) arrayfun(@(y) nnz(DATA==y)/length(DATA),x);
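
    A small usage sketch, with a made-up DATA vector. Keep in mind that an anonymous function captures the value DATA has when the handle is created, so the handle has to be rebuilt if DATA changes later.

    DATA = [1 1 2 3 1 9 12 3 1 2];               % hypothetical sample
    pmf = @(x) arrayfun(@(y) nnz(DATA==y)/length(DATA), x);
    p = pmf([1 2 9 5])                           % returns [0.4 0.2 0.1 0]; 5 never occurs in DATA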
    
  • 2021-01-13 04:44

    How about this function?

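    function Y = pmf(X)
    % PMF  Empirical probability mass function of the sample X.
    A = tabulate(X);       % columns of A: value, count, percentage
    Y = A(:,3)' / 100;     % convert percentages to probabilities (row vector)
    end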
    

    Is this correct in your opinion?
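
    One way to check it is a call like the sketch below. It assumes the function is saved as pmf.m on the path and that the Statistics Toolbox (which provides TABULATE) is installed. For a vector of positive integers, TABULATE also lists the integers between 1 and max(X) that do not occur, with a count of zero.

    X = [1 1 2 4];                               % hypothetical sample; the value 3 never occurs
    Y = pmf(X)                                   % expected: [0.5 0.25 0 0.25]
    assert(abs(sum(Y) - 1) < 1e-12)              % a valid pmf sums to one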

  • 2021-01-13 04:46

    To add yet another option (since there are a number of functions available to do what you want), you could easily compute the pmf using the function ACCUMARRAY if your discrete values are integers greater than 0:

    pmf = accumarray(X(:),1)./numel(X);
    

    Here's an example:

    >> X = [1 1 1 1 2 2 2 3 3 4];          %# A sample distribution of values
    >> pmf = accumarray(X(:),1)./numel(X)  %# Compute the probability mass function
    
    pmf =
    
        0.4000      %# 1 occurs 40% of the time
        0.3000      %# 2 occurs 30% of the time
        0.2000      %# 3 occurs 20% of the time
        0.1000      %# 4 occurs 10% of the time
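
    If the integers can be zero or negative, one possible workaround (a sketch, with offset and vals as made-up names) is to shift the values so that the smallest one maps to index 1:

    X = [-2 -2 0 1 1 1];                         % hypothetical sample with non-positive values
    offset = 1 - min(X);                         % shift that maps min(X) to index 1
    pmf = accumarray(X(:) + offset, 1) ./ numel(X);
    vals = (min(X):max(X))';                     % value that each entry of pmf refers to
    disp([vals pmf])                             % -1 is listed with probability 0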
    