How to draw a probability density function in MATLAB?

Solution 1

You can generate a discrete probability distribution for your integers using the function hist:

data = [1 2 3 3 4];           %# Sample data
xRange = 0:10;                %# Range of integers to compute a probability for
N = hist(data,xRange);        %# Bin the data
plot(xRange,N./numel(data));  %# Plot the probabilities for each integer
xlabel('Integer value');
ylabel('Probability');

The resulting plot (image not reproduced here) shows a probability of 0.2 at 1, 2, and 4, a peak of 0.4 at 3, and zero elsewhere.
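For sparse integer data, such as the [100 200 400 400 550] case raised in the comments, a stem plot may read better than a connected line. A minimal sketch, assuming the same hist-based binning as above; the variable name p is illustrative:

```matlab
data = [100 200 400 400 550];      % sample data taken from the comments
xRange = 0:600;                    % integers to compute a probability for
N = hist(data, xRange);            % count occurrences of each integer
p = N ./ numel(data);              % convert counts to probabilities
stem(xRange, p, 'Marker', 'none'); % vertical spikes instead of a connected line
xlabel('Integer value');
ylabel('Probability');
```

With this data the spikes have height 0.2 at 100, 200, and 550, and 0.4 at 400.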


UPDATE:

In newer versions of MATLAB the hist function is no longer recommended. Instead, you can use the histcounts function like so to produce the same figure as above:

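data = [1 2 3 3 4];
xRange = 0:10;                 %# Integer values, one unit-width bin per integer
N = histcounts(data, -0.5:1:10.5, 'Normalization', 'probability');  %# Bins centered on each integer
plot(xRange, N);               %# Plot against the integer values, not the bin indices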
xlabel('Integer value');
ylabel('Probability');
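As one of the comments below points out, dividing by numel(data) gives probabilities, which coincide with a density only when the bins have unit width. A minimal sketch for wider bins, dividing by the bin width as well so that the area under the estimate is 1. The bin width of 50 and the variable names are assumptions chosen for illustration:

```matlab
data = [100 200 400 400 550];          % sample data taken from the comments
binWidth = 50;                         % assumed bin width, for illustration only
edges = 0:binWidth:600;                % bin edges
centers = edges(1:end-1) + binWidth/2; % bin centers, used for plotting
counts = histcounts(data, edges);      % raw counts per bin
prob = counts ./ numel(data);          % probabilities: these sum to 1
dens = prob ./ binWidth;               % density: sum(dens) * binWidth is 1
plot(centers, dens);
xlabel('Value');
ylabel('Probability density');
```

The same density values should also be obtainable directly with histcounts(data, edges, 'Normalization', 'pdf').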

Solution 2

If you want a continuous distribution function, try this.

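x = [1 2 3 3 4];       % Sample data
subplot(2,1,1)
ksdensity(x)           % Kernel density estimate (Statistics and Machine Learning Toolbox)
axis([-4 8 0 0.4])

subplot(2,1,2)
cdfplot(x)             % Empirical CDF (Statistics and Machine Learning Toolbox)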
grid off
axis([-4 8 0 1])
title('')

The output (image not reproduced here) shows the kernel density estimate on the top and the cumulative distribution function on the bottom.

Solution 3

Type "ksdensity" in the MATLAB help and you will find the function that gives you a continuous form of the PDF. I guess this is exactly what you are looking for.
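A minimal sketch of calling ksdensity with output arguments, so that the estimate can be plotted and inspected manually. It assumes the Statistics and Machine Learning Toolbox is installed and relies on the default bandwidth and evaluation grid:

```matlab
x = [1 2 3 3 4];          % sample data
[f, xi] = ksdensity(x);   % f: density estimate, xi: points where it is evaluated
plot(xi, f);
xlabel('x');
ylabel('Estimated density');
area = trapz(xi, f);      % numerical area under the estimate, expected to be near 1
```

Checking the area is a quick sanity test that the curve behaves like a density rather than a raw count.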

Author: Haozhun

Updated on December 17, 2020

Comments

  • Haozhun
    Haozhun over 3 years
    x = [1 2 3 3 4]
    cdfplot(x)
    

    After Googling, I found that the above code draws a cumulative distribution function in MATLAB.
    Is there a simple way to draw a probability density function?

    To clarify: I need a graph with an evenly spaced x-axis, and I would prefer that it not look like a bar graph (I will have millions of integers).
    Update: my data are integers, but they actually represent time. I expect several very high peaks at exactly the same values, while the other values should look as if they were not discrete. I'm starting to wonder whether the data are inherently discrete at all. A CDF would definitely work, but a PDF seems more complicated than I anticipated.

  • abcd
    abcd about 13 years
    @gnovice: Just a minor point: in general, you should divide by the area of the histogram, not the number of data points, to get a PDF. So the last line should read bar(X,N/trapz(X,N)). Since the bin points in this example are integers and unit spaced, both numel and trapz give the same answer, 4, but if this is not the case, they will differ.
  • gnovice
    gnovice about 13 years
    @yoda: You are correct, but Gene mentioned having to do this for integer values (i.e. a discrete probability distribution) so I thought I'd keep it simple.
  • Haozhun
    Haozhun about 13 years
    Thank you for your answer. I've got one more question, gnovice: @yoda's comment raised a concern for me. Will this still work correctly if x=[100 200 400 400 550]?
  • Haozhun
    Haozhun about 13 years
    I'll try both on my actual data. Thank you all!
  • abcd
    abcd about 13 years
    @Gene: Yes it will. I'm sorry if my comment confused you, but to see what I meant, you could take a look at my answer to an earlier question on normalizing histograms. If you run the code in there, it will illustrate the point I was trying to make. If all you have are discrete integers, then you'll be fine with dividing by numel. In either case, trapz will give you the correct answer.
  • gnovice
    gnovice about 13 years
    @Gene: If you had data = [100 200 400 400 550]; and specified a range of integers like xRange = 0:600;, you would get a plot that was mostly 0 except for spikes of 0.2 when x equals 100, 200, and 550 and a spike of 0.4 when x equals 400. As an alternative way to display your data, you may want to try a STEM plot instead of a regular line plot. It may look better.
  • Haozhun
    Haozhun about 13 years
    @yoda and gnovice: My data are integers, but they actually represent time. I expect several very high peaks at exactly the same values, while the other values should look as if they were not discrete. I'm starting to wonder whether the data are inherently discrete at all. A CDF would definitely work, but a PDF seems more complicated than I anticipated. Do you have any ideas?
  • mwoua
    mwoua almost 11 years
    @gnovice: It has been a long time since you answered this question, but what should I do if the values on the x-axis are not integers? Thanks a lot :) Also, cdfplot and ksdensity don't work in my version of MATLAB.
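For non-integer data, and for installations where cdfplot and ksdensity are unavailable (both belong to the Statistics and Machine Learning Toolbox), a rough sketch using only base MATLAB functions. The sample values and the bin count of 5 are assumptions for illustration. hist is used because it exists in older releases; in newer releases histcounts is the recommended replacement.

```matlab
data = [0.3 1.7 2.2 2.2 3.9];          % hypothetical non-integer sample
nBins = 5;                             % assumed number of bins
[counts, centers] = hist(data, nBins); % equal-width bins between min and max
binWidth = centers(2) - centers(1);    % width of each bin
dens = counts ./ (numel(data) * binWidth);  % density: bar areas sum to 1

subplot(2,1,1)
plot(centers, dens);                   % histogram-based density estimate
ylabel('Density');

subplot(2,1,2)
xs = sort(data);                       % empirical CDF: sorted values ...
F = (1:numel(xs)) / numel(xs);         % ... against cumulative fractions
stairs(xs, F);
ylabel('Cumulative probability');
```

The bin count controls how smooth the density looks: too few bins hide the peaks, while too many make the estimate noisy.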