Why does scipy.norm.pdf sometimes give PDF > 1? How to correct it?

It's not a bug, and it's not an incorrect result either. The value of a probability density function at a specific point is not a probability; it measures how densely the distribution is concentrated around that value. For a continuous random variable, the probability of any single point is zero. Instead of P(X = x), we compute the probability between two points, P(x1 < X < x2), which equals the area under the density curve between x1 and x2. A density value can very well be above 1, and it can even grow without bound. For a normal distribution, the peak density is 1/(σ√(2π)), so with σ = 0.2 the curve tops out at about 1.99, while the total area under it is still 1.
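To make the distinction concrete, here is a minimal sketch using the mean and standard deviation from the question (1.075 and 0.2). The interval endpoints 0.95 and 1.2, the narrow scale 0.001, and the variable names are illustrative assumptions, not part of the original post.

```python
from scipy import stats
from scipy.integrate import quad

mu, sigma = 1.075, 0.2  # mean and standard deviation from the question

# Density at the mean equals 1 / (sigma * sqrt(2 * pi)); it exceeds 1
# whenever sigma is smaller than roughly 0.399.
peak = stats.norm.pdf(mu, loc=mu, scale=sigma)

# The area under the density is still 1. Integrating over mu +/- 10 sigma
# captures essentially all of the mass.
area, _ = quad(stats.norm.pdf, mu - 10 * sigma, mu + 10 * sigma, args=(mu, sigma))

# An actual probability is an area, computed as a difference of CDF values.
# The endpoints 0.95 and 1.2 are arbitrary illustrative choices.
p_interval = stats.norm.cdf(1.2, mu, sigma) - stats.norm.cdf(0.95, mu, sigma)

# A narrower distribution has a taller peak: the density grows as sigma shrinks.
narrow_peak = stats.norm.pdf(0.0, loc=0.0, scale=0.001)

print(f"peak density: {peak:.4f}")
print(f"area under curve: {area:.6f}")
print(f"P(0.95 < X < 1.2): {p_interval:.4f}")
print(f"peak density with sigma = 0.001: {narrow_peak:.1f}")
```

The peak density is close to 2, yet the probability of any interval, being an area, stays between 0 and 1. Nothing needs to be corrected in the plot; to get probabilities, use `stats.norm.cdf` differences rather than reading values off the PDF.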

Author: Ébe Isaac

PhD in Machine Learning. My research interests are anomaly detection, biometrics, gait analysis, and data science. Profiles: ResearchGate LinkedIn

Updated on April 26, 2020

Comments

  • Ébe Isaac
    Ébe Isaac about 4 years

    Given mean and variance of a Gaussian (normal) random variable, I would like to compute its probability density function (PDF).


    I referred to this post: Calculate probability in normal distribution given mean, std in Python,

    Also the scipy docs: scipy.stats.norm

    But when I plot the PDF curve, the probability exceeds 1! Refer to this minimal working example:

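    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats as stats

    # Evaluate the normal PDF (mean 1.075, standard deviation 0.2) on a grid
    x = np.linspace(0.3, 1.75, 1000)
    plt.plot(x, stats.norm.pdf(x, 1.075, 0.2))
    plt.show()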
    

    This is what I get:

    [Figure: Gaussian PDF curve]

    How is it even possible to have a 200% probability of getting the mean, 1.075? Am I misinterpreting something here? Is there any way to correct this?

  • Severin Pappadeux
    Severin Pappadeux almost 8 years
    @ÉbeIsaac To add a point to the answer: the INTEGRAL of the PDF over its whole domain is equal to 1. The PDF itself may be above 1, below 1, or 0; it cannot be negative, of course.
  • AruniRC
    AruniRC almost 5 years
    As a general point, I think most introductory (college level) probability and statistics textbooks do not discuss these issues, and without some exposure to real analysis/measure/Riemann-sums it is not easy to develop an intuition. I found this to be a painless intro: statsathome.com/2017/06/26/…