Using the gaussian probability density function in C++

c++ distribution probability gaussian normal-distribution

27,532

Solution 1

yes. boost::random has gaussian distribution.

See, for example, this question: How to use boost normal distribution classes?

As an alternative, there's a standard way of converting two uniformly distributed random numbers into two normally distributed numbers.

See, e.g. this question: Generate random numbers following a normal distribution in C/C++

In response to your last edit (note that the question is completely different as edited, hence my answer to an original one is irrelevant). I think you'd better off first formulating for yourself what exactly do you mean to mean by "modelling far or not far using a gaussian distribution". Then reformulate that understanding in math terms and only then start programming. As it stands, I think the problem is underspecified.

Solution 2

Technically,

float pdf_gaussian = ( 1 / ( s * sqrt(2*M_PI) ) ) * exp( -0.5 * pow( (x-m)/s, 2.0 ) );

is not incorrect, but can be improved.

First, 1 / sqrt(2 Pi) can be precomputed, and using pow with integers is not a good idea: it may use exp(2 * log x) or a routine specialized for floating point exponents instead of simply x * x.

Example better code:

float normal_pdf(float x, float m, float s)
{
    static const float inv_sqrt_2pi = 0.3989422804014327;
    float a = (x - m) / s;

    return inv_sqrt_2pi / s * std::exp(-0.5f * a * a);
}

You may want to make this a template instead of using float:

template <typename T>
T normal_pdf(T x, T m, T s)
{
    static const T inv_sqrt_2pi = 0.3989422804014327;
    T a = (x - m) / s;

    return inv_sqrt_2pi / s * std::exp(-T(0.5) * a * a);
}

this allows you to use normal_pdf on double arguments also (it is not that much more generic though). There are caveats with the last code, namely that you have to beware not using it with integers (there are workarounds, but this makes the routine more verbose).

Solution 3

Use Box-Muller transform. This creates values with a normal/gaussian distribution.

http://en.wikipedia.org/wiki/Normal_distribution#Generating_values_from_normal_distribution

http://en.wikipedia.org/wiki/Box_Muller_transform

It's not very complex to code using math libraries.

eg.

Generate 2 uniform numbers, use them to get two normally distributed numbers. Then Return one and save the other so that you have it for your 'next' request of a random number.

27,532

Author by

shn

Updated on April 23, 2020

Comments

shn about 4 years
First, is this the correct C++ representation of the pdf gaussian function ?
```
float pdf_gaussian = ( 1 / ( s * sqrt(2*M_PI) ) ) * exp( -0.5 * pow( (x-m)/s, 2.0 ) );
```
Second, does it make sense of we do something like this ?
```
if(pdf_gaussian < uniform_random())
   do something
else
   do other thing
```
EDIT: An example of what exactly are you trying to achieve:

Say I have a data called Y1. Then a new data called Xi arrive. I want to see if I should associated Xi to Y1 or if I should keep Xi as a new data data that will be called Y2. This is based on the distance between the new data Xi and the existing data Y1. If Xi is "far" from Y1 then Xi will not be associated to Y1, otherwise if it is "not far", it will be associated to Y1. Now I want to model this "far" or "not far" using a gaussian probability based on the mean and stdeviation of distances between Y and the data that where already associated to Y in the past.
shn almost 12 years

but not the probability density function that take directly as input the mean and standard deviation and the value x ?
shn almost 12 years

Well maybe it is better to implement directly the equation as described here en.wikipedia.org/wiki/Normal_distribution instead of using boost ?
ev-br almost 12 years

do you need a normally distributed random number or the pdf itself? In the latter case, of course, type in the formula for the distribution and be done with it
shn almost 12 years

I need the pdf function, so is this the correct C++ representation of the equation ? ( 1 / ( s * sqrt(2*M_PI) ) ) * exp( -0.5 * pow( (x-m)/s, 2.0 ) );
ev-br almost 12 years

then (a) your title is misleading. (b) why don't you code this one-liner in, and compare a result for some parameters with your calculator's result.
shn almost 12 years

Does it make sense if we do something like: float pdf = ( 1 / ( s * sqrt(2*M_PI) ) ) * exp( -0.5 * pow( (x-m)/s, 2.0 ) ); if(pdf < uniform_random()) {do something} else {do other thing} ?
ev-br almost 12 years

what exactly are you trying to achieve?
shn almost 12 years

well say I have a data called Y1. Then a new data called Xi arrive. I want to see if I should associated Xi to Y1 or if I should keep Xi as a new data data that will be called Y2. This is based on the distance between the new data Xi and the existing data Y1. If Xi is "far" from Y1 then Xi will not be associated to Y1, otherwise if it is "not far", it will be associated to Y1. Now I want to model this "far" or "not far" using a gaussian probability based on the mean and stdeviation of distances between Y and the data that where already associated to Y in the past.
shn almost 12 years

but I don't understand why the gaussian pdf function returns some small values between 0 and 1, compared to the uniform distribution ? did you see the example that I gave about: if(pdf_gaussian < uniform_random()) do something else do other thing Or should I compare pdf_gaussian with normal_random() instead of uniform_random() ?