gaussian fit with scipy.optimize.curve_fit in python with wrong results

21,912

Your problem is with the initial parameters of the curve_fit. By default, if no other information is given, it will start with an array of 1, but this obviously lead to a radically wrong result. This can be corrected simply by giving a reasonable starting vector. To do this, I start from the estimated mean and standard deviation of your dataset

#estimate mean and standard deviation
meam = sum(x * y)
sigma = sum(y * (x - m)**2)
#do the fit!
popt, pcov = curve_fit(gauss_function, x, y, p0 = [1, mean, sigma])
#plot the fit results
plot(x,gauss_function(x, *popt))
#confront with the given data
plot(x,y,'ok')

This will perfectly approximate your results. Remember that curve fitting in general cannot work unless you start from a good point (inside the convergence basin, to be clear), and this doesn't depend on the implementation. Never do blind fit when you can use your knowledge!

Share:
21,912
Álvaro
Author by

Álvaro

Updated on July 09, 2022

Comments

  • Álvaro
    Álvaro almost 2 years

    I am having some trouble to fit a gaussian to data. I think the problem is that most of the elements are close to zero, and there not many points to actually be fitted. But in any case, I think they make a good dataset to fit, and I don't get what is confussing python. Here is the program, I have also added a line to plot the data so you can see what I am trying to fit

    #Gaussian function
    def gauss_function(x, a, x0, sigma):
        return a*np.exp(-(x-x0)**2/(2*sigma**2))
    
    # program
    from scipy.optimize import curve_fit
    x = np.arange(0,21.,0.2)
    # sorry about these data!
    y = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.2888599818864958e-275, 1.0099964933708256e-225, 4.9869496866403137e-184, 4.4182929795060327e-149, 7.2953754336628778e-120, 1.6214815763354974e-95, 2.5845990267696154e-75, 1.2195550372375896e-58, 5.6756631456872126e-45, 7.2520963306599953e-34, 6.0926453402093181e-25, 7.1075523112494745e-18, 2.1895584709541657e-12, 3.1040093615952226e-08, 3.2818874974043519e-05, 0.0039462011337049593, 0.077653596114448178, 0.33645159419151383, 0.40139213808285212, 0.15616093582013874, 0.0228751827752081, 0.0014423440677009125, 4.4400754532288282e-05, 7.4939123408714068e-07, 7.698340466102054e-09, 5.2805658851032628e-11, 2.6233358880470556e-13, 1.0131613609937094e-15, 3.234727006243684e-18, 9.0031014316344088e-21, 2.2867065482392331e-23, 5.5126221075296919e-26, 1.3045106781768978e-28, 3.1185031969890313e-31, 7.7170036365830092e-34, 2.0179753504732056e-36, 5.6739187799428708e-39, 1.7403776988666581e-41, 5.8939645426573027e-44, 2.2255784749636281e-46, 9.4448944519959299e-49, 4.5331936383388069e-51, 2.4727435506007072e-53, 1.5385048936078214e-55, 1.094651071873419e-57, 8.9211199390945735e-60, 8.3347561634783632e-62, 8.928140776588251e-64, 1.0960564546383266e-65, 1.5406342485015278e-67, 2.4760905399114866e-69, 4.5423744881977258e-71, 9.4921949220625905e-73, 2.2543765002199549e-74, 6.0698995872666723e-76, 1.8478996852922248e-77, 6.3431644488676084e-79, 0.0, 0.0, 0.0, 0.0]
    
    plot(x,y) #Plot the curve, the gaussian is quite clear
    plot(x,y,'ok') #Overplot the dots
    
    # Try to fit the result
    popt, pcov = curve_fit(gauss_function, x, y)
    

    The problem is that the results for popt is

    print popt
    array([  7.39717176e-10,   1.00000000e+00,   1.00000000e+00])
    

    Any hint on why this could be happening?

    Thanks!

  • Ciprian Tomoiagă
    Ciprian Tomoiagă almost 7 years
    It works, but why ? Why do you start with mean = sum(x * y) ?