Fitting data using UnivariateSpline in scipy python


There are a few issues.

The first issue is the order of the x values. From the documentation for scipy.interpolate.UnivariateSpline we find:

x : (N,) array_like
    1-D array of independent input data. MUST BE INCREASING.

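Emphasis added by me. In the data you have given, x is in decreasing order, so both x and y must be reversed before fitting. To debug this it is useful to first fit a plain interpolating spline (InterpolatedUnivariateSpline), which has to pass through every data point, to make sure everything makes sense.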
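As a minimal sketch of that check, the data can be put in increasing order with np.argsort before building the interpolating spline. The names order, x_sorted, y_sorted and max_miss are only illustrative, and the arrays are the values from the question.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

# Data from the question, with x in decreasing order.
x = np.arange(13, 0, -1)
y = np.array([2.404070, 1.588134, 1.760112, 1.771360, 1.860087,
              1.955789, 1.910408, 1.655911, 1.778952, 2.624719,
              1.698099, 3.022607, 3.303135])

# argsort gives the permutation that puts x in increasing order;
# applying it to both arrays keeps each (x, y) pair together.
order = np.argsort(x)
x_sorted, y_sorted = x[order], y[order]

spline = InterpolatedUnivariateSpline(x_sorted, y_sorted)

# An interpolating spline should reproduce the data at the data points,
# so the largest miss is expected to be at the level of rounding error.
max_miss = np.max(np.abs(spline(x_sorted) - y_sorted))
print(max_miss)
```

Sorting with argsort also covers data that is shuffled rather than simply reversed, which a plain [::-1] would not fix.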

The second issue, and the one more directly relevant to your problem, relates to the s parameter. What does it do? Again from the documentation we find:

s : float or None, optional
    Positive smoothing factor used to choose the number of knots.  Number
    of knots will be increased until the smoothing condition is satisfied:

    sum((w[i]*(y[i]-s(x[i])))**2,axis=0) <= s

    If None (default), s=len(w) which should be a good value if 1/w[i] is
    an estimate of the standard deviation of y[i].  If 0, spline will
    interpolate through all data points.

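So s sets an upper bound on the weighted sum of squared residuals between the spline and the data: the smaller s is, the closer the fitted curve must come to the data points, in the least-squares sense. If we set the value very large, as with s=5e8 in your code, then the spline does not need to come near the data points at all.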
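To see the smoothing condition at work, here is a small sketch that fits the same data (already in increasing x order) with several values of s and reports the residual and the number of knots. It relies on the get_residual() and get_knots() methods of UnivariateSpline; the particular s values and the residuals dictionary are my own choices for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Data from the question, reversed so that x is increasing.
x = np.arange(1, 14)
y = np.array([3.303135, 3.022607, 1.698099, 2.624719, 1.778952,
              1.655911, 1.910408, 1.955789, 1.860087, 1.771360,
              1.760112, 1.588134, 2.404070])

residuals = {}
for s in (0.0, 0.1, 1.0, 5e8):
    spline = UnivariateSpline(x, y, s=s)
    # get_residual() returns the weighted sum of squared residuals,
    # i.e. the left-hand side of the smoothing condition.
    residuals[s] = spline.get_residual()
    n_knots = len(spline.get_knots())
    print(f"s={s:g}  residual={residuals[s]:.4g}  knots={n_knots}")
```

With s=0 the residual should be essentially zero (pure interpolation), while a huge s lets the fit stop at the fewest knots, which is why the curve in the question ignores most of the structure in the data.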

As a complete example, consider the following:

import numpy as np
import matplotlib.pyplot as plt
import scipy.interpolate as inter

# Data as given in the question: x is in decreasing order.
x = np.array([13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1])
y = np.array([2.404070, 1.588134, 1.760112, 1.771360, 1.860087,
              1.955789, 1.910408, 1.655911, 1.778952, 2.624719,
              1.698099, 3.022607, 3.303135])
xx = np.arange(1, 13.01, 0.1)

# Interpolating spline on the data as given (wrong order). Older SciPy
# versions accept this silently; newer ones may raise ValueError instead.
try:
    s1 = inter.InterpolatedUnivariateSpline(x, y)
except ValueError:
    s1 = None
# Same spline with both arrays reversed so that x is increasing.
s1rev = inter.InterpolatedUnivariateSpline(x[::-1], y[::-1])
# Use a smallish value for s
s2 = inter.UnivariateSpline(x[::-1], y[::-1], s=0.1)
s2crazy = inter.UnivariateSpline(x[::-1], y[::-1], s=5e8)

plt.plot(x, y, 'bo', label='Data')
if s1 is not None:
    plt.plot(xx, s1(xx), 'k-', label='Spline, wrong order')
plt.plot(xx, s1rev(xx), 'k--', label='Spline, correct order')
plt.plot(xx, s2(xx), 'r-', label='Spline, fit')
# Uncomment to get the poor fit.
#plt.plot(xx, s2crazy(xx), 'r--', label='Spline, fit, s=5e8')
plt.minorticks_on()
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.show()

[Figure: result from the example code]

Author by

Prakhar Mehrotra

Analyst @ Twitter

Updated on August 06, 2022

Comments

  • Prakhar Mehrotra over 1 year

    I have experimental data to which I am trying to fit a curve using the UnivariateSpline function in scipy. The data looks like:

     x         y
    13    2.404070
    12    1.588134
    11    1.760112
    10    1.771360
    09    1.860087
    08    1.955789
    07    1.910408
    06    1.655911
    05    1.778952
    04    2.624719
    03    1.698099
    02    3.022607
    01    3.303135    
    

    Here is what I am doing:

    import matplotlib.pyplot as plt
    from scipy import interpolate
    # x and y hold the two columns of the table above, in the order listed
    yinterp = interpolate.UnivariateSpline(x, y, s=5e8)(x)
    plt.plot(x, y, 'bo', label='Original')
    plt.plot(x, yinterp, 'r', label='Interpolated')
    plt.show()
    

    That's how it looks:

    [Figure: curve fit]

    I was wondering if anyone has thoughts on other curve fitting options that scipy might have. I am relatively new to scipy.

    Thanks!

  • Prakhar Mehrotra almost 11 years
    Thanks for explaining the meaning of the smoothing parameter s, and for pointing out the incorrect order. It works fine!
  • Prakhar Mehrotra almost 11 years
    If I impose the condition that the spline needs to be monotonically decreasing, does UnivariateSpline let me do that? Thanks!
  • Craig J Copi almost 11 years
    @PrakharMehrotra I don't understand the question. The implementation of the spline requires that x be increasing. As done in the example, it is simple to reverse arrays when they are in the opposite of the required order.
  • diegus over 8 years
    I have tried s=0, and then the (Spline, fit) curve coincides with the (Spline, correct order) curve, i.e. both splines pass through the points, whereas with s=0.1, as in your example, the fit does not seem right. So what is the point of using s > 0?
  • Sergio over 8 years
    @PrakharMehrotra Splines aren't guaranteed to be monotonic. You will need something like a Piecewise Cubic Hermite Interpolating Polynomial (PCHIP).
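For the monotonicity question raised in the comments, a rough sketch using scipy.interpolate.PchipInterpolator might look like the following. The data here is made up for illustration (the data in the question is not monotonic), and note that PCHIP interpolates through every point rather than smoothing, so it is not a drop-in replacement for UnivariateSpline with s > 0.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Illustrative, strictly decreasing data (hypothetical values, not the
# data from the question, which is not monotonic).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([3.3, 3.0, 2.6, 1.9, 1.8, 1.6])

pchip = PchipInterpolator(x, y)

# Evaluate on a fine grid and check that the curve never rises,
# allowing a tiny tolerance for floating-point rounding.
xx = np.linspace(x[0], x[-1], 501)
yy = pchip(xx)
is_decreasing = bool(np.all(np.diff(yy) <= 1e-12))
print(is_decreasing)
```

If the measurements are noisy and not monotonic to begin with, they would need to be smoothed or otherwise made monotonic first, since PCHIP only preserves the monotonicity that is already present in the data.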