SciPy interpolation of large matrix

10,926

Solution 1

As your data is on a grid, you can use RectBivariateSpline.

To work with complex numbers, you can interpolate data.real and data.imag separately (the FITPACK routines IIRC don't handle complex data).

Solution 2

edit: Whoops. Just realized the OP suggested this solution in the question!

I don't know why the interpolation routines take so much time and memory to find the knots of structured data, but since you are only using a small parts of the full grid, you could break up your interpolation into patches to make things more efficient.

from scipy import interpolate
import numpy as np

def my_interp(X, Y, Z, x, y, spn=3):
    xs,ys = map(np.array,(x,y))
    z = np.zeros(xs.shape)
    for i,(x,y) in enumerate(zip(xs,ys)):
        # get the indices of the nearest x,y
        xi = np.argmin(np.abs(X[0,:]-x))
        yi = np.argmin(np.abs(Y[:,0]-y))
        xlo = max(xi-spn, 0)
        ylo = max(yi-spn, 0)
        xhi = min(xi+spn, X[0,:].size)
        yhi = min(yi+spn, Y[:,0].size)
        # make slices of X,Y,Z that are only a few items wide
        nX = X[xlo:xhi, ylo:yhi]
        nY = Y[xlo:xhi, ylo:yhi]
        nZ = Z[xlo:xhi, ylo:yhi]
        intp = interpolate.interp2d(nX, nY, nZ)
        z[i] = intp(x,y)[0]
    return z

N = 1000
X,Y = np.meshgrid(np.arange(N), np.arange(N))
Z = np.random.random((N, N))

print my_interp(X, Y, Z, [13.2, 999.9], [0.01, 45.3])

Solution 3

Building the scipy interp2d interpolator on the relatively big datasets may take a long time when initialization happens via passing a meshed set of coordinates. In the case when data is on a rectangular grid you can consider another way to initialize interp2d:

a)

from scipy.interpolate import interp2d    
x = [0,1,2]
y = [0,3]
z = [[1,2,3], [4,5,6]]
i = interp2d(x, y, z)
i(0, 0)[0]

instead of b)

from scipy.interpolate import interp2d 
x = [0, 1, 2, 0, 1, 2]
y = [0, 0, 0, 3, 3, 3]
z = [1, 2, 3, 4, 5, 6]
i = interp2d(x, y, z)
i(0, 0)[0]

It is accounted for in interp2d implementation. Case a) starts significantly faster, but it is suitable for rectangular grids only. I had a performance boost from 6 mins to 3 secs after applying this trick on the grid with 227000 points.

RectBivariateSpline also works well.

Share:
10,926
Joma
Author by

Joma

SOreadytohelp

Updated on June 04, 2022

Comments

  • Joma
    Joma almost 2 years

    I have a ndarray (Z) with some 500000 elements on a rectangular grid (X, Y).

    Now I want to interpolate values at some 100 locations in x,y which are not necessarily on the grid.

    I have some code working in Matlab:

    data = interp2(X,Y,Z, x,y);
    

    However, when I try to use the same approach with scipy.interpolate I get various errors depending on the method. For example interp2d fails with MemoryError if i specify kind = 'linear' and "OverflowError: Too many data points to interpolate" if I specify kind='cubic'. I also tried Rbf and bisplev but they also fail.

    So the question is: Is there an interpolation function which allows for interpolations of large matrices? Is there another solution to the problem? (Or do I have to code a function which picks the suitable area around the points to interpolate and calls then interp2d?)

    In addition: How to do this with complex numbers?