What is the algorithm used to interpolate in Matlab's imresize function?

12,681

Solution 1

As you can see, in your example, each corner point is one of your original input values.

The intermediate values are derived via linear interpolation in each direction. So for instance, to calculate b(3,2):

  • b(1,2) is 1/3 of the way between b(1,1) and b(1,4). So:

    b(1,2) = (1/3)*b(1,4) + (2/3)*b(1,1)
    
  • b(4,2) is 1/3 of the way between b(4,1) and b(4,4). So:

    b(4,2) = (1/3)*b(4,4) + (2/3)*b(4,1)
    
  • b(3,2) is 2/3 of the way between b(1,2) and b(4,2). So:

    b(3,2) = (2/3)*b(4,2) + (1/3)*b(1,2)
    

Solution 2

The following code shows how to perform bilinear interpolation using INTERP2:

A = [1 2; 3 4];
SCALE = 2;

xi = linspace(1,size(A,2),SCALE*size(A,2));  %# interpolated horizontal positions
yi = linspace(1,size(A,1),SCALE*size(A,1));  %# interpolated vertical positions
[X Y] = meshgrid(1:size(A,2),1:size(A,1));   %# pixels X-/Y-coords
[XI YI] = meshgrid(xi,yi);                   %# interpolated pixels X-/Y-coords
B = interp2(X,Y,A, XI,YI, '*linear');        %# interp values at these positions

the result agrees with your Octave code output:

B =
            1       1.3333       1.6667            2
       1.6667            2       2.3333       2.6667
       2.3333       2.6667            3       3.3333
            3       3.3333       3.6667            4

I should mention that I'm getting different results between MATLAB and Octave IMRESIZE output. For example, this is what I get when I execute the following in MATLAB on the matrix A=[1 2; 3 4]:

>> B = imresize([1 2; 3 4], 2, 'bilinear')
B =
            1         1.25         1.75            2
          1.5         1.75         2.25          2.5
          2.5         2.75         3.25          3.5
            3         3.25         3.75            4

which suggests that MATLAB's implementation is doing something extra... Unfortunately it's not easy to read the IMRESIZE source code, especially since at some point it calls a MEX-compiled function (with no source code form available).

As a side note, there seems to be a older version of this function as well: IMRESIZE_OLD (purely implemented in m-code). From what I could understand, it performs some sort of affine transformation on the image. Perhaps someone more familiar with the technique could shed some light on the subject...


EDIT (2021)

Please refer to John's answer for a more thorough explanation.

Solution 3

TL;DR

The results are wrong, because there was a bug in GNU Octave's imresize. It has been fixed recently for bilinear interpolation (bug report, commit). Scroll down to "Interpolation of pixels" for the correct explanation how images are interpolated.

Interpolation of samples

Let's start with linear interpolation of samples:

  0   1/3  2/3   1
  |    |    |    |
a=1----+----+----2=b

You blend over from a to b by using

f(x) = (1 - x) * a + x * b.

Some examples:

  • f(0) = (1 - 0) * a + 0 * b = a = 1
  • f(1/3) = (1 - 1/3) * a + 1/3 * b = 2/3 + 2/3 = 4/3 = 1.3333
  • f(1/2) = (1 - 1/2) * a + 1/2 * b = (a + b) / 2 = 1.5
  • f(2/3) = (1 - 2/3) * a + 2/3 * b = 1/3 + 4/3 = 5/3 = 1.6667
  • f(1) = (1 - 1) * a + 1 * b = b = 2

This corresponds to the first line of your first example. In bilinear interpolation the simple linear interpolation is used in x or y direction. In general simple linear interpolation is not used in diagonal or any other direction (your first example is a degenerate case).

       0   1/3        1
       |    |    |    |
 0   a=1---f(x)--+----2=b
       |    |    |    |
      -+----+----+----+-
       |    |    |    |
2/3   -+---???---+----+-
       |    |    |    |
 1   c=3---g(x)--+----4=d
       |    |    |    |

How are the other points calculated then? We use the simple linear interpolation for top and bottom row in x direction and then interpolate the results in y direction:

  • In x-direction, top row: f(x) = (1 - x) * a + x * b, example: f(1/3) = 4/3 = 1.3333
  • In x-direction, bottom row: g(x) = (1 - x) * c + x * d, example: g(1/3) = 10/3 = 3.3333
  • In y-direction, interpolation column: h(y) = (1 - y) * f(x) + y * g(x), example: h(2/3) = 8/3 = 2.6667

Looking at the last equation, we could also put in f(x) and g(x) and get:

h(x, y) = (1 - x) * (1 - y) * a + x * (1 - y) * b + (1 - x) * y * c + x * y * d

This is what you got.

In the second example the points are just a bit different, because you transform from 4 points to 6 in each direction:

old: 0         1         2         3 (sample grid)
     |         |         |         |
     +-----+---+-+-----+-+---+-----+
     |     |     |     |     |     |
new: 0    3/5   6/5   9/5  12/5    3 (interpolation grid)

This is valid for x and y direction in your second example. To use the formula above, you must map each square to [0, 1] x [0, 1].

That's theory. Octave uses interp2 for bilinear interpolation internally. To use interp2 you specify the matrix with samples and the grid that defines the interpolation points:

A = [1, 2;
     3, 4];
xi = linspace(1, size(A, 2), 4);
yi = linspace(1, size(A, 1), 4)';
B = interp2(A, xi, yi)

This gives the results you got, but they are wrong!

Interpolation of pixels

The basics for bilinear interpolation as explained above are still valid, but the interpolation grid is wrong. This is because an image does not consist of sample points, but of pixels. Pixels are areas represented by its mean value. So actually the pixels of the image look like this:

    0.5        1        1.5        2        2.5
0.5  +-------------------+-------------------+
     |                   |                   |
     |                   |                   |
     |                   |                   |
 1   |         o         |         o         |
     |                   |                   |
     |                   |                   |
     |                   |                   |
1.5  +-------------------+-------------------+
     |                   |                   |
     |                   |                   |
     |                   |                   |
 2   |         o         |         o         |
     |                   |                   |
     |                   |                   |
     |                   |                   |
2.5  +-------------------+-------------------+

So the top left pixel's area is [0.5, 1.5] x [0.5, 1.5] with its center at (1, 1). And what you want by upscaling with 2 are the following new pixels (in coordinate space of the old grid, since the image covers still the same area):

    0.5  0.75  1   1.25 1.5  1.75  2   2.25 2.5
0.5  +---------+---------+---------+---------+
     |         |         |         |         |
0.75 |    x    |    x    |    x    |    x    |
     |         |         |         |         |
 1   +---------o---------+---------o---------+
     |         |         |         |         |
1.25 |    x    |    x    |    x    |    x    |
     |         |         |         |         |
1.5  +---------+---------+---------+---------+
     |         |         |         |         |
1.75 |    x    |    x    |    x    |    x    |
     |         |         |         |         |
 2   +---------o---------+---------o---------+
     |         |         |         |         |
2.25 |    x    |    x    |    x    |    x    |
     |         |         |         |         |
2.5  +---------+---------+---------+---------+

Now you take the new centers x as interpolation grid and the old centers o as sample grid. You see, that the new border pixels actually need extrapolation. We assume that it extrapolates constant, so we could pad the array to make an interpolation again or limit the interpolation grid. As code using interp2 you can do:

A = [1, 2;
     3, 4];
xi = linspace(0.75, 2.25, 4);
yi = linspace(0.75, 2.25, 4)';
xi(xi < 1) = 1; xi(xi > 2) = 2;
yi(yi < 1) = 1; yi(yi > 2) = 2;
B = interp2(A, xi, yi)

Here is a more general solution (only valid for integer output sizes), inspired by Amro's comments below his answer. If you allow scale factors that result in floating point output size.

On non-integer output sizes the number of new pixels will be such that the last pixel overlaps. So for example with a scale factor of 5/4 = 1.25 the pixel size will be 1 / (5/4) = 4/5 = 0.8. Hence, scaling a 2x2 image with 1.25 yields a 3x3 image. The old pixel centers (sample grid) are at 1 and 2 and the new pixel centers (interpolation grid) are at 0.9, 1.7 and 2.5.

    0.5                 1.5                 2.5
     |         1         |         2         |
old: +---------o---------+---------o---------+
new: +-------x-------+-------x-------+-------x-------+
     |      0.9      |      1.7      |      2.5      |
    0.5             1.3             2.1             2.9

Here is some code to show this:

img = [1, 2;
       3, 4];

% make interpolation grid
scale = 1.25
pixel_size = 1 / scale
out_size = ceil(size(img) / pixel_size)
xi = 0.5 + pixel_size / 2 + (0:out_size(1)-1) / scale
yi = 0.5 + pixel_size / 2 + (0:out_size(2)-1) / scale

% limit interpolation grid to sample grid bounds
xi(xi < 1) = 1;   xi(xi > size(img, 2)) = size(img, 2)
yi(yi < 1) = 1;   yi(yi > size(img, 1)) = size(img, 1)

% interpolate
scaled_interp = interp2(img, xi, yi', 'linear')

% Octave's imresize does not support anti-aliasing yet
scaled_resize_octave = imresize(img, scale, 'bilinear')
% Matlab's imresize uses anti-aliasing for downscaling, switch off to keep is simple
scaled_resize_matlab = imresize(img, scale, 'bilinear', 'Antialiasing', false)

% yields:
%    1.0000    1.7000    2.0000
%    2.4000    3.1000    3.4000
%    3.0000    3.7000    4.0000

That's all for resizing with bilinear interpolation. For bicubic interpolation Matlab uses symmetric padding and a convolutional algorithm that weights a 4x4 neighborhood. Octave behaves differently (patch incoming).

Solution 4

I adapted MATLAB's imresize function for Java:

import java.util.ArrayList;
import java.util.List;

public class MatlabResize {
    private static final double TRIANGLE_KERNEL_WIDTH = 2;

    public static double[][] resizeMatlab(double[][] data, int out_y, int out_x) {
        double scale_x = ((double)out_x)/data[0].length;
        double scale_y = ((double)out_y)/data.length;

        double[][][] weights_indizes = contribution(data.length, out_y, scale_y, TRIANGLE_KERNEL_WIDTH);
        double[][] weights = weights_indizes[0];
        double[][] indices = weights_indizes[1];

        final double[][] result = new double[out_y][data[0].length];
        double value = 0;

        for (int p=0; p<result[0].length; p++) {
            for (int i=0; i<weights.length; i++) {
                value = 0;

                for (int j=0; j<indices[0].length; j++) {
                    value += weights[i][j] * data[(int)indices[i][j]][p];
                }

                result[i][p] = value;
            }
        }

        weights_indizes = contribution(data[0].length, out_x, scale_x, TRIANGLE_KERNEL_WIDTH);
        weights = weights_indizes[0];
        indices = weights_indizes[1];

        final double[][] result2 = new double[result.length][out_x];
        for (int p=0; p<result.length; p++) {
            for (int i=0; i<weights.length; i++) {
                value = 0;

                for (int j=0; j<indices[0].length; j++) {
                    value += weights[i][j] * result[p][(int)indices[i][j]];
                }

                result2[p][i] = value;
            }
        }

        return result2;
    }

    public static double[][] resizeMatlab(double[][] data, double scale) {
        int out_x = (int)Math.ceil(data[0].length * scale);
        int out_y = (int)Math.ceil(data.length * scale);

        double[][][] weights_indizes = contribution(data.length, out_y, scale, TRIANGLE_KERNEL_WIDTH);
        double[][] weights = weights_indizes[0];
        double[][] indices = weights_indizes[1];

        final double[][] result = new double[out_y][data[0].length];
        double value = 0;

        for (int p=0; p<result[0].length; p++) {
            for (int i=0; i<weights.length; i++) {
                value = 0;

                for (int j=0; j<indices[0].length; j++) {
                    value += weights[i][j] * data[(int)indices[i][j]][p];
                }

                result[i][p] = value;
            }
        }

        weights_indizes = contribution(data[0].length, out_x, scale, TRIANGLE_KERNEL_WIDTH);
        weights = weights_indizes[0];
        indices = weights_indizes[1];

        final double[][] result2 = new double[result.length][out_x];
        for (int p=0; p<result.length; p++) {
            for (int i=0; i<weights.length; i++) {
                value = 0;

                for (int j=0; j<indices[0].length; j++) {
                    value += weights[i][j] * result[p][(int)indices[i][j]];
                }

                result2[p][i] = value;
            }
        }

        return result2;
    }


    private static double[][][] contribution(int length, int output_size, double scale, double kernel_width) {
        if (scale < 1.0) {
            kernel_width = kernel_width/scale;
        }

        final double[] x = new double[output_size];
        for (int i=0; i<x.length; i++) {
            x[i] = i+1;
        }

        final double[] u = new double[output_size];
        for (int i=0; i<u.length; i++) {
            u[i] = x[i]/scale + 0.5*(1 - 1/scale);
        }

        final double[] left = new double[output_size];
        for (int i=0; i<left.length; i++) {
            left[i] = Math.floor(u[i] - kernel_width/2);
        }

        int P = (int)Math.ceil(kernel_width) + 2;

        final double[][] indices = new double[left.length][P];
        for (int i=0; i<left.length; i++) {
            for (int j=0; j<=P-1; j++) {
                indices[i][j] = left[i] + j;
            }
        }

        double[][] weights = new double[u.length][indices[0].length];
        for (int i=0; i<u.length; i++) {
            for (int j=0; j<indices[i].length; j++) {
                weights[i][j] = u[i] - indices[i][j];
            }
        }

        if (scale < 1.0) {
            weights = triangleAntiAliasing(weights, scale);
        } else {
            weights = triangle(weights);
        }

        double[] sum = Matlab.sum(weights, 2);
        for (int i=0; i<weights.length; i++) {
            for (int j=0; j<weights[i].length; j++) {
                weights[i][j] = weights[i][j] / sum[i];
            }
        }

        for (int i=0; i<indices.length; i++) {
            for (int j=0; j<indices[i].length; j++) {
                indices[i][j] = Math.min(Math.max(indices[i][j], 1.0), length);
            }
        }

        sum = Matlab.sum(weights, 1);
        int a = 0;

        final List<Integer> list = new ArrayList<Integer>();
        for (int i=0; i<sum.length; i++) {
            if (sum[i] != 0.0) {
                a++;
                list.add(i);
            }
        }

        final double[][][] result = new double[2][weights.length][a];
        for (int i=0; i<weights.length; i++) {
            for (int j=0; j<list.size(); j++) {
                result[0][i][j] = weights[i][list.get(j)];
            }
        }
        for (int i=0; i<indices.length; i++) {
            for (int j=0; j<list.size(); j++) {
                result[1][i][j] = indices[i][list.get(j)]-1; //java indices start by 0 and not by 1
            }
        }

        return result;
    }

    private static double[][] triangle(final double[][] x) {
        for (int i=0; i<x.length; i++) {
            for (int j=0; j<x[i].length; j++) {
                if (-1.0 <= x[i][j] && x[i][j] < 0.0) {
                    x[i][j] = x[i][j] + 1;
                } else if (0.0 <= x[i][j] &&  x[i][j] < 1.0) {
                    x[i][j] = 1 - x[i][j];
                } else {
                    x[i][j] = 0;
                }
            }
        }

        return x;
    }

    private static double[][] triangleAntiAliasing(final double[][] x, final double scale) {
        for (int i=0; i<x.length; i++) {
            for (int j=0; j<x[i].length; j++) {
                x[i][j] = x[i][j] * scale;
            }
        }

        for (int i=0; i<x.length; i++) {
            for (int j=0; j<x[i].length; j++) {
                if (-1.0 <= x[i][j] && x[i][j] < 0.0) {
                    x[i][j] = x[i][j] + 1;
                } else if (0.0 <= x[i][j] &&  x[i][j] < 1.0) {
                    x[i][j] = 1 - x[i][j];
                } else {
                    x[i][j] = 0;
                }
            }
        }

        for (int i=0; i<x.length; i++) {
            for (int j=0; j<x[i].length; j++) {
                x[i][j] = x[i][j] * scale;
            }
        }

        return x;
    }
}
Share:
12,681
goldenmean
Author by

goldenmean

struct descriptionOf { int elligent_Developer; short list_of_proven_skills; long list_of_ambitions; long long int erest_in_various_technologies; double effort_in_achieving_skills_and_ambitions; float ing_innovator; char of_a_hands_on_doer; }goldenmean; Software Developer with work experience in areas of Video/Image processing and codecs,DSP and multimedia Systems,on DSP/Multicore processor architecures, for devices and applications in Consumer Electronics, Communications industry. Programming languages: C,C++,Matlab/Octave,Python,DSP or RISC assembly languages.

Updated on June 05, 2022

Comments

  • goldenmean
    goldenmean almost 2 years

    I am using the Matlab/Octave imresize() function which resamples a given 2D array. I want to understand how a particular interpolation algorithm used in imresize works.

    (I am using octave on windows)

    e.g.

    A =  1 2 
         3 4
    

    is a 2D array. Then I use the command

    b=imresize(a,2,'linear'); 
    

    basically upsampling row and columns by 2.

    The output is

    1.0000   1.3333   1.6667   2.0000
    1.6667   2.0000   2.3333   2.6667
    2.3333   2.6667   3.0000   3.3333
    3.0000   3.3333   3.6667   4.0000
    

    I don't understand how this linear interpolation is working. It is said to use bilinear interpolation, but how does it pad the data at boundaries and how does it get the output that it is getting?

    Second example: For

    A = 
    
    1   2   3   4
    5   6   7   8
    0   1   2   3
    1   2   3   4
    

    how does imresize(a,1.5,'linear') give the following output?

    1.00000   1.60000   2.20000   2.80000   3.40000   4.00000
    3.40000   4.00000   4.60000   5.20000   5.80000   6.40000
    4.00000   4.60000   5.20000   5.80000   6.40000   7.00000
    1.00000   1.60000   2.20000   2.80000   3.40000   4.00000
    0.40000   1.00000   1.60000   2.20000   2.80000   3.40000
    1.00000   1.60000   2.20000   2.80000   3.40000   4.00000