Convolve2d just by using Numpy
Solution 1
You could generate the subarrays using as_strided
:
import numpy as np
a = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
sub_shape = (3,3)
view_shape = tuple(np.subtract(a.shape, sub_shape) + 1) + sub_shape
strides = a.strides + a.strides
sub_matrices = np.lib.stride_tricks.as_strided(a,view_shape,strides)
To get rid of your second "ugly" sum, alter your einsum
so that the output array only has j
and k
. This implies your second summation.
conv_filter = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
m = np.einsum('ij,ijkl->kl',conv_filter,sub_matrices)
# [[ 6 7 8]
# [11 12 13]
# [16 17 18]]
Solution 2
Cleaned up using as_strided
and @Crispin 's einsum
trick from above. Enforces the filter size into the expanded shape. Should even allow non-square inputs if the indices are compatible.
def conv2d(a, f):
s = f.shape + tuple(np.subtract(a.shape, f.shape) + 1)
strd = numpy.lib.stride_tricks.as_strided
subM = strd(a, shape = s, strides = a.strides * 2)
return np.einsum('ij,ijkl->kl', f, subM)
Solution 3
You can also use fft (one of the faster methods to perform convolutions)
from numpy.fft import fft2, ifft2
import numpy as np
def fft_convolve2d(x,y):
""" 2D convolution, using FFT"""
fr = fft2(x)
fr2 = fft2(np.flipud(np.fliplr(y)))
m,n = fr.shape
cc = np.real(ifft2(fr*fr2))
cc = np.roll(cc, -m/2+1,axis=0)
cc = np.roll(cc, -n/2+1,axis=1)
return cc
- https://gist.github.com/thearn/5424195
- you must pad the filter to be the same size as image ( place it in the middle of a zeros_like mat.)
cheers, Dan
Related videos on Youtube
Allosteric
Updated on May 11, 2022Comments
-
Allosteric almost 2 years
I am studying image-processing using Numpy and facing a problem with filtering with convolution.
I would like to convolve a gray-scale image. (convolve a 2d Array with a smaller 2d Array)
Does anyone have an idea to refine my method ?
I know that scipy supports convolve2d but I want to make a convolve2d only by using Numpy.
What I have done
First, I made a 2d array the submatrices.
a = np.arange(25).reshape(5,5) # original matrix submatrices = np.array([ [a[:-2,:-2], a[:-2,1:-1], a[:-2,2:]], [a[1:-1,:-2], a[1:-1,1:-1], a[1:-1,2:]], [a[2:,:-2], a[2:,1:-1], a[2:,2:]]])
the submatrices seems complicated but what I am doing is shown in the following drawing.
Next, I multiplied each submatrices with a filter.
conv_filter = np.array([[0,-1,0],[-1,4,-1],[0,-1,0]]) multiplied_subs = np.einsum('ij,ijkl->ijkl',conv_filter,submatrices)
and summed them.
np.sum(np.sum(multiplied_subs, axis = -3), axis = -3) #array([[ 6, 7, 8], # [11, 12, 13], # [16, 17, 18]])
Thus this procudure can be called my convolve2d.
def my_convolve2d(a, conv_filter): submatrices = np.array([ [a[:-2,:-2], a[:-2,1:-1], a[:-2,2:]], [a[1:-1,:-2], a[1:-1,1:-1], a[1:-1,2:]], [a[2:,:-2], a[2:,1:-1], a[2:,2:]]]) multiplied_subs = np.einsum('ij,ijkl->ijkl',conv_filter,submatrices) return np.sum(np.sum(multiplied_subs, axis = -3), axis = -3)
However, I find this my_convolve2d troublesome for 3 reasons.
- Generation of the submatrices is too awkward that is difficult to read and can only be used when the filter is 3*3
- The size of the varient submatrices seems to be too big, since it is approximately 9 folds bigger than the original matrix.
- The summing seems a little non intuitive. Simply said, ugly.
Thank you for reading this far.
Kind of update. I wrote a conv3d for myself. I will leave this as a public domain.
def convolve3d(img, kernel): # calc the size of the array of submatracies sub_shape = tuple(np.subtract(img.shape, kernel.shape) + 1) # alias for the function strd = np.lib.stride_tricks.as_strided # make an array of submatracies submatrices = strd(img,kernel.shape + sub_shape,img.strides * 2) # sum the submatraces and kernel convolved_matrix = np.einsum('hij,hijklm->klm', kernel, submatrices) return convolved_matrix
-
Marijn van Vliet about 7 yearsthank you for providing drawings of the matrices :) If I understand correctly, you want tips on how to make your solution more elegant?
-
Allosteric about 7 yearsGlad it helps! Yes. I would be grateful if you can provide me tips to overcome the 3 problems written in the very last lines.
-
Allosteric about 7 yearsI should add that the 3 points are rather arranged in a priority order. The first one is quite important for me and the last one seems kinda trivial. I will also be glad if there are other problems and refinements about it.
-
Andreas K. over 5 yearsIsn't the second drawing (after the equality sign) wrong? Shouldn't each submatrix be multiplied (element-wise) with the filter, and then the elements of each of the resulting submatrices summed?
-
Allosteric over 5 years@AndyK They will produce the same result.
-
NaN about 7 yearsif a_s is the strided array and filter is your laplacian like filter, then try... np.sum(a_s*filter, axis=(2,3)) if indeed your answer is array([[ 6, 7, 8], [11, 12, 13], [16, 17, 18]])
-
NaN about 7 yearssimplify further... see my comment below... np.sum(a_s * filter, axis=(2,3)) if indeed your answer is array([[ 6, 7, 8], [11, 12, 13], [16, 17, 18]]) ... where a_s is strided array and filter is the 3x3 filter
-
Allosteric about 7 yearsThank you for the tip. I am trying this myself right now. Maybe trivial but I believe that the name filter is not appropriate because it is a built-in function of python.
-
Daniel F about 7 yearsNot sure why that works @NaN since it is certainly not doing what the problem asked - but it does, even for arbitrary
a
matrices -
Crispin about 7 yearsYou can do the summation directly in the Einstein summation. See answer
-
Allosteric about 7 yearsIn this question, isn't it better to write,
sub_shape = conv_filter.shape
? -
Daniel F about 7 yearsIt doesn't seem that it would necessarily be the case.
conv_filter.shape
should bea.shape - sub_shape + 1
. It just so happens in your examplea.shape = [5, 5]
andsub_shape = (3, 3)
soconv_filter.shape == sub_shape
is true but not strictly enforced. If you want it to be enforced, then you should setsub_shape = a.shape - conv_filter.shape + 1
-
Allosteric about 7 yearsGot it! So
view_shape = conv_filter.shape
? -
Daniel F about 7 yearswell
view_shape
should be a 4D tuple, soview.shape = tuple(conv_filter.shape) + tuple(sub_shape)
, so you still need to findsub_shape
-
Allosteric about 7 yearsat least at numpy v12, a.shape and f.shape is a tuple so
s
should betuple(np.subtract(a.shape, f.shape)+1)
, I think. -
Daniel F about 7 yearsTrue! Still used to
shape
giving anndarray
. Fixed that. -
michaelsnowden over 6 yearsFollow-up question here: stackoverflow.com/questions/45580013/…
-
sirgogo about 6 yearsHi @DanielF, is there a generalization of this that works for RGB? Not super familiar with einsum notation, so an idea how to generalize would be awesome.
-
Daniel F about 6 years@sirgogo, I suggest asking a seperate question linking this answer. There are people watching
numpy
way better at multidimensional convlcutions than I am. -
sirgogo almost 6 years@DanielF, thanks, here is the more general version of what I'm trying to do: [stackoverflow.com/questions/50239641/…
-
Foad S. Farimani over 5 yearsI have a similar issue for higher dimensions. I was wondering if you could be so kind to take a look at my question here
-
Cris Luengo about 4 yearsIt is certainly not efficient to use the FFT for convolution if the kernel has only 9 elements. This is only useful for large kernels.
-
Dan Erez about 4 yearsTrue. It depends on the kernel and image size but the threshold for fft to outperform is very low. It also depends on your computer architecture but for most generic use-cases fft is the go to in high performance libraries.
-
Naman Bansal over 3 yearsThe following article compares the performance of various image convolution methods, and thus, we can conclude that the FFT method of Numpy is the fastest method to perform convolution. laurentperrinet.github.io/sciblog/posts/…
-
Cris Luengo over 3 yearsThis code is not correct. It will shift the output image. The larger the kernel, the larger the shift.
-
Naman Bansal over 3 yearsCan you elaborate on how will it shift the output image?
-
Cris Luengo over 3 years
-
Michael Kopp over 2 yearsIs using
einsum("ij,klij", ...)
sufficient/better? Using the code from this answer with input of different shape, I gotoperands could not be broadcast together with remapped shapes
, using"ij,klij"
seems to work without issues. -
OuttaSpaceTime over 2 yearsis the kernel flipped here or is it without flipping?