The difference between the pseudo-inverse in SciPy and NumPy

matrix numpy scipy

10,479

I can't speak as to why there are implementations in both scipy and numpy, but I can explain why the behaviour is different.

numpy.linalg.pinv approximates the Moore-Penrose pseudo inverse using an SVD (the LAPACK routine dgesdd, to be precise), whereas scipy.linalg.pinv solves a model linear system in the least squares sense to approximate the pseudo inverse (using dgelss). This is why their performance is different. I would expect the overall accuracy of the resulting pseudo inverse estimates to be somewhat different as well.

You might find that scipy.linalg.pinv2 performs more similarly to numpy.linalg.pinv, as it too uses an SVD method rather than a least squares approximation.
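To make the two approaches concrete, here is a minimal sketch of both in plain NumPy. It is not the actual code path of either library: the helper names pinv_via_svd and pinv_via_lstsq are made up for illustration, the code assumes a real-valued matrix, and np.linalg.lstsq stands in for the least squares solve (it is not necessarily the same LAPACK driver as the dgelss mentioned above).

```python
import numpy as np


def pinv_via_svd(a, rcond=1e-15):
    """Pseudo-inverse from a reduced SVD (hypothetical helper, real matrices only)."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    # Treat singular values below the cutoff as zero instead of inverting them.
    cutoff = rcond * s.max()
    s_inv = np.zeros_like(s)
    keep = s > cutoff
    s_inv[keep] = 1.0 / s[keep]
    # A^+ = V * diag(1/s) * U^T
    return (vt.T * s_inv) @ u.T


def pinv_via_lstsq(a):
    """Pseudo-inverse as the minimum-norm least squares solution of A X = I."""
    m = a.shape[0]
    identity = np.eye(m)  # note: an m x m right-hand side
    x, _, _, _ = np.linalg.lstsq(a, identity, rcond=None)
    return x


rng = np.random.default_rng(0)
a = rng.standard_normal((200, 20))  # small tall matrix, well conditioned

p_svd = pinv_via_svd(a)
p_lstsq = pinv_via_lstsq(a)
p_numpy = np.linalg.pinv(a)

# Largest element-wise deviation of each sketch from numpy.linalg.pinv.
max_diff_svd = np.max(np.abs(p_svd - p_numpy))
max_diff_lstsq = np.max(np.abs(p_lstsq - p_numpy))
```

One thing the sketch makes visible is that the least squares formulation needs an identity right-hand side with as many rows and columns as the input has rows. For a matrix with 50000 rows in float64 that is 50000 * 50000 * 8 bytes, about 20 GB, which might be related to the memory use reported in the question, although I have not verified that this is what happens inside SciPy.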


Hanfei Sun

Just another stackoverflow user cs.cmu.edu/~hanfeis

Updated on September 15, 2022

Comments

Hanfei Sun over 1 year

I found that there are two versions of the pinv() function, which calculates the pseudo-inverse of a matrix: one in SciPy and one in NumPy. The documentation can be viewed at:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.pinv.html

http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.pinv.html

The problem is that I have a 50000*5000 matrix. When I use scipy.linalg.pinv, it costs me more than 20GB of memory, but when I use numpy.linalg.pinv, less than 1GB of memory is used.

I was wondering why numpy and scipy both have a pinv with different implementations, and why their performance is so different.

talonmies over 11 years

"better" is a very subjective term. Only you know what you need the pseudo inverse for in the first place. Presumably you also have criteria about the performance and numerical stability of your algorithms. Whichever one is "better" is the one which best satisfies your criteria.