The difference between the pseudo-inverse in SciPy and NumPy
I can't speak as to why there are implementations in both scipy and numpy, but I can explain why the behaviour is different.
numpy.linalg.pinv approximates the Moore-Penrose pseudo-inverse using an SVD (the LAPACK routine dgesdd, to be precise), whereas scipy.linalg.pinv solves a model linear system in the least-squares sense to approximate the pseudo-inverse (using dgelss). This is why their performance is different. I would expect the overall accuracy of the resulting pseudo-inverse estimates to be somewhat different as well.
You might find that scipy.linalg.pinv2 performs more similarly to numpy.linalg.pinv, as it too uses an SVD method rather than a least-squares approximation.
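To make the two approaches concrete, here is a minimal sketch in plain NumPy. The helper names pinv_via_svd, pinv_via_lstsq and identity_rhs_bytes are hypothetical and written only for illustration; they are not the libraries' actual source. The sketch assumes a real-valued matrix, and it assumes the least-squares route solves A X = I against a dense identity right-hand side. Under that assumption a matrix with 50000 rows needs a 50000 x 50000 float64 identity, roughly 20 GB, which would be consistent with the memory use reported in the question.

```python
import numpy as np


def pinv_via_svd(a, rcond=1e-15):
    """Pseudo-inverse from a thin SVD: A+ = V * diag(1/s) * U^T (real input assumed)."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    # Treat singular values below the cutoff as zero instead of inverting them.
    cutoff = rcond * s.max()
    s_inv = np.zeros_like(s)
    mask = s > cutoff
    s_inv[mask] = 1.0 / s[mask]
    # Scale the columns of V by 1/s, then multiply by U^T.
    return (vt.T * s_inv) @ u.T


def pinv_via_lstsq(a):
    """Pseudo-inverse as the minimum-norm least-squares solution of A X = I."""
    # The right-hand side is a dense m x m identity, where m is the row count.
    identity = np.eye(a.shape[0])
    x, _, _, _ = np.linalg.lstsq(a, identity, rcond=None)
    return x


def identity_rhs_bytes(m, itemsize=8):
    """Bytes needed for a dense m x m identity of float64 values."""
    return m * m * itemsize


# Small tall matrix so the example runs quickly.
rng = np.random.default_rng(0)
a = rng.standard_normal((60, 8))

p_svd = pinv_via_svd(a)
p_lstsq = pinv_via_lstsq(a)

print("max abs difference:", np.abs(p_svd - p_lstsq).max())
print("identity size for 50000 rows (GB):", identity_rhs_bytes(50000) / 1e9)
```

Both helpers return an array of shape (n, m) for an (m, n) input. The thin SVD only stores factors of size m x k, k, and k x n with k = min(m, n), so its working memory stays close to the size of the input, while the least-squares sketch grows with the square of the row count.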
Hanfei Sun
Just another stackoverflow user cs.cmu.edu/~hanfeis
Updated on September 15, 2022
Comments
-
Hanfei Sun over 1 year
I found that there are two versions of the pinv() function, which calculates the pseudo-inverse of a matrix, in SciPy and NumPy. The documentation can be viewed at:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.pinv.html
http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.pinv.html
The problem is that I have a 50000*5000 matrix. When I use scipy.linalg.pinv, it costs me more than 20GB of memory, but when I use numpy.linalg.pinv, less than 1GB of memory is used.
I was wondering why numpy and scipy both have a pinv with different implementations, and why their performance is so different.
-
talonmies over 11 years
"better" is a very subjective term. Only you know what you need the pseudo inverse for in the first place. Presumably you also have criteria about the performance and numerical stability of your algorithms. Whichever one is "better" is the one which best satisfies your criteria.