Why is numpy.linalg.pinv() preferred over numpy.linalg.inv() for computing the inverse of a matrix in linear regression?
Solution 1
If the determinant of the matrix is zero, the matrix is singular and has no inverse, so inv will not work: it either raises an error or, when rounding hides the singularity, returns numerically meaningless values.
But pinv will. This is because pinv returns the inverse of your matrix when it exists and the pseudo-inverse when it doesn't.
The different results of the two functions come from rounding errors in floating-point arithmetic. Your X has 4 rows and 5 columns, so X^T X is singular, and the output of inv is dominated by those errors.
You can read more about how pseudo inverse works here
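To make the singularity concrete, here is a minimal sketch that uses the X from the question. The variable names rank_X, n_features and XTX are only illustrative. It checks the rank of X and prints the condition number of X^T X, which should come out extremely large (or infinite) for a singular matrix.

```python
import numpy as np

# Design matrix from the question: 4 samples, 5 columns (bias + 4 features).
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)

rank_X = np.linalg.matrix_rank(X)   # at most min(4, 5) = 4
n_features = X.shape[1]             # 5 unknown parameters

# X^T X is 5x5, but its rank cannot exceed rank(X), so it is singular.
XTX = X.T @ X

print("rank of X:", rank_X, "| number of parameters:", n_features)
print("condition number of X^T X:", np.linalg.cond(XTX))
```

With fewer samples than parameters, X^T X can never have full rank, so any value inv returns for it is an artifact of rounding.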
Solution 2
inv and pinv are meant for computing the (pseudo-)inverse as a standalone matrix, not for use as an intermediate step in other computations.
To solve such linear systems, the proper tool is numpy.linalg.lstsq (or its SciPy counterpart) if you have a non-invertible coefficient matrix, or numpy.linalg.solve (or its SciPy counterpart) for invertible matrices.
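As a rough sketch of both cases: the first part applies numpy.linalg.lstsq directly to the X and y from the question, and the second part solves a small invertible system with numpy.linalg.solve. The 2x2 system A, b is a made-up example, not taken from the question.

```python
import numpy as np

# Data from the question (4 samples, 5 parameters: underdetermined).
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([460.0, 232.0, 315.0, 178.0])

# lstsq works on X itself, so X^T X is never formed or inverted.
# For a rank-deficient problem it returns the minimum-norm
# least-squares solution.
theta, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
print("theta from lstsq:", theta)
print("effective rank of X:", rank)
print("X @ theta reproduces y:", np.allclose(X @ theta, y))

# Hypothetical invertible system A x = b, for illustration only.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)
print("x from solve:", x)
```

Mathematically, the minimum-norm solution returned by lstsq is the same vector as pinv(X) @ y, but neither call requires building an explicit inverse of X^T X.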
2Obe
Updated on July 09, 2022
Comments
-
2Obe almost 2 years
If we want to search for the optimal parameters theta for a linear regression model by using the normal equation with:
theta = inv(X^T * X) * X^T * y
one step is to calculate inv(X^T*X). For this, numpy provides np.linalg.inv() and np.linalg.pinv().
However, the two lead to different results:
import numpy as np

X = np.matrix([[1, 2104, 5, 1, 45], [1, 1416, 3, 2, 40], [1, 1534, 3, 2, 30], [1, 852, 2, 1, 36]])
y = np.matrix([[460], [232], [315], [178]])
XT = X.T
XTX = XT @ X

pinv = np.linalg.pinv(XTX)
theta_pinv = (pinv @ XT) @ y
print(theta_pinv)
[[188.40031946]
 [  0.3866255 ]
 [-56.13824955]
 [-92.9672536 ]
 [ -3.73781915]]

inv = np.linalg.inv(XTX)
theta_inv = (inv @ XT) @ y
print(theta_inv)
[[-648.7890625 ]
 [   0.79418945]
 [-110.09375   ]
 [ -74.0703125 ]
 [  -3.69091797]]
The first output, that is, the output of pinv, is the correct one, and pinv is additionally recommended in the numpy.linalg.pinv() docs. But why is that, and what are the differences, pros, and cons of inv() versus pinv()?
-
mavavilj about 5 years
Can you explain why pinv and inv results are comparable?
-
Vedant Shetty about 5 years
I don't see why they would not be. Just because it is a different approach to getting an answer doesn't mean the answer will be different.
-
mavavilj about 5 years
Yes, but can one also demonstrate that? Does the documentation imply it? Right now it's just your opinion.
-
mavavilj about 5 years
Are inv and pinv "comparable" (i.e., do they produce nearly the same result when used as part of a computation)?
-
Vedant Shetty about 5 years
I really don't want to be unhelpful, but please check out the formulas for the inverse of a matrix and the Moore-Penrose inverse. It's actually a math concept.
-
alwaysmpe over 2 years
Might be worth pointing out that, in the case where it's possible to calculate the matrix inverse, the (Moore-Penrose) pseudo-inverse is (mathematically) equivalent. This isn't an issue of code documentation; it's a mathematical construct (in the same way that a matrix inverse is a mathematical construct).
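The equivalence discussed in these comments can also be checked numerically. The following is a small sketch on a made-up, well-conditioned 2x2 matrix A (an assumption for illustration, not data from the question). It compares the two results up to floating-point tolerance.

```python
import numpy as np

# Hypothetical invertible matrix (determinant 5), chosen for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

A_inv = np.linalg.inv(A)
A_pinv = np.linalg.pinv(A)

# For an invertible matrix the pseudo-inverse coincides with the inverse,
# so the two should agree up to rounding.
same = np.allclose(A_inv, A_pinv)
print("inv and pinv agree:", same)

# Both act as a true inverse here: A @ A_inv is the identity.
print("A @ A_inv is identity:", np.allclose(A @ A_inv, np.eye(2)))
```

The agreement holds only when the matrix is invertible and reasonably well conditioned. For the singular X^T X in the question, only the pseudo-inverse is well defined.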