Why is numpy.linalg.pinv() preferred over numpy.linalg.inv() for creating inverse of a matrix in linear regression

28,503

Solution 1

If the determinant of the matrix is zero, the matrix is singular: it has no inverse, and inv will either raise a LinAlgError or, when rounding hides the singularity, return a numerically meaningless result.

pinv still works. It computes the Moore-Penrose pseudo-inverse, which equals the ordinary inverse when the matrix is invertible and remains well defined when it is not.
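As a rough illustration of that behaviour, here is a minimal sketch using two small matrices that are made up for this example (A is invertible, S is exactly singular). It is not taken from the question's data.

```python
import numpy as np

# Hypothetical invertible matrix: inv and pinv should agree up to rounding.
A = np.array([[4.0, 7.0], [2.0, 6.0]])
same_when_invertible = np.allclose(np.linalg.inv(A), np.linalg.pinv(A))
print(same_when_invertible)

# Hypothetical singular matrix: the second row is twice the first.
S = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(S)
    print("inv returned a result (singularity hidden by rounding)")
except np.linalg.LinAlgError as err:
    print("inv failed:", err)

# pinv is still defined and satisfies S @ S+ @ S == S.
pinv_S = np.linalg.pinv(S)
print(np.allclose(S @ pinv_S @ S, S))
```

The try/except covers both possibilities, because whether inv raises depends on whether the singularity survives floating-point rounding.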

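In your example X has 4 rows and 5 columns, so X^T X is a 5x5 matrix of rank at most 4 and is therefore singular. The different results of the two functions come from rounding errors in floating-point arithmetic: inv does not detect the singularity and returns values dominated by rounding error, while pinv treats the numerically zero singular value as zero.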
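One way to see why rounding matters so much for the question's data is to check the rank and condition number of X^T X. The sketch below reuses the X from the question. Since rank(X^T X) equals rank(X) mathematically, the rank is computed on X itself, which is the numerically safer check.

```python
import numpy as np

# Design matrix from the question: 4 samples, 5 columns.
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1, 852, 2, 1, 36]], dtype=float)
XTX = X.T @ X

# rank(X) can be at most 4, so the 5x5 matrix X^T X cannot have full rank.
rank_X = np.linalg.matrix_rank(X)
print("shape of X^T X:", XTX.shape, "rank of X:", rank_X)

# A huge (or infinite) condition number signals that inv(XTX) is unreliable.
print("condition number of X^T X:", np.linalg.cond(XTX))
```

The exact condition number printed will vary between platforms, which is consistent with inv returning different garbage values on different machines.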

You can read more about how the pseudo-inverse works here.

Solution 2

inv and pinv compute the (pseudo-)inverse as a standalone matrix. They are not meant to be used as an intermediate step in solving a linear system.

To solve such linear systems, the proper tool is numpy.linalg.lstsq (or its SciPy counterpart) if the coefficient matrix is not invertible, or numpy.linalg.solve (or its SciPy counterpart) if it is.
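A minimal sketch of both calls, assuming the X and y from the question for lstsq and a small made-up 2x2 system (A, b) for solve:

```python
import numpy as np

# Data from the question; lstsq works on X directly, without forming X^T X.
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1, 852, 2, 1, 36]], dtype=float)
y = np.array([460, 232, 315, 178], dtype=float)

# For a rank-deficient problem lstsq returns the minimum-norm solution.
theta, residuals, rank, sing_vals = np.linalg.lstsq(X, y, rcond=None)
print(theta)

# Hypothetical square, invertible system: use solve instead of inv(A) @ b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)
print(x)
```

Mathematically, the theta returned by lstsq here is the same minimum-norm solution that np.linalg.pinv(X) @ y produces, but it is obtained without squaring the condition number by forming X^T X.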

Author by

2Obe


Updated on July 09, 2022

Comments

  • 2Obe
    2Obe almost 2 years

    If we want to search for the optimal parameters theta for a linear regression model by using the normal equation with:

    theta = inv(X^T * X) * X^T * y

    one step is to calculate inv(X^T*X). NumPy provides both np.linalg.inv() and np.linalg.pinv() for this.

    However, the two lead to different results:

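    import numpy as np

    # Design matrix (4 samples, 5 columns including the intercept) and targets
    X=np.array([[1,2104,5,1,45],[1,1416,3,2,40],[1,1534,3,2,30],[1,852,2,1,36]])
    y=np.array([[460],[232],[315],[178]])

    XT=X.T
    XTX=XT@X

    # Normal equation using the pseudo-inverse
    pinv=np.linalg.pinv(XTX)
    theta_pinv=(pinv@XT)@y
    print(theta_pinv)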
    
    [[188.40031946]
     [  0.3866255 ]
     [-56.13824955]
     [-92.9672536 ]
     [ -3.73781915]]
    
    # Same normal equation using the plain inverse (XTX is 5x5 with rank at most 4)
    inv=np.linalg.inv(XTX)
    theta_inv=(inv@XT)@y
    print(theta_inv)
    
    [[-648.7890625 ]
     [   0.79418945]
     [-110.09375   ]
     [ -74.0703125 ]
     [  -3.69091797]]
    

    The first output, from pinv, is the correct one, and pinv is additionally recommended in the numpy.linalg.pinv() docs. But why is this, and what are the differences, pros and cons of inv() versus pinv()?

  • mavavilj
    mavavilj about 5 years
    Can you explain why pinv and inv results are comparable?
  • Vedant Shetty
    Vedant Shetty about 5 years
    I don't see why it would not. Just because it is a different approach to get an answer doesn't mean it will be different.
  • mavavilj
    mavavilj about 5 years
    Yes, but can one also demonstrate that? Does the documentation imply it? Right now it's just your opinion.
  • mavavilj
    mavavilj about 5 years
    Are inv and pinv "comparable" (i.e. do they produce nearly the same result when used as part of a computation)?
  • Vedant Shetty
    Vedant Shetty about 5 years
    I really don't want to be unhelpful. But please check out the formulas for the inverse of a matrix and the Moore-Penrose inverse. It's actually a math concept.
  • alwaysmpe
    alwaysmpe over 2 years
    Might be worth pointing out that, in the case that it's possible to calculate the matrix inverse, the (Moore-Penrose) pseudo-inverse is (mathematically) equivalent. This isn't an issue of code documentation, it's a mathematical construct (in the same way that a matrix inverse is a mathematical construct).