How to calculate Rotation and Translation matrices from homography?


Solution 1

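A homography relates two images only when the scene is planar (i.e., all of your points are coplanar) or when the camera purely rotates about its center. If your points are coplanar, the homography is a projective transformation and it can be decomposed into its components: a rotation, a translation (up to scale), and the plane normal.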
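To make "components" concrete, here is a minimal NumPy sketch of the forward relationship: a plane with normal n at distance d (in the first camera's frame) and a motion X2 = R X1 + t induce the homography H = K (R + t n' / d) K^-1. The function names and all numeric values are made up for illustration; they are not from the question. Decomposition runs this in reverse, and OpenCV offers `decomposeHomographyMat` for that step.

```python
import numpy as np

def plane_homography(K, R, t, n, d):
    """Homography induced by the plane n . X = d (camera-1 frame) for X2 = R X1 + t."""
    return K @ (R + np.outer(t, n) / d) @ np.linalg.inv(K)

def project(K, X):
    """Pinhole projection of a 3D point (camera frame) to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

# Hypothetical intrinsics and motion, chosen only for this illustration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(10.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.2, 0.0, 0.05])
n = np.array([0.0, 0.0, 1.0])  # plane normal in the camera-1 frame
d = 5.0                        # the plane is Z = 5 in front of camera 1

H = plane_homography(K, R, t, n, d)

X1 = np.array([0.7, -0.4, 5.0])   # a 3D point lying on the plane
p1 = project(K, X1)               # its pixel in image 1
p2 = project(K, R @ X1 + t)       # its pixel in image 2

# Map the image-1 pixel through H and dehomogenize.
q = H @ np.array([p1[0], p1[1], 1.0])
p2_from_H = q[:2] / q[2]
```

For points on the plane, `p2_from_H` agrees with `p2`. For points off the plane it does not, which is why a single homography cannot describe a general 3D scene.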

But if your scene isn't coplanar (which I think is the case from your description), it's going to take a bit more work. Instead of a homography, you need to calculate the fundamental matrix (which emgucv will do for you). The fundamental matrix combines the camera intrinsic matrix (K) with the relative rotation (R) and translation (t) between the two views. Recovering the rotation and translation is pretty straightforward if you know K, although the translation is only determined up to scale. It looks like emgucv has methods for camera calibration. I am not familiar with their particular method, but these generally involve taking several images of a scene with known geometry.
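As a rough illustration of how K, R and t combine, the sketch below builds F = K^-T [t]x R K^-1 for a single camera moving as X2 = R X1 + t, then checks the epipolar constraint x2' F x1 = 0 on a synthetic point. The helper names and numbers are assumptions made for this example; this is not emgucv code.

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_from_pose(K, R, t):
    """F = K^-T [t]x R K^-1 for one camera K and the motion X2 = R X1 + t."""
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(t) @ R @ K_inv

def epipolar_residual(F, p1, p2):
    """x2' F x1 for pixel p1 (image 1) and pixel p2 (image 2); near 0 for a true match."""
    x1 = np.array([p1[0], p1[1], 1.0])
    x2 = np.array([p2[0], p2[1], 1.0])
    return float(x2 @ F @ x1)

def project(K, X):
    """Pinhole projection of a 3D point (camera frame) to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

# Hypothetical intrinsics and motion, chosen only for this illustration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(10.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.2, 0.0, 0.05])

F = fundamental_from_pose(K, R, t)

X1 = np.array([0.7, -0.4, 5.0])        # an arbitrary 3D point, no plane required
p1 = project(K, X1)
p2 = project(K, R @ X1 + t)

good = epipolar_residual(F, p1, p2)                           # correct match
bad = epipolar_residual(F, p1, p2 + np.array([0.0, 20.0]))    # match shifted by 20 px
```

Here `good` is zero up to floating-point error, while `bad` is clearly nonzero. The same residual is a handy sanity check on an F estimated from real correspondences.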

Solution 2

To figure out the camera motion (exact rotation, and translation up to a scaling factor) you need to:

  • Calculate the fundamental matrix F, for example using the eight-point algorithm
  • Calculate the essential matrix E = A’FA, where A is the intrinsic camera matrix and ’ denotes the transpose
  • Decompose E, which by definition equals Tx * R, via SVD into E = ULV’
  • Create a special 3x3 matrix

        0 -1  0   
    W = 1  0  0      
        0  0  1  
    

which helps carry out the decomposition:

R = U W^-1 V’, Tx = U L W U’, where

      0  -tz   ty
Tx =  tz   0  -tx
     -ty   tx   0
  • Since E can have an arbitrary sign and W can be replaced by W^-1, there are 4 distinct solutions, and you have to select the one that puts the most points in front of the cameras.
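The steps above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: E is already available (for example as A’FA), the points are in normalized coordinates (pixels multiplied by A^-1), and the function names and synthetic data are hypothetical. Tx uses the standard cross-product form with -tz in the first row. The translation is returned as a unit vector because its scale cannot be recovered.

```python
import numpy as np

W = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])

def skew(t):
    """Cross-product matrix Tx, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def candidate_poses(E):
    """Return the four (R, t) candidates encoded by E; t has unit length."""
    U, _, Vt = np.linalg.svd(E)
    # E is only defined up to sign, so flip factors to get proper rotations (det = +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    R_a = U @ W @ Vt
    R_b = U @ W.T @ Vt   # W.T equals W^-1 because W is a rotation
    t = U[:, 2]          # left null vector of E, the translation direction
    return [(R_a, t), (R_a, -t), (R_b, t), (R_b, -t)]

def triangulate(R, t, x1, x2):
    """Linear (DLT) triangulation for cameras [I|0] and [R|t], normalized points."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t.reshape(3, 1)])
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]   # null vector of A, in homogeneous coordinates
    return X[:3] / X[3]

def recover_pose(E, pts1, pts2):
    """Pick the candidate that puts the most points in front of both cameras."""
    best, best_count = None, -1
    for R, t in candidate_poses(E):
        count = 0
        for x1, x2 in zip(pts1, pts2):
            X = triangulate(R, t, x1, x2)
            if X[2] > 0 and (R @ X + t)[2] > 0:
                count += 1
        if count > best_count:
            best, best_count = (R, t), count
    return best

# Synthetic check with a made-up motion and made-up 3D points.
angle = np.deg2rad(10.0)
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([0.2, 0.0, 0.05])
E = skew(t_true) @ R_true

points = np.array([[0.7, -0.4, 5.0], [-1.0, 0.5, 4.0], [0.3, 0.8, 6.0],
                   [-0.6, -0.9, 3.5], [1.2, 0.1, 7.0]])
pts1 = [X / X[2] for X in points]
pts2 = [(R_true @ X + t_true) / (R_true @ X + t_true)[2] for X in points]

R_est, t_est = recover_pose(E, pts1, pts2)
```

With noise-free data, `R_est` matches `R_true` and `t_est` matches `t_true` scaled to unit length. With real matches, counting points in front of both cameras (rather than requiring all of them) makes the selection tolerant of a few outliers.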

Solution 3

It's been a while since you asked this question. By now, there are some good references on this problem.

One of them is "An Invitation to 3-D Vision" by Ma et al.; chapter 5 of it is free here: http://vision.ucla.edu//MASKS/chapters.html

Also, Peter Corke's Machine Vision Toolbox includes tools to perform this. However, he does not explain much of the math behind the decomposition.

Author: mili


Updated on June 05, 2022

Comments

  • mili
    mili almost 2 years

    I have already matched two images of the same scene, taken by one camera from different view angles (say left and right), using SURF in emgucv (C#). It gave me a 3x3 homography matrix for the 2D transformation. Now I want to place those two images in a 3D environment (using DirectX). To do that, I need to calculate the location and orientation of the 2nd (right) image relative to the 1st (left) image in 3D. How can I calculate the rotation and translation matrices for the 2nd image?

    I also need the z value for the 2nd image.

    I read about something called 'homography decomposition'. Is that the way?

    Is anybody familiar with homography decomposition, and is there an algorithm that implements it?

    Thanks in advance for any help.

  • mili
    mili about 12 years
    Thanks, jlewis42, for paying attention to this matter.
  • mili
    mili about 12 years
    I calculated the fundamental matrix as you said (using randomly generated points projected with the homography), and I also calculated the camera intrinsic matrix using the chessboard method of EmguCV. But I cannot find any method to get R and T directly from the fundamental matrix. After that, I calculated the essential matrix as described here and got R and T as described here, but it did not give me an acceptable answer. Where could the error be?
  • jlewis42
    jlewis42 about 12 years
    Unfortunately I can't answer that without knowing how the result is wrong. Here are a couple of things to look into: Are you sure all of your point correspondences are valid? Check whether the fundamental matrix is correct using epipolar geometry (en.wikipedia.org/wiki/Epipolar_geometry). Basically, if you multiply a point in the left image by the fundamental matrix, it will give you the equation of a line in the right image (in ax + by + c = 0 form). The corresponding point in the right image will lie on that line. Also try recombining K, R and t to see if you get back the same F.
  • mili
    mili about 12 years
    Thanks, jlewis42. I will take a look at these things and try to fix the errors. Thanks again.
  • Utkarsh Sinha
    Utkarsh Sinha almost 12 years
    @jlewis42: Can you provide some details on how to decompose a projective transformation (when all points are coplanar)?
  • Vlad
    Vlad about 10 years
    A homography also works for images of arbitrary 3D scenes under pure camera rotation or zoom. Of course, pure camera rotation will never give ANY 3D information, so there will be no way to estimate Z. Also, the fundamental matrix provides the translation vector only up to a scaling factor. Overall, the problem should be restated more clearly. If you want to decompose a fundamental matrix or a homography, what is your scene and what do you want to measure? I will give you the decomposition of F in the answer below, in case you need it.
  • mili
    mili about 10 years
    Thank you for the references
  • fxtentacle
    fxtentacle almost 6 years
    I believe the Tx matrix is a cross product as matrix multiplication, and hence needs to have -tz and not -tx in the 1st row 2nd column.