POSIT tutorial

Pose Estimation

JavierBarandiaran


In this tutorial we will see how to estimate the pose of a 3D object in a single image using the function cvPOSIT, which implements the POSIT algorithm (DeMenthon & Davis 1995). We will also run some tests and visualize the result of the algorithm using OpenGL.

The pose M of a 3D object is the combination of its orientation R (a 3x3 rotation matrix) and its position T (a 3D translation vector) with respect to the camera. The pose M = [ R | T ] is therefore a 3x4 matrix.
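As a minimal sketch of what the pose means in practice, the helper below applies M = [ R | T ] to a point given in object coordinates. The names Vec3, Pose3x4 and applyPose are hypothetical and are not OpenCV types; the sketch also assumes that M is stored row by row.

```cpp
#include <array>
#include <cassert>

// Hypothetical 3D point type, used only for this illustration.
struct Vec3 { float x, y, z; };

// Pose M = [ R | T ] stored row by row (assumption): 3 rows of 4 floats.
using Pose3x4 = std::array<float, 12>;

// Transforms a point from object coordinates to camera coordinates:
// p_camera = R * p_object + T
Vec3 applyPose(const Pose3x4& M, const Vec3& p)
{
    Vec3 out;
    out.x = M[0] * p.x + M[1] * p.y + M[2]  * p.z + M[3];
    out.y = M[4] * p.x + M[5] * p.y + M[6]  * p.z + M[7];
    out.z = M[8] * p.x + M[9] * p.y + M[10] * p.z + M[11];
    return out;
}
```

With an identity rotation, the function simply adds T to the point, which makes it easy to check by hand.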

Given at least four non-coplanar 3D points of the object (in the object coordinate system) and their corresponding 2D projections in the image, the algorithm can estimate the pose.
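The non-coplanarity requirement can be checked before calling POSIT. The following is only a sketch: Point3 is a hypothetical stand-in for CvPoint3D32f, hasNonCoplanarPoints is not an OpenCV function, and the absolute tolerance eps is an assumption that should be adapted to the scale of the model.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CvPoint3D32f.
struct Point3 { float x, y, z; };

static Point3 sub(const Point3& a, const Point3& b)
{
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

static Point3 cross(const Point3& a, const Point3& b)
{
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

static float dot(const Point3& a, const Point3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Returns true if the points do not all lie in one plane.
// eps is an absolute tolerance (assumption; tune it to the model units).
bool hasNonCoplanarPoints(const std::vector<Point3>& pts, float eps = 1e-6f)
{
    if (pts.size() < 4) return false;

    bool haveEdge = false, haveNormal = false;
    Point3 edge{0.0f, 0.0f, 0.0f}, normal{0.0f, 0.0f, 0.0f};

    for (std::size_t i = 1; i < pts.size(); ++i) {
        const Point3 v = sub(pts[i], pts[0]);
        if (!haveEdge) {
            // First direction: any point that differs from pts[0].
            if (dot(v, v) > eps) { edge = v; haveEdge = true; }
        } else if (!haveNormal) {
            // Second direction: any vector not parallel to the first one.
            const Point3 n = cross(edge, v);
            if (dot(n, n) > eps) { normal = n; haveNormal = true; }
        } else if (std::fabs(dot(normal, v)) > eps) {
            // This point lies off the plane spanned by the first two directions.
            return true;
        }
    }
    return false;
}
```

The eight corners of a cube pass this check, while the four corners of a single face do not.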

We will estimate the pose of a virtual cube. Because the real pose of the cube is already known, we can calculate the projections of its corners, estimate the pose with POSIT, and compare the estimate with the real pose.

Model Points

First of all, the POSIT object must be created from the model points; we will use the eight corners of the cube. The first point of the array passed to cvCreatePOSITObject must be ( 0, 0, 0 ). This point is known as the reference point of the object. POSIT returns the translation from the camera to this point.

float cubeSize = 10.0;
std::vector<CvPoint3D32f> modelPoints;
modelPoints.push_back(cvPoint3D32f(0.0f, 0.0f, 0.0f));
modelPoints.push_back(cvPoint3D32f(0.0f, 0.0f, cubeSize));
modelPoints.push_back(cvPoint3D32f(0.0f, cubeSize, cubeSize));
modelPoints.push_back(cvPoint3D32f(0.0f, cubeSize, 0.0f));
modelPoints.push_back(cvPoint3D32f(cubeSize, 0.0f, 0.0f));
modelPoints.push_back(cvPoint3D32f(cubeSize, cubeSize, 0.0f));
modelPoints.push_back(cvPoint3D32f(cubeSize, cubeSize, cubeSize));
modelPoints.push_back(cvPoint3D32f(cubeSize, 0.0f, cubeSize));
CvPOSITObject *positObject = cvCreatePOSITObject( &modelPoints[0], static_cast<int>(modelPoints.size()) );

Image Points

We must create an array with the corresponding 2D image points. The image points must be placed in the array in the same order as the model points. In other words, the first point of this array must correspond to the projection of the first model point. The origin of the image coordinates is at the center of the image.

For each model point, we calculate its coordinates in camera space, i.e. the point is transformed by the real pose. Then its projection is calculated using the perspective model.

std::vector<CvPoint2D32f> imagePoints;
for ( size_t p=0; p<modelPoints.size(); ++p )
{
        CvPoint3D32f point3D;
        //Transform the 3D points with the real pose
        //apply the rotation
        point3D.x = poseReal[0] * modelPoints[p].x 
                        + poseReal[4]*modelPoints[p].y
                        + poseReal[8]*modelPoints[p].z;
        //add the translation
        point3D.x = point3D.x + poseReal[12];
        
        point3D.y = poseReal[1] * modelPoints[p].x 
                        + poseReal[5]*modelPoints[p].y
                        + poseReal[9]*modelPoints[p].z;
        point3D.y = point3D.y + poseReal[13];

        point3D.z = poseReal[2] * modelPoints[p].x 
                        + poseReal[6]*modelPoints[p].y
                        + poseReal[10]*modelPoints[p].z;
        point3D.z = point3D.z + poseReal[14];

        //Project the transformed 3D points
        CvPoint2D32f point2D;
        //The principal point is not added because POSIT needs the image point coordinates relative to the center of the image
        point2D.x = focalLength * point3D.x / (-point3D.z); //z negative
        point2D.y = focalLength * point3D.y / (-point3D.z); 
        imagePoints.push_back( point2D );
}

The real pose is a float[16] array representing a 4x4 matrix in OpenGL format (column-major order).

The rotation matrix is:

poseReal[0]    poseReal[4]    poseReal[8]
poseReal[1]    poseReal[5]    poseReal[9]
poseReal[2]    poseReal[6]    poseReal[10]

and the translation vector is:

poseReal[12]    poseReal[13]    poseReal[14]
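The index pattern above follows from the column-major layout: element (row, col) of the 4x4 matrix is stored at index col * 4 + row. A small accessor makes this explicit. The name elementAt is hypothetical and is used only for illustration.

```cpp
#include <cassert>

// Returns element (row, col) of a 4x4 matrix stored in column-major order,
// which is the layout OpenGL expects. Rows and columns are numbered from 0.
inline float elementAt(const float pose[16], int row, int col)
{
    assert(row >= 0 && row < 4 && col >= 0 && col < 4);
    return pose[col * 4 + row];
}
```

For example, elementAt(poseReal, 1, 2) reads poseReal[9], the rotation entry in the second row and third column, and elementAt(poseReal, 2, 3) reads poseReal[14], the z component of the translation.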

Pose Estimation

Now that we have the model and image points we can compute the pose:

CvMatr32f rotation_matrix = new float[9];
CvVect32f translation_vector = new float[3];
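//set posit termination criteria: 100 max iterations, convergence epsilon 1.0e-5
//both flags are needed, otherwise the iteration limit is not taken into account
CvTermCriteria criteria = cvTermCriteria(CV_TERMCRIT_EPS | CV_TERMCRIT_ITER, 100, 1.0e-5 );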
cvPOSIT( positObject, &imagePoints[0], FOCAL_LENGTH, criteria, rotation_matrix, translation_vector );   
createOpenGLMatrixFrom( rotation_matrix, translation_vector);

OpenGL

In order to draw the model using OpenGL we must build the modelView (pose) matrix and the projection matrix.

OpenGL ModelView Matrix

for (int f=0; f<3; f++)
{
        for (int c=0; c<3; c++)
        {
                posePOSIT[c*4+f] = rotation_matrix[f*3+c];      //transposed
        }
}
posePOSIT[3] = 0.0;
posePOSIT[7] = 0.0;     
posePOSIT[11] = 0.0;
posePOSIT[12] =  translation_vector[0];
posePOSIT[13] =  translation_vector[1]; 
posePOSIT[14] = -translation_vector[2]; //negative
posePOSIT[15] = 1.0; //homogeneous

OpenGL Projection Matrix

This is a standard perspective projection matrix built from the intrinsic parameters (focal lengths focalX and focalY, principal point (pX, pY)), the image resolution (imageWidth, imageHeight), and the near and far plane values.

Remember that it must be stored in column-major order.

2.0 * focalX / imageWidth    0                             2.0 * ( pX / imageWidth ) - 1.0                       0
0                            2.0 * focalY / imageHeight    2.0 * ( pY / imageHeight ) - 1.0                      0
0                            0                             -( farPlane+nearPlane ) / ( farPlane - nearPlane )    -2.0 * farPlane * nearPlane / ( farPlane - nearPlane )
0                            0                             -1                                                    0
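As a sketch of how the matrix entries listed above could be written into a column-major array, consider the helper below. The function name buildProjectionMatrix and its parameter list are assumptions made for this example; it also assumes that the sixteen entries are listed row by row.

```cpp
#include <cassert>

// Fills a 4x4 OpenGL projection matrix in column-major order.
// Entry (row, col) is stored at out[col * 4 + row].
void buildProjectionMatrix(double focalX, double focalY,
                           double pX, double pY,
                           double imageWidth, double imageHeight,
                           double nearPlane, double farPlane,
                           double out[16])
{
    assert(nearPlane > 0.0 && farPlane > nearPlane);
    assert(imageWidth > 0.0 && imageHeight > 0.0);

    for (int i = 0; i < 16; ++i) out[i] = 0.0;

    out[0]  = 2.0 * focalX / imageWidth;                              // row 0, col 0
    out[5]  = 2.0 * focalY / imageHeight;                             // row 1, col 1
    out[8]  = 2.0 * (pX / imageWidth) - 1.0;                          // row 0, col 2
    out[9]  = 2.0 * (pY / imageHeight) - 1.0;                         // row 1, col 2
    out[10] = -(farPlane + nearPlane) / (farPlane - nearPlane);       // row 2, col 2
    out[11] = -1.0;                                                   // row 3, col 2
    out[14] = -2.0 * farPlane * nearPlane / (farPlane - nearPlane);   // row 2, col 3
}
```

The resulting double[16] array has the layout that glLoadMatrixd expects, so it can be passed to that function directly.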

Drawing the Model

glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );
glViewport(0, 0, imageWidth, imageHeight );
glMatrixMode( GL_PROJECTION );
glLoadMatrixd( projectionMatrix );      
glMatrixMode( GL_MODELVIEW );
glLoadMatrixd( posePOSIT );
drawModel();

References

The algorithm is described in

D. DeMenthon and L.S. Davis, "Model-Based Object Pose in 25 Lines of Code", International Journal of Computer Vision, 15, pp. 123-141, June 1995. (see http://www.cfar.umd.edu/~daniel/)

Code

The code doesn't work correctly, and I don't know where the problem is. Please help!

codeTut.rar

As you can see, the model (red) is not correctly projected.

Posit (last edited 2007-07-15 19:46:58 by JavierBarandiaran)
