Archive for July, 2010

Pose Estimation using SfM point cloud

July 12th, 2010

The idea of this pose estimator is based on PTAM (Parallel Tracking and Mapping). PTAM is capable of tracking in an unknown environment thanks to the mapping done in parallel. But in fact if you want to augment reality, it’s generally because you already know what you are looking at. So, being able to have a tracking working in an unknown environment is not always needed. My idea was simple: instead of doing a mapping in parallel, why not using SFM in a pre-processing step ?

input: point cloud + camera shot output: position and orientation of the camera

So my outdoor tracking algorithm will eventually work like this:

  • pre-processing step
    • generate a point cloud of the outdoor scene you want to track using Bundler
    • create a binary file with a descriptor (Sift/Surf) per vertex of the point cloud
  • in real-time, for each frame N:
    • extract feature using FAST
    • match feature from frame N-1 using 2D patch
    • compute “relative pose” between frame N and N-1
  • in almost real-time, for each “key frame”:
    • extract feature and descriptor
    • match descriptor with those of the point cloud
    • generate 2D/3D correspondence from matches
    • compute “absolute pose” using PnP solver (EPnP)

The tricky part is that absolute pose computation could last several “relative pose” estimation. So once you’ve got the absolute pose you’ll have to compensate the delay by cumulating the previous relative pose…

This is what I’ve got so far:

  • pre-processing step: binary file generated using SiftGPU (planning to move on my GPUSurf implementation) and Bundler (planning to move on Insight3D or implement it myself using sba)
  • relative pose: I don’t have an implementation of the relative pose estimator
  • absolute pose: it’s basically working but needs some improvements:
    • switch feature extraction/matching from Sift to Surf
    • remove unused descriptors to speed-up maching step (by scoring descriptors used as inlier with training data)
    • use another PnP solver (or add ransac to support outliers and have more accurate results)

Remote Augmented Reality Prototype

July 11th, 2010

I have created a new augmented reality prototype (5 days experiments). It is using a client/server approach based on Boost.Asio. The first assumption of this prototype is that you’ve got a mobile client not so powerful and a powerful server with a decent GPU.

So the idea is simple: the client uploads a video frame and the server does the pose estimation and send back the augmented rendering to the client. My first prototype is using ArToolKitPlus in almost real-time (15fps) but I’m also working on a markerless version that would be less interactive (< 1fps). The mobile client was an UMPC (Samsung Q1).

Thanks to Boost.Asio I’ve been able to produce a strong client/server very quickly. Then I have created two implementations of PoseEstimator :

class PoseEstimator
		bool computePose(const Ogre::PixelBox& videoFrame);
		Ogre::Vector3 getPosition() const;
		Ogre::Quaternion getOrientation() const;
  • ArToolKitPoseEstimator (using ArToolKitPlus to get pose estimation)
  • SfMPoseEstimator (using EPnP and a point cloud generated with Bundler -Structure from Motion tool- to get pose estimation)


There is nothing fancy about this pose estimator, I’ve just implemented this one as proof of concept and to check my server performance. In fact, ArToolKit pose estimation is not expensive and can run in real-time on a mobile.


I’ll just introduce the concept of this pose estimator in this post. So the idea is simple, in augmented reality fake rolex you generally know the object you are looking at because you want to augment it. The idea was to create a point cloud of the object you want to augment (using Structure from Motion) and keep the link between the 3D points and theirs 2D descriptors. Thus when you take a shot of the scene you can compare the 2D descriptors of your shot with those of the point cloud and so create 2D/3D correspondence. Then the pose estimation can be estimated by solving the Perspective-n-Point camera calibration problem (using EPnP for example).


The server is very basic, it doesn’t handle client queuing yet (1 client = 1 thread), but it already does the off-screen rendering and send back the texture in raw RGB.

The version using ArToolKit is only Replica Handbag running at 15fps because I had trouble with the jpeg compression so I turn it off. So this version is only bandwidth limited. I didn’t investigate this issue that much because I know that the SfMPoseEstimator is going to be limited by the matching step. Furthermore I’m not sure that it’s a good idea to send highly compressed image to the server (compression artifact can add extra features).

My SfMPoseEstimator is also working but it’s very expensive (~1s using the GPU) and it’s not always accurate due to some flaws of my original implementation. I’ll explain how it works in my following post.


Structure From Motion Experiment

July 8th, 2010

I have taken a new set of picture of the “Porte Cailhau” in Bordeaux. And I have used one of my tools (BundlerMatcher) to compute image matching using SiftGPU. BundlerMatcher generates a file compatible with Bundler match file. So using BundlerMatcher you can skip the long pre-processing step of feature extraction and image matching and enjoy GPU acceleration!

I have used the “bundle.out” file produced by Bundler to get cameras informations:

  • intrinsic parameters: focal, distorsion
  • extrinsic parameters: position, orientation

With these informations you can see the point cloud through the viewpoint of one of the camera registered by Bundler. I’ve added this feature to my current Ogre3D PlyReader. I also have added a background plane to be able to see the picture taken from this viewpoint. This demo is not available for download right now, but you can still watch the video :

The Ogre3D PlyReader and BundlerMatcher will eventually be added to my SVN. I’m currently busy working on another demo, so stay tuned !