# Fundamental and essential matrices, value of intrinsic parameters

Problem Detail:

I observe a scene with two cameras, c1 and c2, that produce two images i1 and i2, respectively. What I ultimately want to do is use the information of images i1 and i2 simultaneously, e.g., for object detection.

As far as I understand it, I am not required to find the intrinsic parameters of c1 and c2, nor do I need to know their relative position and orientation, if I only want to rectify i1 and i2. I can "simply" calculate F from a set of corresponding points, with no explicit 3D relation between the points (such as known distances). What disadvantages would that have for me? I would not be able to do 3D reconstruction then, correct? But apart from that, is there no difference?

Or, asked differently: for rectification I do not need the intrinsic parameters, and I only need them for 3D reconstruction? What are the intrinsic parameters good for?

What are the intrinsic parameters good for? They potentially allow mapping objects in the images to locations in the real world (in 3D). For some applications, that can be very useful. For instance, consider pedestrian detection: usually it's not enough to detect the presence of a pedestrian; we also want to know exactly where the person is, and how far away they are.
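To make that concrete: the intrinsics let you back-project a pixel to a viewing ray in the camera's coordinate frame, and combining that ray with one extra constraint (e.g., a known ground plane and camera height, as in the pedestrian example) gives a real-world position. A minimal sketch, with a made-up intrinsic matrix `K` (focal length and principal point are illustrative assumptions, not values from the question):

```python
import numpy as np

# Hypothetical intrinsics: 800 px focal length, principal point at the
# center of a 1280x720 image. Real values come from calibration.
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])

def pixel_to_ray(K, u, v):
    """Back-project pixel (u, v) to a unit direction in the camera frame."""
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return d / np.linalg.norm(d)

# E.g., a detected pedestrian's feet at pixel (640, 500): intersecting
# this ray with the ground plane would give the pedestrian's distance.
ray = pixel_to_ray(K, 640.0, 500.0)
```

Without `K`, a pixel is just a pixel; with `K`, it becomes a direction in space that you can reason about metrically.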

Calculating the alignment from a set of corresponding points is not an unreasonable approach. The primary disadvantage is that you need a way to identify a set of corresponding points in the first place. This typically requires either (a) explicit calibration or (b) heuristic methods.
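Once you have correspondences, the "simply calculate F" step from the question is standard. Here is an illustrative NumPy sketch of the classic normalized eight-point algorithm (a real pipeline would wrap this in RANSAC to handle outliers; this version assumes clean matches):

```python
import numpy as np

def estimate_fundamental(pts1, pts2):
    """Normalized eight-point algorithm: estimate F with x2^T F x1 = 0.
    pts1, pts2: (N, 2) arrays of corresponding pixel coordinates, N >= 8."""
    def normalize(pts):
        # Translate centroid to origin, scale mean distance to sqrt(2).
        mean = pts.mean(axis=0)
        scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
        T = np.array([[scale, 0.0, -scale * mean[0]],
                      [0.0, scale, -scale * mean[1]],
                      [0.0, 0.0, 1.0]])
        homog = np.column_stack([pts, np.ones(len(pts))])
        return (T @ homog.T).T, T

    x1, T1 = normalize(pts1)
    x2, T2 = normalize(pts2)
    # Each correspondence contributes one row of the linear system A f = 0.
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2: a valid fundamental matrix is singular.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalization and fix the (arbitrary) scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

Note that no intrinsics appear anywhere: F is estimated purely from pixel correspondences, which is exactly why it suffices for rectification but not for metric 3D reconstruction.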

Explicit calibration is a perfectly fine approach if it is feasible for your application. However, for some applications we only have the images after the fact, with no way to perform an explicit calibration, so it's not always an option.

Heuristic methods are another possibility. They try to find corresponding points automatically, e.g., via feature matching or optical-flow methods such as Lucas-Kanade. Being heuristics, they are not guaranteed to work: sometimes they fail to find corresponding points, or they produce false matches. Another issue is that their effectiveness depends on the characteristics of the scene you're capturing; they work better for some kinds of scenes than others. But this, too, can be a valid approach.
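As a toy illustration of the heuristic idea (real systems use feature descriptors or pyramidal Lucas-Kanade, not this brute-force search), exhaustive SSD patch matching finds, for a point in one image, the most similar patch in the other:

```python
import numpy as np

def match_patch(img1, img2, x, y, patch=5, search=20):
    """Find the pixel in img2 whose (2*patch+1)^2 neighborhood best matches
    the neighborhood of (x, y) in img1, by exhaustive SSD search.
    Returns the best (x, y) in img2. A toy stand-in for real matching."""
    p = img1[y - patch:y + patch + 1, x - patch:x + patch + 1]
    best, best_xy = np.inf, (x, y)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            q = img2[yy - patch:yy + patch + 1, xx - patch:xx + patch + 1]
            if q.shape != p.shape:
                continue  # candidate window ran off the image
            ssd = np.sum((p - q) ** 2)
            if ssd < best:
                best, best_xy = ssd, (xx, yy)
    return best_xy
```

This also makes the failure modes tangible: on a textureless wall every candidate patch scores similarly, and on a repetitive pattern the wrong copy can win, which is exactly the scene-dependence mentioned above.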

To be clear: 3D reconstruction requires two cameras (as mentioned in the question) or depth information; a single 2D camera plus its intrinsics is not enough to reconstruct where each object is in the real world, in 3D space.
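To make that last point concrete: with two calibrated cameras, i.e., known projection matrices P1 and P2 (each of the form K[R|t]), a matched pair of image points determines the 3D point by linear (DLT) triangulation. A sketch, with the camera matrices as given inputs:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: recover the 3D point X from its
    projections x1 and x2 under 3x4 camera matrices P1 and P2.
    Each image point contributes two rows of the homogeneous system A X = 0."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

Note that building P1 and P2 requires the intrinsics K (and the relative pose): this is precisely the step that the fundamental matrix alone cannot give you.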