Introduction

Goal:

Our goal is to develop an
efficient method for updating the 3D model of a moving object. In practice,
scenes seldom change abruptly, except in a few situations such as a sudden
change in lighting conditions. The background of a scene is largely static, and
the foreground changes only slightly between subsequent frames. Typically, the
whole scene changes slowly over a long sequence of frames. We exploit this
observation to reduce the reconstruction time.

We do not perform a full
reconstruction from every corresponding set of frames. Instead, the previously
constructed model is updated with the current 3D information. The points that
exhibit an intensity change between the previous and current sets of frames are
the ones that affect the current temporal 3D model. Therefore, we use only this
small subset of points to estimate the changes in the new model.

Methodology:

A combination of a feature-based
method and a direct method is used for the initial reconstruction of the 3D
model. The direct method gives the optical flow from a pair of images in
a sequence. This information is used to determine whether there is a difference
in the displacement of feature points between the two images. If the difference
is large, the second image is used in the feature-based method to construct the
3D model. Otherwise, the second frame is ignored.
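The selection logic above can be sketched as follows. This is a minimal illustration, not our implementation: the frame-difference stand-in for optical flow, the threshold values, and the function names are all assumptions.

```python
import numpy as np

def optical_flow(prev, curr):
    # Stand-in for the direct method: per-pixel intensity change.
    # (A real implementation would estimate displacement vectors.)
    return np.abs(curr.astype(float) - prev.astype(float))

def frame_is_informative(flow, threshold=10.0, fraction=1 / 3):
    # Keep the frame only if enough of the image changed noticeably.
    return np.mean(flow > threshold) > fraction

# Three synthetic frames: the second is identical to the first,
# the third changes everywhere.
frames = [np.zeros((4, 4)), np.zeros((4, 4)), np.full((4, 4), 50.0)]

selected = [frames[0]]
for prev, curr in zip(frames, frames[1:]):
    if frame_is_informative(optical_flow(prev, curr)):
        selected.append(curr)   # goes on to the feature-based method

# Only the frame that actually changed survives the filter.
```

The redundant middle frame is skipped, so the feature-based stage sees only frames that carry new information.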

Structure From Motion

We adopt a structure-from-motion
algorithm to recover the 3D model from relatively long image sequences. The
image sequences are generated from a video stream captured by moving the object
of interest in front of a desktop camera. By contrast, structure-from-stereo,
or the stereo-vision approach, makes use of projective invariants from two (or
three) images taken from different viewpoints.

Feature-Based Method

**KLT Tracker:**

Feature correspondences are the
locations of the same features across different images. Using the feature
correspondences, epipolar constraints under projective invariance can be
written over the image sequence to give a system of equations. Solving this
system of equations recovers camera motion and structure simultaneously.

The accuracy of such
feature-based algorithms depends on the accuracy of the feature tracker.
We use the robust and reliable Kanade-Lucas-Tomasi (KLT) tracker to identify
and match features from frame to frame in the long sequences of images.

**Factorization Method:**

The output of the KLT tracker
is the input to the factorization method proposed by Tomasi and Kanade for the
recovery of shape and motion from a sequence of images. The method achieves its
accuracy and robustness by applying the well-understood Singular Value
Decomposition (SVD) to a large number of images and feature points.

The camera projection used is the
orthographic model: a 3D point s_p projects into frame f at image coordinates
(u_fp, v_fp) = (i_f . (s_p - t_f), j_f . (s_p - t_f)), where i_f and j_f are
the frame's image-plane axes and t_f is the camera position.
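Under orthographic projection, the registered measurement matrix of tracked feature coordinates has rank at most three, and the factorization method recovers motion and shape from its rank-3 SVD. The following NumPy sketch illustrates the idea on noise-free synthetic data; the camera rotations and problem sizes are illustrative assumptions, and the metric-upgrade step is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: P random 3D points observed in F frames under
# orthographic projection.
F, P = 6, 20
S_true = rng.standard_normal((3, P))               # true shape: 3 x P
W_rows = []
for f in range(F):
    angle = 0.1 * f
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0]])   # 2x3 orthographic camera
    W_rows.append(R @ S_true)
W = np.vstack(W_rows)                              # 2F x P measurement matrix

# Factorization: register to the centroid, then a rank-3 SVD gives
# motion M and shape S up to an affine ambiguity.
W_reg = W - W.mean(axis=1, keepdims=True)
U, d, Vt = np.linalg.svd(W_reg, full_matrices=False)
M = U[:, :3] * np.sqrt(d[:3])                      # 2F x 3 camera motion
S = np.sqrt(d[:3])[:, None] * Vt[:3]               # 3 x P shape

# With noise-free orthographic data, W_reg has rank 3 exactly,
# so the truncated product reproduces it.
assert np.allclose(M @ S, W_reg, atol=1e-8)
```

With real, noisy tracks the rank-3 truncation acts as a least-squares fit, which is where the method's robustness comes from.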

Direct-Based Method

**Optical Flow:**

Optical flow refers to the
apparent motion of the image intensity or brightness pattern over time. The
motion field can be thought of as the projection of the 3D velocity field onto
the image plane.
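The brightness-constancy assumption behind optical flow, I(x, y, t) = I(x + u, y + v, t + 1), linearizes to Ix*u + Iy*v + It = 0; solving this constraint in the least-squares sense over a window is the Lucas-Kanade estimator. The sketch below applies it to one synthetic patch; the pattern and the single whole-patch window are illustrative assumptions.

```python
import numpy as np

# A smooth periodic pattern shifted by exactly one pixel in x.
n = 32
t = np.arange(n)
I0 = np.sin(2 * np.pi * t / n)[None, :] + np.cos(2 * np.pi * t / n)[:, None]
I1 = np.roll(I0, 1, axis=1)            # true motion: u = 1, v = 0

# Central-difference spatial gradients and the temporal derivative.
Ix = (np.roll(I0, -1, axis=1) - np.roll(I0, 1, axis=1)) / 2.0
Iy = (np.roll(I0, -1, axis=0) - np.roll(I0, 1, axis=0)) / 2.0
It = I1 - I0

# Lucas-Kanade: solve [Ix Iy] [u v]^T = -It in the least-squares
# sense, here over the whole patch as a single window.
A = np.column_stack([Ix.ravel(), Iy.ravel()])
b = -It.ravel()
(u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
```

For this periodic pattern the estimate recovers the one-pixel shift essentially exactly; on real images the same solve is done per small window.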

Updating the 3D model
(Contribution)

In this paper, we propose an
efficient approach that exploits the similarity between subsequent sets of
frames. Because consecutive sets of frames are largely similar, the 3D
reconstruction from the current set of frames is almost identical to the
previous reconstruction.

Heuristically, we assume that
most of the estimated 3D points in the new model will be the same and that only
a few of them will change. The previous model is therefore carried over to the
current time step.

Only the 2D points exhibiting
an intensity change in the new set of frames are used to amend the new 3D model.

We extend the two-stage algorithm
discussed in Section II to update the depth information during subsequent
modeling rather than running the full reconstruction all over again. This
method reconstructs the 3D model efficiently for each timeframe without
compromising the quality of the model.
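The carry-over-and-amend step can be sketched as below; the interface and the per-point intensity-change measure are assumptions for illustration, not the exact implementation.

```python
import numpy as np

def update_model(prev_points3d, new_points3d, intensity_change, threshold=5.0):
    # Carry the previous 3D model over to the current time, then
    # re-estimate only the points whose image intensity changed.
    # prev_points3d, new_points3d: (N, 3) arrays of 3D estimates.
    # intensity_change: (N,) per-feature intensity differences.
    updated = prev_points3d.copy()
    changed = intensity_change > threshold   # typically a small subset
    updated[changed] = new_points3d[changed]
    return updated, changed

prev = np.zeros((4, 3))
new = np.ones((4, 3))
change = np.array([0.0, 10.0, 0.0, 10.0])    # only points 1 and 3 changed
model, mask = update_model(prev, new, change)
```

Because only the changed rows are touched, the cost of each update scales with the number of changed points rather than the size of the model.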

**Stage 1: Direct-Based Method**

After selecting the feature
points in the first frame, we start a loop in which we compute the optical flow
for each consecutive image pair.

For the video stream at time t1, we compute the optical flow between the first
frame F1 and the second frame F2 of the sequence, resulting in an optical-flow
matrix, say OF.

We compare each entry of the
matrix OF with a threshold and count the number of feature points whose optical
flow exceeds it. If this count is more than a third of the feature points, the
frame is taken into account in the factorization method. Otherwise, the frame
is simply ignored.
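The Stage-1 rule reads, in code; the threshold value and the (N, 2)-displacement layout of OF are assumptions for illustration.

```python
import numpy as np

def keep_frame(OF, threshold=1.0):
    # OF: (N, 2) optical-flow displacement per tracked feature point.
    # Count how many features moved farther than the threshold; keep
    # the frame when more than a third of them did.
    magnitudes = np.linalg.norm(OF, axis=1)
    moved = np.count_nonzero(magnitudes > threshold)
    return moved > len(OF) / 3.0

still = np.array([[0.1, 0.0], [0.0, 0.2], [0.1, 0.1]])    # rejected
moving = np.array([[2.0, 0.0], [0.0, 3.0], [0.1, 0.1]])   # kept
```

The one-third fraction trades off reconstruction cost against model freshness: a lower fraction keeps more frames and approaches full reconstruction.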

**Stage 2: Feature-Based Method**

We assume that the number of
features extracted and the number of frames considered for feature
correspondences remain the same across all iterations.

The factorization method is then applied to the frames selected using the
optical flow, as discussed in Stage 1.

Results Part 2