Guide for the Impatient
If you just want to use the tracker, you still need to know three important facts:
- The tracker has an internal state, i.e. it holds pointers to the last images. So do not remove or overwrite images or gradient images before tracking is finished! It also holds an internal list of tracked points. It is hence not necessary to call a point manipulation function in every cycle.
- You need a corner detector, e.g. CornerDetectorKLT, for the initial corner detection.
- The function Tracker::Track() must always be called before any point manipulation function.
Here is some pseudo code:
CornerDetectorKLT<StorageType, CalculationType> cd;
Tracker<StorageType, CalculationType> tr(TT_Simple);
Vector<int> hws; hws.newsize(5);
hws[4] =hws[3] = hws[2] =hws[1] =hws[0] = 10;
tr.SetHalfWinSize(hws);
tr.SetRejectionType(-1);
tr.SetAffineBrightnessInvariance(true);
PyramidImage<CalculationType> pyim[2], gx[2], gy[2];
int act=0;
vector<HomgPoint2D> mp;
vector<QUAL> qual;
tr.PreparePyramide(im, pyim[act], gx[act], gy[act]);
cd.Detect(*(gx[act])[0], *(gy[act])[0], mp, qual);
cout << "initially found "<<mp.size()<<" points\n";
tr.ReplaceAllPoints(mp, qual);
for every image do:
1. tr.PreparePyramide(im, pyim[act], gx[act], gy[act]);
2. tr.Track(pyim[act], gx[act], gy[act]);
3. If first image then tr.ReplaceAllPoints(mp, qual);
If points need to be filled up then tr.FillUpPoints(mp, qual);
4. tr.GetPoints(mp);
5. act=!act;
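The reason for the two-element arrays and the act=!act toggle can be illustrated with a small self-contained sketch. MockTracker and RunSequence are hypothetical stand-ins and not part of the library; the sketch only assumes, as stated above, that the tracker keeps a pointer to the previous image rather than a copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a tracker that remembers the previous image
// by pointer (no copy), as described for the real Tracker.
struct MockTracker {
  const std::vector<float>* last = nullptr;

  // Returns true if a previous image was available to track against.
  bool Track(const std::vector<float>& current) {
    const bool hadPrevious = (last != nullptr);
    last = &current;  // only the address is stored
    return hadPrevious;
  }
};

// Feeds all frames through the mock tracker using two alternating buffers
// and returns the number of frame pairs that could be tracked.
int RunSequence(const std::vector<std::vector<float>>& frames) {
  MockTracker tr;
  std::vector<float> buffer[2];
  int act = 0;
  int tracked = 0;
  for (const auto& frame : frames) {
    buffer[act] = frame;              // overwrites only the older buffer
    assert(tr.last != &buffer[act]);  // the previous image is still intact
    if (tr.Track(buffer[act])) ++tracked;
    act = !act;                       // swap roles for the next cycle
  }
  return tracked;
}
```

With a single buffer, the assignment at the top of the loop would overwrite the image the tracker still points to, which is exactly the situation the first fact warns about.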
Point Manipulation Functions
A number of functions exist for manipulating the internal data structures. The most commonly used are
- AddPoints() to make the tracker initially aware of the points to track, and
- FillUpPoints() to replace the points for which tracking was not successful.
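The idea behind filling up can be sketched without the library. Point2D and FillUpSlots below are hypothetical names, and the slot-based layout is an assumption made for illustration; the actual FillUpPoints() implementation may differ. The sketch replaces every point whose result code is negative with a fresh candidate, e.g. a newly detected corner.

```cpp
#include <cstddef>
#include <vector>

struct Point2D { double x, y; };  // hypothetical stand-in for HomgPoint2D

// Replaces every slot whose result code is negative (tracking failed) with
// the next unused candidate and marks it as valid (0).
// Returns the number of slots that were refilled.
std::size_t FillUpSlots(std::vector<Point2D>& points,
                        std::vector<int>& results,
                        const std::vector<Point2D>& candidates) {
  std::size_t next = 0;
  for (std::size_t i = 0;
       i < points.size() && i < results.size() && next < candidates.size();
       ++i) {
    if (results[i] < 0) {
      points[i] = candidates[next++];
      results[i] = 0;
    }
  }
  return next;
}
```

If fewer candidates than failed points are available, the remaining slots keep their failure code and can be refilled in a later cycle.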
Tracker Output
A number of results can be accessed and used to assess the correspondences:
- GetNumIter() : Number of iterations used in the biggest pyramid level.
- GetError() : Displacement from the last iteration in the biggest pyramid level.
- GetResiduiMAD() : Mean absolute grey value difference (per pixel) between the image point and the newly found correspondence.
- GetResiduiMSD() : Mean squared grey value difference (per pixel) between the image point and the newly found correspondence.
- GetResults() : Integer indicating success (=0) or the reason for failure (<0) of the tracking process.
- 0 : Success
- -1 : Structure tensor could not be inverted, usually resulting from an attempt to track a homogeneous image patch.
- -2 : Point slid out of image.
- -3 : Maximum number of iterations has been reached and the displacement calculated in the iteration steps never dropped below MaxError.
- -4 : An input point lies outside of ROI.
- -5 : The mean absolute grey value difference between the point and the newly found correspondence is above MaxResiduumMAD.
- -6 : An input point lies at infinity.
- -7 : The X84/X84M outlier rejection rule classified this point as an outlier.
Tracker Parameters
- TrackerType : The tracking algorithm used. Currently only TT_Simple and TT_Weighted are supported. TT_Simple is better tested.
- RejectionType : The outlier rejection which is used after the tracking. Currently only RT_MAD and RT_X84/X84M are supported.
- GradientType : The gradient class which is used to calculate the image gradients in PreparePyramide(). See AddGradientType() in FilterBase.hh for details.
- LowPassType : The low pass class which is used to filter the images in PreparePyramide(). See AddLowPassType() in FilterBase.hh for details.
- PyIndex : Tracking starts with the smallest image in the pyramid and successively uses bigger images up to PyIndex. If PyIndex==0, all images in the pyramid are used. This parameter is not very well tested.
- HalfWinSize : Vector determining the tracking window size for every pyramid level. The first entry corresponds to the biggest image. If the vector is longer than the number of pyramid levels, then only the first entries are used.
- MaxError : Vector containing the error boundary at which the iteration is aborted for each pyramid level. The iteration is aborted when the displacement resulting from the linearization at the current iteration step falls below MaxError.
- MaxIter : Vector containing the maximum number of allowed iterations for each pyramid level. The iteration is aborted if MaxIter iterations are reached.
- MinBorderDist : The minimum distance of points from the image border in the biggest pyramid level. It is only used in the feature manipulation functions. If set smartly, it can save unnecessary computations for image points which cannot be tracked due to border problems in the smaller pyramid images.
- MaxFeatures : Maximum number of features which are tracked. This parameter determines the size of the internal memory.
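One way to pick MinBorderDist is to require that the tracking window fits into every pyramid level. The sketch below is an illustration under a loud assumption: each pyramid level halves the resolution of the previous one, so a point at distance x from the border in the biggest image lies at distance x / 2^k in level k. SuggestMinBorderDist is a hypothetical helper, not a library function. It also mirrors the HalfWinSize rule that surplus vector entries are ignored.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Smallest border distance (in pixels of the biggest image) at which the
// tracking window still fits inside every pyramid level.
// ASSUMPTION: level k is downsampled by a factor of 2^k relative to level 0.
int SuggestMinBorderDist(const std::vector<int>& halfWinSize,
                         std::size_t pyramidLevels) {
  // Entries beyond the number of pyramid levels are ignored.
  const std::size_t n = std::min(halfWinSize.size(), pyramidLevels);
  int dist = 0;
  for (std::size_t k = 0; k < n; ++k) {
    // The window needs halfWinSize[k] pixels in level k, i.e.
    // halfWinSize[k] * 2^k pixels in the biggest image.
    dist = std::max(dist, halfWinSize[k] << k);
  }
  return dist;
}
```

Under this assumption, a half window size of 10 on all five levels would need a border distance of 160 pixels in the biggest image, which shows how quickly the coarse levels dominate.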
Class Hierarchy
Two levels of tracking classes exist:
- The base classes just compute correspondences between two images without internal state and pyramid image usage.
- The higher level class Tracker uses an instance of a base class to compute correspondences. It has an internal state and makes use of pyramid images.
TrackerBase
The TrackerBase computes image point correspondences between two images. It does not have an internal state and does not use pyramid images. Derived from TrackerBaseInterface are TrackerBaseSimple (the standard tracker), TrackerBaseWeighted (decreasing weight in outer patch regions) and TrackerBaseAffine (affine or similar warp of the whole patch). It can handle different outlier rejection algorithms based on parametrization:
- Rejection:
- RT_MAD: A simple outlier rejection algorithm based on the mean absolute grey value difference between the two images in the tracking window. A point correspondence is rejected if the MAD residuum is above a user given (parameter) threshold.
- RT_X84: The X84 outlier rejection rule as suggested by Tommasini, Fusiello and Roberto, "Robust Feature Tracking", CVPR 1998. It calculates the median of the mean squared grey value differences (MSD) (this is similar to calculating the mean residuum) and the median absolute difference of the MSD values from that median (this is similar to calculating the variance or the standard deviation). Point correspondences are rejected if their MSD value lies above the median by more than a fixed multiple of this median difference (5.2 in the X84 rule as published). In contrast to the implementation of Tommasini, the mean squared residuals are used instead of the sum of squared grey value differences.
- RT_X84M: Same as RT_X84, but all features that have a sum of absolute grey value differences <= MAD residuum are accepted automatically!
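The X84 rule can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: Median and RejectX84 are hypothetical helpers, and the factor k = 5.2 is the constant from the published X84 rule, which this implementation may or may not use.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a non-empty list. Takes the vector by value because
// std::nth_element reorders its input.
double Median(std::vector<double> v) {
  assert(!v.empty());
  const std::size_t mid = v.size() / 2;
  std::nth_element(v.begin(), v.begin() + mid, v.end());
  double m = v[mid];
  if (v.size() % 2 == 0) {
    // Even count: average the two central values.
    const double lower = *std::max_element(v.begin(), v.begin() + mid);
    m = 0.5 * (m + lower);
  }
  return m;
}

// Flags every correspondence whose MSD residual lies more than k median
// deviations above the median of all MSD residuals.
// ASSUMPTION: k = 5.2 as in the published X84 rule.
std::vector<bool> RejectX84(const std::vector<double>& msd, double k = 5.2) {
  const double med = Median(msd);
  std::vector<double> deviation;
  deviation.reserve(msd.size());
  for (double r : msd) deviation.push_back(std::fabs(r - med));
  const double medianDeviation = Median(deviation);
  const double threshold = med + k * medianDeviation;

  std::vector<bool> rejected(msd.size());
  for (std::size_t i = 0; i < msd.size(); ++i) {
    rejected[i] = msd[i] > threshold;
  }
  return rejected;
}
```

Because both the location and the spread are estimated with medians, a few very bad correspondences do not shift the threshold the way they would with a mean and standard deviation.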
Tracker Class
This is a convenience class for the tracking functionality. It is very easy to use, even though two pitfalls exist:
- The tracker has an internal state, i.e. it holds pointers to the last images. So do not remove or overwrite images or gradient images before tracking is finished! It also holds an internal list of tracked points. It is hence not necessary to call a point manipulation function in every cycle.
- The function Tracker::Track() must always be called before any point manipulation function.
See the examples in the subdirectory Matcher2D/Examples for how to use the tracker.
Generic Image Alignment
The tracker implements gradient-based image alignment according to an additive model, i.e. a parameter prediction is "tested" in an analysis-by-synthesis fashion and the remaining grey value difference is exploited to compute an additive update onto the parameter vector. For warps more complicated than a displacement (e.g. affine) it is much more efficient to use inverse compositional alignment to track a patch through a sequence. The class BIAS::ImageAlignment is a generic implementation to compute arbitrary (groupwise) warps based on image differences, e.g. a homography or affine transform.
Instead of using a pyramid, it is possible to provide an uncertainty of the predicted (homography, displacement, ...) parameters, such that automatic smoothing is done. This is particularly interesting e.g. for rotational warps. The generic alignment is only a reference implementation and is not optimized for speed. For fast affine tracking through an image sequence, a specialized implementation should be considered.
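The additive model can be illustrated with a one-dimensional toy problem in which the only parameter is a displacement d. The sketch below is not the library's implementation: AlignAdditive, Bump and ShiftedBump are hypothetical names, the images are analytic functions instead of pixel arrays, and the gradient is taken numerically. It does follow the scheme described above: warp with the current prediction, compare grey values, solve the linearized problem, and add the update to the parameter. The return values reuse the meaning of the result codes 0, -1 and -3.

```cpp
#include <cmath>
#include <functional>

// Hypothetical result codes, mirroring 0, -1 and -3 of GetResults().
enum AlignResult { ALIGN_OK = 0, ALIGN_SINGULAR = -1, ALIGN_MAX_ITER = -3 };

// Example signals: a smooth bump and the same bump shifted by 1.5.
double Bump(double x) { return std::exp(-(x - 20.0) * (x - 20.0) / 50.0); }
double ShiftedBump(double x) { return Bump(x - 1.5); }

// Estimates d such that img2(x + d) matches img1(x) inside the window
// [center - halfWin, center + halfWin]. d holds the prediction on entry
// and the refined estimate on exit.
AlignResult AlignAdditive(const std::function<double(double)>& img1,
                          const std::function<double(double)>& img2,
                          double center, int halfWin, int maxIter,
                          double maxError, double& d) {
  const double h = 1e-3;  // step of the central difference gradient
  for (int it = 0; it < maxIter; ++it) {
    double gg = 0.0;  // 1-D structure tensor: sum of squared gradients
    double gr = 0.0;  // gradient-weighted grey value differences
    for (int k = -halfWin; k <= halfWin; ++k) {
      const double x = center + k;
      const double warped = img2(x + d);  // synthesis with current d
      const double g = (img2(x + d + h) - img2(x + d - h)) / (2.0 * h);
      const double r = img1(x) - warped;  // remaining difference
      gg += g * g;
      gr += g * r;
    }
    if (gg < 1e-12) return ALIGN_SINGULAR;  // homogeneous patch
    const double delta = gr / gg;           // linearized least squares step
    d += delta;                             // additive parameter update
    if (std::fabs(delta) < maxError) return ALIGN_OK;
  }
  return ALIGN_MAX_ITER;
}
```

Calling AlignAdditive(Bump, ShiftedBump, 20.0, 10, 20, 1e-6, d) with d initialized to 0 should recover a displacement close to 1.5. In two dimensions the scalar gg becomes the 2x2 structure tensor, and the division becomes the matrix inversion that fails for homogeneous patches.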