
Evaluation metrics for the object tracking project

Each group should produce code for automatic evaluation of the system performance on the CAVIAR datasets.
The following 4 evaluation metrics should be implemented:

  • Precision: Defined as sum(TP)/(sum(TP) + sum(FP)).
  • Recall: Defined as sum(TP)/(sum(TP) + sum(FN)).
  • Average TP overlap: Computed only over the true positives (with correct ID). Intersection-over-union is computed as described in the lecture.
  • Identity switches: An identity switch happens when the identity of the detection associated with a ground truth bounding box changes.
These definitions apply:
  • True Positive (TP): A detection that has at least 20% overlap with the associated ground truth bounding box.
  • False Positive (FP): A detection that has less than 20% overlap with the associated ground truth bounding box, or that has no associated ground truth bounding box.
  • False Negative (FN): A ground truth bounding box that has no associated detection, or for which the associated detection overlaps by less than 20%.
These scores are computed over a whole sequence, such that sum(TP), sum(FP) and sum(FN) are the total counts over the sequence.
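The definitions above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes that "overlap" means the standard intersection-over-union, that boxes are given as (ul_x, ul_y, width, height), and that the association between detections and ground truth boxes has already been computed elsewhere. The names `iou` and `sequence_scores`, and the input layout, are hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (ul_x, ul_y, width, height)
Pair = Tuple[Optional[Box], Optional[Box]]  # (ground truth, associated detection)


def iou(a: Box, b: Box) -> float:
    """Standard intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def sequence_scores(frames: List[List[Pair]], threshold: float = 0.2):
    """Accumulate TP, FP and FN over a whole sequence.

    `frames` holds, per frame, a list of (gt_box, det_box) pairs.
    Use None for a ground truth box without a detection, or for a
    detection without a ground truth box (assumed input layout).
    """
    tp = fp = fn = 0
    overlap_sum = 0.0
    for pairs in frames:
        for gt, det in pairs:
            if gt is not None and det is not None:
                overlap = iou(gt, det)
                if overlap >= threshold:
                    tp += 1
                    overlap_sum += overlap
                else:
                    # Too little overlap: the detection is a false
                    # positive and the ground truth box is missed.
                    fp += 1
                    fn += 1
            elif det is not None:
                fp += 1  # detection with no associated ground truth
            elif gt is not None:
                fn += 1  # ground truth with no associated detection
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    avg_tp_overlap = overlap_sum / tp if tp else 0.0
    return precision, recall, avg_tp_overlap
```

Returning 0.0 when a denominator is zero is a choice made for this sketch; the definitions above do not say how to handle that case.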

Note that each ground truth bounding box can have at most one associated object ID for each frame. When the object ID changes, an identity switch takes place.
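Counting identity switches could look roughly like the sketch below. It assumes the per-frame associations are available as (framenumber, ground truth ID, object ID) records, with None as the object ID when the ground truth box has no associated detection. It also assumes that frames without a detection neither count as a switch nor reset the remembered ID; the text above does not specify this, so adjust it to match the lecture. The function name `count_identity_switches` is hypothetical.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple

Record = Tuple[int, Hashable, Optional[Hashable]]  # (frame, gt_id, det_id)


def count_identity_switches(associations: Iterable[Record]) -> int:
    """Count how often the object ID associated with a ground truth changes."""
    last_id: Dict[Hashable, Hashable] = {}
    switches = 0
    # Process records in frame order; the sort is stable, so records
    # within one frame keep their input order.
    for _frame, gt_id, det_id in sorted(associations, key=lambda rec: rec[0]):
        if det_id is None:
            continue  # assumption: a missed frame is not a switch
        previous = last_id.get(gt_id)
        if previous is not None and previous != det_id:
            switches += 1
        last_id[gt_id] = det_id
    return switches
```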

Sequences for evaluation

Each group is expected to test their system extensively on the suggested datasets. We will also ask you to report your scores on a benchmark. In earlier years, we used three sequences from the CAVIAR dataset as the benchmark.

As the benchmark, we will use the three static-camera sequences from MOT17. They can be found in the LiU GitLab repository git@gitlab.liu.se:cvl/tsbb15.git (in the folder sequences/) or at the MOT website.

Note: These sequences have fairly high resolution, so you may want to downsample them initially. If you do so, make sure to upsample the output bounding box coordinates to undo your downsampling (it is very easy to introduce a small shift here if you are not careful).
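As a minimal sketch, undoing a uniform downsampling by a known factor could look like this. It assumes that coordinates scale purely multiplicatively, which holds when the image origin is unchanged by the resize. If your resize routine aligns pixel centres differently, an extra sub-pixel offset may be needed, which is exactly the kind of shift the note warns about. The function name `upscale_box` and its parameters are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (ul_x, ul_y, width, height)


def upscale_box(box: Box, factor: float) -> Box:
    """Map a box from the downsampled image back to full resolution.

    `factor` is full-resolution size divided by downsampled size,
    e.g. 2.0 if the frames were halved in each dimension.
    Assumption: pure scaling, with no offset from the resize method.
    """
    ul_x, ul_y, width, height = box
    return (ul_x * factor, ul_y * factor, width * factor, height * factor)
```

A quick sanity check is to draw a few upscaled boxes on the full-resolution frames and confirm that they line up with the objects.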

External evaluation from CSV-files

Furthermore, each group is expected to produce, for the benchmark sequences, a simple file containing the output of their system. Use the .csv file format, where the file consists of a series of lines, each structured in the following way:

framenumber, objectID, ul_x, ul_y, width, height

Here, framenumber starts at zero, and ul_x and ul_y are the x and y coordinates of the upper-left corner of the object bounding box. The origin of the coordinate system is (0,0) in the top-left pixel of the image, and the x and y axes point right and down, respectively.
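One way to produce lines in this format is sketched below, using Python's standard csv module. It assumes that no header line is wanted and that fields are separated by a bare comma; neither is stated explicitly above, so check against the evaluation script if in doubt. The function name `tracks_to_csv` is hypothetical.

```python
import csv
import io
from typing import Iterable, Tuple

Row = Tuple[int, int, float, float, float, float]
# (framenumber, objectID, ul_x, ul_y, width, height)


def tracks_to_csv(rows: Iterable[Row]) -> str:
    """Serialise tracker output, one bounding box per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Sort by frame number, then object ID, for a stable, readable file.
    for frame, obj_id, ul_x, ul_y, width, height in sorted(
        rows, key=lambda row: (row[0], row[1])
    ):
        writer.writerow([int(frame), int(obj_id), ul_x, ul_y, width, height])
    return buffer.getvalue()
```

The returned string can then be written to a file with a plain `open(path, "w", newline="")` followed by `write`.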

Create a folder named Evaluation in the root of your repository. In this folder, add a .csv file for each evaluation sequence. Use the naming convention <sequence_name>.csv, e.g. 02.csv for sequence 02 in the automatic evaluation. Once you have created the .csv files for the first time, send an email to the examiner with a link to your repository.

Automatic evaluation results


Last updated: 2022-03-16