Tracking-by-regression is a new paradigm for online Multi-Object Tracking (MOT). It unifies detection and tracking in a single network by associating targets through regression, which significantly reduces the complexity of data association. However, because features are contaminated by nearby occlusions and distractors, the regression is fragile and unaware of inter-object occlusions and intra-class distractors. As a result, regressed bounding boxes can be wrongly suppressed or can easily drift. Meanwhile, the commonly used bounding-box-based post-processing cannot remedy the false negatives and false assignments caused by regression. To address these challenges, we propose using regression tubes as input to the regression-based tracker, which provides spatial-temporal information that enhances tracking performance. Specifically, we propose a novel tube re-localization strategy that obtains robust regressions and recovers missed targets. We also propose a tube-based NMS (T-NMS) strategy that manages regressions at the tube level; it includes a tube IoU (T-IoU) scheme for measuring positional relations and tube re-scoring (T-RS) for evaluating the quality of candidate tubes. Finally, a tube re-assignment strategy provides robust cost measurement and revises false assignments using motion cues. We evaluate our method on the MOT16, MOT17, and MOT20 benchmarks. The results show that our method significantly improves on the baseline, mitigates the challenges of the regression-based tracker, and achieves very competitive tracking performance.
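
To make the idea of tube-level suppression concrete, the following is a minimal sketch, not the method from the article. The abstract does not define T-IoU or T-RS, so the sketch makes these assumptions: a tube is a mapping from frame index to a box `(x1, y1, x2, y2)`; `tube_iou` is a generic spatio-temporal IoU (summed per-frame intersection over summed per-frame union); scores are supplied by the caller, so re-scoring is not modeled; and `tube_nms` is plain greedy NMS. All names and the `threshold` default are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout
Tube = Dict[int, Box]                    # frame index -> box, assumed layout


def box_area(b: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def box_intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def tube_iou(a: Tube, b: Tube) -> float:
    """Assumed tube IoU: summed intersection over summed union across frames.

    A frame present in only one tube adds that box's area to the union only.
    """
    inter = 0.0
    union = 0.0
    for frame in set(a) | set(b):
        if frame in a and frame in b:
            i = box_intersection(a[frame], b[frame])
            inter += i
            union += box_area(a[frame]) + box_area(b[frame]) - i
        else:
            union += box_area(a[frame] if frame in a else b[frame])
    return inter / union if union > 0 else 0.0


def tube_nms(tubes: List[Tube], scores: List[float],
             threshold: float = 0.5) -> List[int]:
    """Greedy NMS over tubes; returns kept indices, highest score first."""
    order = sorted(range(len(tubes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a tube only if it does not overlap any already-kept tube.
        if all(tube_iou(tubes[i], tubes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    t0 = {0: (0, 0, 10, 10), 1: (1, 0, 11, 10)}
    t1 = {0: (1, 0, 11, 10), 1: (2, 0, 12, 10)}      # near-duplicate of t0
    t2 = {0: (50, 50, 60, 60), 1: (51, 50, 61, 60)}  # separate target
    print(tube_nms([t0, t1, t2], [0.9, 0.8, 0.7]))   # [0, 2]
```

Measuring overlap over several frames rather than a single frame means that two targets that briefly coincide in one frame are less likely to suppress each other than under per-frame box NMS, which is the kind of occlusion case the abstract describes.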

Original language: English
Article number: 103586
Journal: Computer Vision and Image Understanding
Early online date: 15 Nov 2022
Publication status: Published - 31 Jan 2023

Bibliographical note

No research funders listed.
Data availability
No data was used for the research described in the article.

Keywords


  • Multi-object tracking
  • Regression tubes
  • Spatial-temporal information
  • Tracking-by-regression

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition


Title: Multi-object tracking with robust object regression and association
