MF-Net: A Multimodal Fusion Model for Fast Multi-object Tracking

Shirui Tian, Mingxing Duan, Jiayan Deng, Huizhang Luo, Yikun Hu

Research output: Contribution to journal, Article, peer-review


In multimodal multi-object tracking (MOT) based on point clouds and images, current research focuses predominantly on improving tracking accuracy and often neglects computational efficiency. Consequently, these models struggle to track well in scenarios that demand high real-time performance. To address this, this paper introduces MF-Net, a fast multi-object tracking model based on multimodal fusion. The model is divided into three primary modules: object detection, multimodal fusion, and trajectory matching. First, a 2D detector identifies objects in the image and computes their posterior estimates, and a 3D classification network extracts the foreground points of the objects from the point cloud. Next, a perspective projection module determines the transformation matrix and the minimum number of vertex pairs needed to map the coordinates of the foreground points onto a 2D plane. Based on this module, a Planar Gaussian Function (PGF) model is constructed to fit, from the foreground points, small and hard objects that were missed in the image, thus compensating for the limitations of 2D detectors and ensuring accuracy while reducing training time. Finally, trajectory matching is performed on the merged objects. The performance of MF-Net has been verified through extensive experiments on the publicly available KITTI and nuScenes datasets. Compared with existing competitive models, our algorithm demonstrates a substantial improvement in both detection and tracking performance, achieving satisfactory accuracy while showing superior real-time efficiency. The MF-Net source code is available at <uri></uri>.
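
The projection step described in the abstract can be illustrated with a minimal sketch, assuming the foreground points are already expressed in the camera frame and that a 3 x 4 pinhole projection matrix is available (as in KITTI-style calibration). The names `project_points`, `gaussian_box`, `P`, and `k` are hypothetical and do not come from the paper. The axis-aligned Gaussian box is a simplified stand-in for fitting a 2D region to projected points; it is not the paper's PGF model, whose details are not given here.

```python
import numpy as np

def project_points(points_xyz, P):
    """Project N x 3 camera-frame points to pixels with a 3 x 4 matrix P.

    Points with non-positive depth (behind the camera) are dropped.
    """
    pts = np.asarray(points_xyz, dtype=float)
    homo = np.hstack([pts, np.ones((pts.shape[0], 1))])  # N x 4 homogeneous
    uvw = homo @ np.asarray(P, dtype=float).T            # N x 3
    in_front = uvw[:, 2] > 0
    # Perspective divide: (u, v) = (x / w, y / w)
    return uvw[in_front, :2] / uvw[in_front, 2:3]

def gaussian_box(uv, k=2.0):
    """Hypothetical helper: box of +/- k standard deviations around the mean.

    Returns (u_min, v_min, u_max, v_max). This is an illustrative
    simplification, not the PGF model from the paper.
    """
    uv = np.asarray(uv, dtype=float)
    mu = uv.mean(axis=0)
    sigma = uv.std(axis=0)
    return np.concatenate([mu - k * sigma, mu + k * sigma])

# Assumed toy calibration: focal length 100 px, principal point (50, 40).
P_example = np.array([[100.0, 0.0, 50.0, 0.0],
                      [0.0, 100.0, 40.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0]])
foreground = np.array([[1.0, 2.0, 10.0],
                       [0.0, 0.0, 5.0],
                       [0.0, 0.0, -1.0]])  # last point lies behind the camera
pixels = project_points(foreground, P_example)
box = gaussian_box(pixels)
```

In this sketch the candidate 2D region is derived only from the statistics of the projected points, which is one simple way a point-cloud cue could propose an object that an image detector missed.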

Original language: English (US)
Pages (from-to): 1-14
Number of pages: 14
Journal: IEEE Transactions on Vehicular Technology
State: Accepted/In press - 2024
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Automotive Engineering
  • Aerospace Engineering
  • Computer Networks and Communications
  • Electrical and Electronic Engineering


Keywords

  • Computational modeling
  • Gaussian Function
  • Hidden Markov models
  • Mathematical models
  • Multi-object Tracking
  • Multimodal Fusion
  • Object Detection
  • Point cloud compression
  • Task analysis
  • Three-dimensional displays
  • Trajectory Matching

