DVLO4D: Deep Visual-Lidar Odometry with Sparse Spatial-temporal Fusion

Mengmeng Liu, Michael Yang, Jiuming Liu, Yunpeng Zhang, Jiangtao Li, Sander Oude Elberink, George Vosselman, Hao Cheng

Research output: Chapter in a published conference proceeding

Abstract

Visual-LiDAR odometry is a critical component of localization for autonomous systems, yet achieving high accuracy and strong robustness remains challenging. Traditional approaches commonly struggle with sensor misalignment, fail to fully exploit temporal information, and require extensive manual tuning to handle diverse sensor configurations. To address these problems, we introduce DVLO4D, a novel visual-LiDAR odometry framework that leverages sparse spatial-temporal fusion to enhance accuracy and robustness. DVLO4D incorporates three key innovations: (1) Sparse Query Fusion, which uses sparse LiDAR queries for effective multi-modal data fusion; (2) a Temporal Interaction and Update module that integrates temporally predicted positions with current-frame data, providing better initialization for pose estimation and enhancing the model's robustness against accumulated errors; and (3) a Temporal Clip Training strategy combined with a Collective Average Loss mechanism that aggregates losses across multiple frames, enabling global optimization and reducing scale drift over long sequences. Extensive experiments on the KITTI and Argoverse odometry datasets demonstrate the superiority of DVLO4D, which achieves state-of-the-art performance in both pose accuracy and robustness. Additionally, our method is highly efficient, with an inference time of 82 ms, making it suitable for real-time deployment.
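The Collective Average Loss described in (3) aggregates per-frame pose losses over a temporal clip before the optimization step. The sketch below illustrates one plausible way such a clip-level loss could be computed; it is an assumption for illustration only, not the paper's implementation: the function name collective_average_loss, the pose representation (translation vector plus unit quaternion), and the rotation weighting are all hypothetical.

import torch

def collective_average_loss(pred_poses, gt_poses, rot_weight=1.0):
    """Hypothetical clip-level loss: per-frame pose losses are averaged
    over all frames of a temporal clip, so one optimization step sees the
    whole sequence rather than a single frame pair.

    Each pose is a (translation, quaternion) pair: a (3,) tensor and a
    (4,) unit-quaternion tensor.
    """
    frame_losses = []
    for (t_pred, q_pred), (t_gt, q_gt) in zip(pred_poses, gt_poses):
        trans_err = torch.norm(t_pred - t_gt)                     # translation error
        rot_err = torch.norm(q_pred / torch.norm(q_pred) - q_gt)  # rotation error
        frame_losses.append(trans_err + rot_weight * rot_err)
    # Averaging across the clip is what makes the objective "collective":
    # gradients reflect drift accumulated over the sequence, not one step.
    return torch.stack(frame_losses).mean()

# Illustration: a 4-frame clip with random ground-truth poses.
gt = [(torch.randn(3), torch.nn.functional.normalize(torch.randn(4), dim=0))
      for _ in range(4)]
pred = [(t + 0.01 * torch.randn(3), q.clone()) for t, q in gt]
print(collective_average_loss(pred, gt))

Averaging rather than summing keeps the loss magnitude independent of clip length, which is consistent with the abstract's claim that clip-level aggregation enables global optimization over long sequences.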
Original language: English
Title of host publication: 2025 IEEE International Conference on Robotics and Automation, ICRA 2025
Editors: Christian Ott, Henny Admoni, Sven Behnke, Stjepan Bogdan, Aude Bolopion, Youngjin Choi, Fanny Ficuciello, Nicholas Gans, Clement Gosselin, Kensuke Harada, Erdal Kayacan, H. Jin Kim, Stefan Leutenegger, Zhe Liu, Perla Maiolino, Lino Marques, Takamitsu Matsubara, Anastasia Mavromatti, Mark Minor, Jason O'Kane, Hae Won Park, Hae-Won Park, Ioannis Rekleitis, Federico Renda, Elisa Ricci, Laurel D. Riek, Lorenzo Sabattini, Shaojie Shen, Yu Sun, Pierre-Brice Wieber, Katsu Yamane, Jingjin Yu
Place of Publication: U.S.A.
Publisher: IEEE
Pages: 9740-9747
Number of pages: 8
ISBN (Electronic): 9798331541392
Publication status: Published - 23 May 2025
Event: 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, United States
Duration: 19 May 2025 – 23 May 2025

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (Print): 1050-4729

Conference

Conference: 2025 IEEE International Conference on Robotics and Automation (ICRA)
Country/Territory: United States
City: Atlanta
Period: 19/05/25 – 23/05/25

Funding

This work was partially supported by the EU HORIZON-CL4-2023-HUMAN-01-CNECT project XTREME (grant no. 101136006) and by the Sectorplan Beta-II of the Netherlands.

Funders: European Commission (funder number: 101136006)
