Transformer-based multimodal change detection with multitask consistency constraints

Biyuan Liu, Huaixin Chen, Kun Li, Michael Yang

Research output: Contribution to journal › Article › peer-review


Change detection plays a fundamental role in Earth observation for analyzing changes over time. However, recent studies have largely neglected multimodal data, which presents significant practical and technical advantages compared to single-modal approaches. This research focuses on leveraging pre-event digital surface model (DSM) data and post-event digital aerial images captured at different times for detecting changes beyond 2D. We observe that current change detection methods struggle with the multitask conflicts between the semantic and height change detection tasks. To address this challenge, we propose an efficient Transformer-based network that learns a shared representation between cross-dimensional inputs through cross-attention. It adopts a consistency constraint to establish the multimodal relationship. Initially, pseudo-changes are derived by employing height change thresholding. Subsequently, the L2 distance between semantic and pseudo-changes within their overlapping regions is minimized. This explicitly endows the height change detection (regression task) and semantic change detection (classification task) with representation consistency. A DSM-to-image multimodal dataset encompassing three cities in the Netherlands was constructed. It lays a new foundation for beyond-2D change detection from cross-dimensional inputs. Compared to five state-of-the-art change detection methods, our model demonstrates consistent multitask superiority in terms of semantic and height change detection. Furthermore, the consistency strategy can be seamlessly adapted to other methods, yielding promising improvements.
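
The consistency constraint described in the abstract (threshold the height change into pseudo-changes, then penalize the L2 distance to the semantic change prediction inside the overlapping region) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold of 1.0, the probability cutoff of 0.5, and the definition of the overlapping region as pixels flagged as changed by both maps are all assumptions made for illustration. Plain Python lists stand in for the tensors a real network would use.

```python
from typing import List

Grid = List[List[float]]


def pseudo_change(height_change: Grid, threshold: float) -> List[List[int]]:
    """Mark a pixel as changed (1) when |height change| exceeds the threshold."""
    return [[1 if abs(dh) > threshold else 0 for dh in row] for row in height_change]


def consistency_loss(semantic_prob: Grid, height_change: Grid,
                     threshold: float = 1.0, prob_cutoff: float = 0.5) -> float:
    """Mean squared (L2) gap between semantic change probability and pseudo-change.

    Assumption: the "overlapping region" is the set of pixels flagged as changed
    by both the semantic prediction (prob >= prob_cutoff) and the pseudo-change map.
    The threshold and cutoff defaults are hypothetical values.
    """
    pseudo = pseudo_change(height_change, threshold)
    total, count = 0.0, 0
    for prob_row, pseudo_row in zip(semantic_prob, pseudo):
        for p, q in zip(prob_row, pseudo_row):
            if q == 1 and p >= prob_cutoff:
                total += (p - q) ** 2
                count += 1
    # With no overlapping pixels there is nothing to constrain.
    return total / count if count else 0.0


if __name__ == "__main__":
    # Toy 2x2 example: height change in metres and predicted change probability.
    dh = [[0.2, 3.0], [-2.5, 0.0]]
    prob = [[0.9, 0.8], [0.3, 0.1]]
    print(pseudo_change(dh, 1.0))
    print(consistency_loss(prob, dh))
```

In this toy case only the top-right pixel lies in the assumed overlap, so the loss reduces to the squared gap between its predicted probability and the pseudo-label. In training, a term of this kind would typically be added to the classification and regression losses with a weighting factor, which is likewise an assumption here.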

Original language: English
Article number: 102358
Journal: Information Fusion
Early online date: 25 Mar 2024
Publication status: E-pub ahead of print - 25 Mar 2024

Data Availability Statement

I have shared the data at the link in the manuscript


Keywords

  • Change detection
  • Height change
  • Multimodal
  • Multitask consistency
  • Transformer-based

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Signal Processing
  • Hardware and Architecture

