On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances, and Million-AID

Yang Long, Gui Song Xia, Shengyang Li, Wen Yang, Michael Ying Yang, Xiao Xiang Zhu, Liangpei Zhang, Deren Li

Research output: Contribution to journal › Article › peer-review

88 Citations (SciVal)


The past years have witnessed great progress in remote sensing (RS) image interpretation and its wide applications. With RS images becoming more accessible than ever before, there is an increasing demand for the automatic interpretation of these images. In this context, benchmark datasets serve as essential prerequisites for developing and testing intelligent interpretation algorithms. After reviewing existing benchmark datasets in the research community of RS image interpretation, this article discusses the problem of how to efficiently prepare a suitable benchmark dataset for RS image interpretation. Specifically, we first analyze the current challenges of developing intelligent algorithms for RS image interpretation with bibliometric investigations. We then present general guidance on creating benchmark datasets efficiently. Following this guidance, we also provide an example of building an RS image dataset, i.e., the Million Aerial Image Dataset (Million-AID, available online at https://captain-whu.github.io/DiRS/0), a new large-scale benchmark dataset containing a million instances for RS image scene classification. Finally, several challenges and perspectives in RS image annotation are discussed to facilitate research in benchmark dataset construction. We hope this article will provide the RS community with an overall perspective on constructing large-scale and practical image datasets for further research, especially data-driven research.

Original language: English
Article number: 9393553
Pages (from-to): 4205-4230
Number of pages: 26
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Early online date: 1 Apr 2021
Publication status: Published - 1 Apr 2021

Keywords


  • Annotation
  • benchmark datasets
  • Million Aerial Image Dataset (Million-AID)
  • remote sensing (RS) image interpretation
  • scene classification

ASJC Scopus subject areas

  • Computers in Earth Sciences
  • Atmospheric Science

