Abstract
Underwater computer vision faces significant challenges from light scattering, absorption, and poor illumination, which severely impact underwater vision tasks. To address these issues, ViT-Clarity, an underwater image enhancement module, is introduced, which integrates vision transformers with a convolutional neural network for superior performance. For comparison, ClarityNet, a transformer-free variant of the architecture, is presented to highlight the transformer’s impact. Given the limited availability of paired underwater image datasets (clear and degraded), BlueStyleGAN is proposed as a generative model to create synthetic underwater images from clear in-air images by simulating realistic attenuation effects. BlueStyleGAN is evaluated against existing state-of-the-art synthetic dataset generators in terms of training stability and realism. Vit-ClarityNet is rigorously tested on five datasets representing diverse underwater conditions and compared with recent state-of-the-art methods as well as ClarityNet. Evaluations include qualitative and quantitative metrics such as UCIQM, UCIQE, and the deep learning-based URanker. Additionally, the impact of enhanced images on object detection and SIFT feature matching is assessed, demonstrating the practical benefits of image enhancement for underwater computer vision tasks.
Original language | English |
---|---|
Article number | 16768 |
Journal | Scientific Reports |
Volume | 15 |
Issue number | 1 |
Early online date | 14 May 2025 |
Publication status | Published - 14 May 2025 |
Data Availability Statement
The datasets generated and/or analyzed during this study are available from the corresponding author upon reasonable request. The images sourced from external datasets, including USOD [45], UIEB [22], EUVP [29], U45 [44], SUIM [43], SQUID [35], Aquarium [37], and KITTI [34], are publicly accessible through their respective websites. Furthermore, the enhanced outputs obtained for Vit-ClarityNet’s evaluation, as well as the synthetic underwater image datasets generated by BlueStyleGAN for training purposes, are publicly available at https://github.com/ME-1997/Vit-ClarityNet or can be requested from the corresponding author.
Keywords
- Convolutional neural networks
- Generative adversarial
- Synthetic dataset
- Underwater image enhancement
- Vision transformer
ASJC Scopus subject areas
- General