Abstract
Underwater computer vision faces significant challenges from light scattering, absorption, and poor illumination, which severely impair underwater vision tasks. To address these issues, ViT-Clarity, an underwater image enhancement module that integrates vision transformers with a convolutional neural network, is introduced. For comparison, ClarityNet, a transformer-free variant of the architecture, is presented to isolate the transformer’s contribution. Because paired underwater image datasets (clear and degraded) are scarce, BlueStyleGAN is proposed as a generative model that creates synthetic underwater images from clear in-air images by simulating realistic attenuation effects. BlueStyleGAN is evaluated against existing state-of-the-art synthetic dataset generators in terms of training stability and realism. ViT-ClarityNet is tested on five datasets representing diverse underwater conditions and compared with recent state-of-the-art methods as well as with ClarityNet. The evaluation covers qualitative comparison and quantitative metrics such as UCIQM, UCIQE, and the deep learning-based URanker. Additionally, the impact of enhanced images on object detection and SIFT feature matching is assessed, demonstrating the practical benefits of image enhancement for underwater computer vision tasks.
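To make the phrase "simulating realistic attenuation effects" concrete, the sketch below applies the classical simplified underwater image formation model, I = J·t + B·(1 − t) with per-channel transmission t = exp(−β·d). This is a hand-written physics baseline, not BlueStyleGAN, which is a learned generative model. The function name `simulate_underwater` and the attenuation and backscatter values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def simulate_underwater(clear_rgb, depth_m,
                        beta=(0.40, 0.12, 0.08),
                        backscatter=(0.05, 0.35, 0.45)):
    """Degrade a clear RGB image with I = J * t + B * (1 - t), t = exp(-beta * d).

    clear_rgb   : float array of shape (H, W, 3) with values in [0, 1]
    depth_m     : scalar or (H, W) array of camera-to-scene distances in metres
    beta        : per-channel attenuation coefficients (R, G, B); assumed values
    backscatter : per-channel veiling light (R, G, B); assumed values
    """
    clear = np.asarray(clear_rgb, dtype=np.float64)
    depth = np.asarray(depth_m, dtype=np.float64)
    if depth.ndim == 0:
        # Broadcast a single distance to every pixel.
        depth = np.full(clear.shape[:2], float(depth))
    beta = np.asarray(beta, dtype=np.float64)
    veil = np.asarray(backscatter, dtype=np.float64)

    # Per-pixel, per-channel transmission; red decays fastest with these values.
    t = np.exp(-beta[None, None, :] * depth[..., None])

    # Attenuated scene radiance plus distance-dependent backscatter.
    degraded = clear * t + veil * (1.0 - t)
    return np.clip(degraded, 0.0, 1.0)


if __name__ == "__main__":
    image = np.full((4, 4, 3), 0.8)          # uniform light-grey test image
    print(simulate_underwater(image, 5.0)[0, 0])
```

With these assumed coefficients the red channel fades much faster than green and blue as distance grows, which reproduces the familiar blue-green cast. A learned generator such as the one described above would be expected to capture effects that this closed-form model omits.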
| Original language | English |
|---|---|
| Article number | 16768 |
| Journal | Scientific Reports |
| Volume | 15 |
| Issue number | 1 |
| Early online date | 14 May 2025 |
| Publication status | Published - 14 May 2025 |
Data Availability Statement
The datasets generated and/or analyzed during this study are available from the corresponding author upon reasonable request. The images sourced from external datasets, including USOD [45], UIEB [22], EUVP [29], U45 [44], SUIM [43], SQUID [35], Aquarium [37], and KITTI [34], are publicly accessible through their respective websites. Furthermore, the enhanced outputs obtained for ViT-ClarityNet’s evaluation, as well as the synthetic underwater image datasets generated by BlueStyleGAN for training purposes, are publicly available at https://github.com/ME-1997/Vit-ClarityNet or can be requested from the corresponding author.
Keywords
- Convolutional neural networks
- Generative adversarial
- Synthetic dataset
- Underwater image enhancement
- Vision transformer
ASJC Scopus subject areas
- General
Fingerprint
Dive into the research topics of 'A vision transformer based CNN for underwater image enhancement ViTClarityNet'. Together they form a unique fingerprint.