Towards deployment-centric multimodal AI beyond vision and language

Xianyuan Liu, Jiayang Zhang, Shuo Zhou, Thijs L. van der Plas, Avish Vijayaraghavan, Anastasiia Grishina, Mengdie Zhuang, Daniel Schofield, Christopher Tomlinson, Yuhan Wang, Ruizhe Li, Louisa van Zeeland, Sina Tabakhi, Cyndie Demeocq, Xiang Li, Arunav Das, Orlando Timmerman, Thomas Baldwin-McDonald, Jinge Wu, Peizhen Bai, Zahraa Al Sahili, Omnia Alwazzan, Thao N. Do, Mohammod N. I. Suvon, Angeline Wang, Lucia Cipolina-Kun, Luigi A. Moretti, Lucas Farndale, Nitisha Jain, Natalia Efremova, Yan Ge, Marta Varela, Hak-Keung Lam, Oya Celiktutan, Ben R. Evans, Alejandro Coca-Castro, Honghan Wu, Zahraa S. Abdallah, Chen Chen, Valentin Danchev, Nataliya Tkachenko, Lei Lu, Tingting Zhu, Gregory G. Slabaugh, Roger K. Moore, William K. Cheung, Peter H. Charlton, Haiping Lu

Research output: Contribution to journal › Review article › peer-review


Abstract

Multimodal artificial intelligence (AI) integrates diverse types of data via machine learning to improve understanding, prediction and decision-making across disciplines such as healthcare, science and engineering. However, most multimodal AI advances focus on models for vision and language data, and their deployability remains a key challenge. We advocate a deployment-centric workflow that incorporates deployment constraints early on to reduce the likelihood of undeployable solutions, complementing data-centric and model-centric approaches. We also emphasize deeper integration across multiple levels of multimodality through stakeholder engagement and interdisciplinary collaboration to broaden the research scope beyond vision and language. To facilitate this approach, we identify common multimodal-AI-specific challenges shared across disciplines and examine three real-world use cases: pandemic response, self-driving car design and climate change adaptation, drawing expertise from healthcare, social science, engineering, science, sustainability and finance. By fostering interdisciplinary dialogue and open research practices, our community can accelerate deployment-centric development for broad societal impact.
Original language: English
Pages (from-to): 1612-1624
Number of pages: 13
Journal: Nature Machine Intelligence
Volume: 7
Issue number: 10
Early online date: 21 Oct 2025
DOIs
Publication status: Published - 31 Oct 2025

Data Availability Statement

Source data are provided with this paper. They are also available at https://github.com/multimodalAI/multimodal-ai-landscape, where they will be updated annually.

Acknowledgements

This work was enabled and supported by the Alan Turing Institute. We thank T. Chakraborty and C. Li for inspiring this work, D. A. Clifton for his support and T. Dunstan for contributing to the climate change adaptation section. The views expressed in this material are those of the authors and do not necessarily represent the views of their affiliated institutions or funders.

Funding

J.Z. is supported by donations from D. Naik and S. Naik. S.Z. is supported by EPSRC (grant EP/Y017544/1). T.L.v.d.P. was supported by EPSRC (grant EP/Y028880/1). A.V. is supported by UKRI CDT in AI for Healthcare (grant EP/S023283/1). A.G. is supported by the Research Council of Norway (secureIT project 288787). M.Z. is supported by EPSRC (grant EP/X031276/1). C.T. is supported by UKRI CDT in AI-enabled Healthcare (grant EP/S021612/1). R.L. is supported by the Royal Society (grant IEC\NSFC\233558). L.v.Z. is supported by NERC (grant NE/W004747/1). O.T. is supported by UKRI CDT in Application of Artificial Intelligence to the study of Environmental Risks (grant EP/S022961/1). Z.A.S. is supported by Google DeepMind. O.A. is supported by NIHR Barts BRC (grant NIHR203330). T.N.D. is supported by UKRI CDT in Accountable, Responsible and Transparent AI (grant EP/S023437/1). L.F. is supported by MRC (grant MR/W006804/1). N.J. is supported by the EU’s co-funded HE project MuseIT (grant 101061441). M.V. is supported by St George’s Hospital Charity. A.C.-C. is supported by EPSRC (grant EP/Y028880/1). H.W. is supported by MRC (grant MR/X030075/1). C.C. is supported by the Royal Society (grant GS\R2\242355). T.Z. was supported by the Royal Academy of Engineering (grant RF\201819\18\109). G.G.S. is supported by EPSRC (grant EP/Y009800/1). P.H.C. is supported by BHF (grant FS/20/20/34626). H.L. is supported by EPSRC (grant UKRI396).
