Abstract
Machine learning (ML) models can, once trained, make reaction barrier predictions in seconds, which is orders of magnitude faster than quantum mechanical (QM) methods such as density functional theory (DFT). However, these ML models need to be trained on large datasets of typically thousands of expensive, high accuracy barriers and do not generalise well beyond the specific reaction for which they are trained. In this work, we demonstrate that transfer learning (TL) can be used to adapt pre-trained Diels–Alder barrier prediction neural networks (NNs) to make predictions for other pericyclic reactions using horizontal TL (hTL) and additionally, at higher levels of theory with diagonal TL (dTL). TL-derived predictions are possible with mean absolute errors (MAEs) below the accepted chemical accuracy threshold of 1 kcal mol−1, a significant improvement on pre-TL prediction MAEs of >5 kcal mol−1, and in extremely low data regimes, with as few as 33 and 39 new datapoints needed for hTL and dTL, respectively. Thus, hTL and dTL are powerful options for providing insight into reaction feasibility without the need for extensive high-throughput experimental or computational screening or large dataset generation for training bespoke ML models.
Original language | English |
---|---|
Pages (from-to) | 941-951 |
Number of pages | 11 |
Journal | Digital Discovery |
Volume | 2 |
Issue number | 4 |
Early online date | 31 May 2023 |
DOIs | |
Publication status | Published - 1 Aug 2023 |
Bibliographical note
Funding Information: The authors gratefully acknowledge the University of Bath's Research Computing Group (https://doi.org/10.15125/b6cd-s854) for their support in this work; this research made use of both the Balena and Anatra High Performance Computing (HPC) services at the University of Bath. The authors thank the EPSRC (EP/W003724/1, EP/V519637/1 and EP/R513155/1), the University of Bath and AstraZeneca for funding.
Data availability
Gaussian 16 computed output files and code from this work are available in Dataset for “Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach” in the University of Bath Research Data Archive (accessible at: https://doi.org/10.15125/BATH-01229).
Projects
- 2 Finished
- Machine Learning and Molecular Modelling: A Synergistic Approach to Rapid Reactivity Prediction
  Grayson, M. (PI)
  Engineering and Physical Sciences Research Council
  1/07/22 → 30/06/24
  Project: Research council
- Automation, cloud computing and artificial intelligence for reaction optimisation
  Grayson, M. (PI)
  28/09/20 → 30/09/24
  Project: UK industry
Datasets
- Dataset for "Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach"
  Espley, S. (Creator), Farrar, E. (Creator), Grayson, M. (Supervisor), Tomasi, S. (Supervisor) & Buttar, D. (Supervisor), University of Bath, 31 May 2023
  DOI: 10.15125/BATH-01229
  Dataset