Machine Learning-Enhanced Quantum Chemistry-Assisted Refinement of the Active Site Structure of Metalloproteins

Lucia Gigli, José Malanho Silva, Linda Cerofolini, Anjos L. Macedo, Carlos F.G.C. Geraldes, Elizaveta A. Suturina, Vito Calderone, Marco Fragai, Giacomo Parigi, Enrico Ravera, Claudio Luchinat

Research output: Contribution to journal › Article › peer-review


Understanding the fine structural details of inhibitor binding at the active site of metalloenzymes can have a profound impact on rational drug design targeting this broad class of biomolecules. Structural techniques such as NMR, cryo-EM, and X-ray crystallography can provide bond lengths and angles, but the uncertainties in these measurements can be as large as the range of values observed for these quantities across all published structures. This uncertainty is far too large to allow reliable calculations at the quantum chemical (QC) level, whether for developing precise structure-activity relationships or for improving the energetic considerations in protein-inhibitor studies. Computational methods are therefore needed to refine active site structures well beyond the resolution obtained with routine application of structural methods. In a recent paper, we showed that it is possible to refine the active site of cobalt(II)-substituted MMP12, a metalloprotein that is a relevant drug target, by matching pseudocontact shifts (PCS) calculated using multireference ab initio QC methods to the experimental PCS. The computational cost of this methodology becomes a significant bottleneck when the starting structure is not sufficiently close to the final one, which is often the case with biomolecular structures. To tackle this problem, we have developed an approach based on a neural network (NN) and support vector regression (SVR) and applied it to the refinement of the active site structure of oxalate-inhibited human carbonic anhydrase 2 (hCAII), another prototypical metalloprotein target. The refined structure gives remarkably good agreement between the QC-calculated and the experimental PCS.
This study not only contributes to the knowledge of hCAII but also demonstrates the utility of combining machine learning (ML) algorithms with QC calculations, offering a promising avenue for investigating other drug targets and complex biological systems in general.

Original language: English
Pages (from-to): 10713-10725
Number of pages: 13
Journal: Inorganic Chemistry
Issue number: 23
Early online date: 28 May 2024
Publication status: Published - 10 Jun 2024

ASJC Scopus subject areas

  • Physical and Theoretical Chemistry
  • Inorganic Chemistry
