TY - JOUR
T1 - Causes and consequences of purifying selection on SARS-CoV-2
AU - Castillo Morales, Atahualpa
AU - Rice, Alan M.
AU - Ho, Alex
AU - Mordstein, Christine
AU - Mühlhausen, Stefanie
AU - Watson, Samir
AU - Cano, Laura
AU - Young, Bethan
AU - Kudla, Grzegorz
AU - Hurst, Laurence
PY - 2021/10/31
Y1 - 2021/10/31
N2 - Owing to a lag between a deleterious mutation’s appearance and its selective removal, gold-standard methods for mutation rate estimation assume no meaningful loss of mutations between parents and offspring. Indeed, from analysis of closely related lineages, in SARS-CoV-2 the Ka/Ks ratio was previously estimated as 1.008, suggesting no within-host selection. By contrast, we find a higher number of observed SNPs at 4-fold degenerate sites than elsewhere and, allowing for the virus’s complex mutational and compositional biases, estimate that the mutation rate is at least 49-67% higher than would be estimated based on the rate of appearance of variants in sampled genomes. Given the high Ka/Ks one might assume that the majority of such intra-host selection is the purging of nonsense mutations. However, we estimate that selection against nonsense mutations accounts for only ~10% of all the “missing” mutations. Instead, classical protein-level selective filters (against chemically disparate amino acids and those predicted to disrupt protein functionality) account for many missing mutations. It is less obvious why for an intracellular parasite, amino acid cost parameters, notably amino acid decay rate, are also significant. Perhaps most surprisingly, we also find evidence for real time selection against synonymous mutations that move codon usage away from that of humans. We conclude that there is common intra-host selection on SARS-CoV-2 that acts on nonsense, missense and possibly synonymous mutations. This has implications for methods of mutation rate estimation, for determining times to common ancestry and the potential for intra-host evolution including vaccine escape.
AB - Owing to a lag between a deleterious mutation’s appearance and its selective removal, gold-standard methods for mutation rate estimation assume no meaningful loss of mutations between parents and offspring. Indeed, from analysis of closely related lineages, in SARS-CoV-2 the Ka/Ks ratio was previously estimated as 1.008, suggesting no within-host selection. By contrast, we find a higher number of observed SNPs at 4-fold degenerate sites than elsewhere and, allowing for the virus’s complex mutational and compositional biases, estimate that the mutation rate is at least 49-67% higher than would be estimated based on the rate of appearance of variants in sampled genomes. Given the high Ka/Ks one might assume that the majority of such intra-host selection is the purging of nonsense mutations. However, we estimate that selection against nonsense mutations accounts for only ~10% of all the “missing” mutations. Instead, classical protein-level selective filters (against chemically disparate amino acids and those predicted to disrupt protein functionality) account for many missing mutations. It is less obvious why for an intracellular parasite, amino acid cost parameters, notably amino acid decay rate, are also significant. Perhaps most surprisingly, we also find evidence for real time selection against synonymous mutations that move codon usage away from that of humans. We conclude that there is common intra-host selection on SARS-CoV-2 that acts on nonsense, missense and possibly synonymous mutations. This has implications for methods of mutation rate estimation, for determining times to common ancestry and the potential for intra-host evolution including vaccine escape.
U2 - 10.1093/gbe/evab196
DO - 10.1093/gbe/evab196
M3 - Article
SN - 1759-6653
VL - 13
JO - Genome Biology and Evolution
JF - Genome Biology and Evolution
IS - 10
M1 - evab196
ER -