Eukaryotic mRNAs are headed by a stretch of noncoding sequence, the 5' untranslated region (UTR). It has been proposed that the length of 5' UTRs is selectively neutral and evolves under a process of stochastic destruction and recruitment of core promoter elements, combined with selection against the premature initiation of translation. We test this null model by investigating whether 5' UTR length varies with genomic GC content, an implicit prediction of the model. Using simulations, we show that the null model predicts a positive relationship between GC content and UTR length for genes regulated by a TATA box. Although this prediction is borne out qualitatively in genomic data from yeast, fruit flies, and humans, we find marked quantitative discrepancies. We conclude that UTR length may be shaped to some degree by the forces considered in the null model but that the model fails to provide a complete explanation for UTR length evolution.
Reuter, M., Engelstadter, J., Fontanillas, P., & Hurst, L. D. (2008). A test of the null model for 5' UTR evolution based on GC content. Molecular Biology and Evolution, 25(5), 801-804. https://doi.org/10.1093/molbev/msn044
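
The abstract describes GC content as an implicit parameter of the null model. One way to see why is that both ingredients of the model, TATA-like core promoter motifs and premature start codons (ATG), are short sequence motifs whose chance occurrence depends on base composition. The sketch below is not the authors' simulation. It is a minimal illustration under two simplifying assumptions that are not from the source: bases are drawn independently, and A/T and G/C are each equally frequent within their class. The function names and the choice of "TATA" as a stand-in motif are hypothetical.

```python
def base_probs(gc):
    """Per-base probabilities for a given GC fraction.

    Assumption (not from the source): A and T are equally frequent,
    as are G and C, and bases are independent of one another.
    """
    if not 0.0 < gc < 1.0:
        raise ValueError("gc must lie strictly between 0 and 1")
    at = 1.0 - gc
    return {"A": at / 2, "T": at / 2, "G": gc / 2, "C": gc / 2}


def motif_prob(motif, gc):
    """Probability that a motif starts at a given position by chance."""
    probs = base_probs(gc)
    p = 1.0
    for base in motif:
        p *= probs[base]
    return p


def mean_spacing(motif, gc):
    """Expected distance (in bases) between chance occurrences: 1 / p."""
    return 1.0 / motif_prob(motif, gc)


if __name__ == "__main__":
    # Compare an AT-rich and a GC-rich background.
    for gc in (0.35, 0.50, 0.65):
        print(
            f"GC={gc:.2f}  "
            f"TATA every ~{mean_spacing('TATA', gc):.0f} bp  "
            f"ATG every ~{mean_spacing('ATG', gc):.0f} bp"
        )
```

Under these assumptions, chance TATA motifs become rarer as GC content rises, while the chance frequency of ATG changes more weakly and not monotonically. How these two effects combine into a predicted UTR length is what the simulations reported in the paper address; this sketch only shows why base composition enters the model at all.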