Can codon usage bias explain intron phase distributions and exon symmetry?

A Ruvinsky, S T Eskesen, F N Eskesen, L D Hurst

Research output: Contribution to journal › Article

28 Citations (Scopus)

Abstract

More introns exist between codons (phase 0) than between the first and the second bases (phase 1) or between the second and the third bases (phase 2) within the codon. Many explanations have been suggested for this excess of phase 0. It has, for example, been argued to reflect an ancient utility for introns in separating exons that code for separate protein modules. There may, however, be a simple, alternative explanation. Introns typically require, for correct splicing, particular nucleotides immediately 5' in exons (typically a G) and immediately 3' in the following exon (also often a G). Introns therefore tend to be found between particular nucleotide pairs (e.g., G|G pairs) in the coding sequence. If, owing to bias in usage of different codons, these pairs are especially common at phase 0, then intron phase biases may have a trivial explanation. Here we take codon usage frequencies for a variety of eukaryotes and use these to generate random sequences. We then ask about the phase of putative intron insertion sites. Importantly, in all simulated data sets intron phase distribution is biased in favor of phase 0. In many cases the bias is of the magnitude observed in real data and can be attributed to codon usage bias. It is also known that exons may carry either the same phase (symmetric) or different phases (asymmetric) at the opposite ends. We simulated a distribution of different types of exons using frequencies of introns observed in real genes, assuming random combination of intron phases at the opposite sides of exons. Surprisingly, the simulated pattern was quite similar to that observed. In the simulants we typically observe a prevalence of symmetric exons carrying phase 0 at both ends, which is common for eukaryotic genes. However, at least in some species, the extent of the bias in favor of symmetric (0,0) exons is not as great in simulants as in real genes. These results emphasize the need to construct a biologically relevant null model of successful intron insertion.
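The simulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the restriction to G|G pairs as the only putative insertion sites, and the toy codon frequencies are all assumptions made for illustration. The sketch draws codons at random according to a codon usage table, counts G|G junctions by phase, and then computes the exon classes expected if intron phases combine at random at the two ends of an exon.

```python
import random
from collections import Counter


def simulate_cds(codon_freqs, n_codons, rng):
    """Build a random coding sequence by sampling codons by usage frequency."""
    codons = list(codon_freqs)
    weights = [codon_freqs[c] for c in codons]
    return "".join(rng.choices(codons, weights=weights, k=n_codons))


def gg_site_phases(cds):
    """Count G|G junctions by phase.

    A junction after 0-based position i has phase (i + 1) % 3:
    0 = between codons, 1 = after the first base, 2 = after the second base.
    """
    counts = Counter({0: 0, 1: 0, 2: 0})
    for i in range(len(cds) - 1):
        if cds[i] == "G" and cds[i + 1] == "G":
            counts[(i + 1) % 3] += 1
    return counts


def expected_exon_classes(phase_freqs):
    """Expected fraction of each (5' phase, 3' phase) exon class under
    random combination of intron phases: p(a, b) = p(a) * p(b)."""
    return {(a, b): phase_freqs[a] * phase_freqs[b]
            for a in phase_freqs for b in phase_freqs}


if __name__ == "__main__":
    # Hypothetical toy codon usage table, NOT taken from any real genome.
    toy_usage = {"GAG": 0.3, "GGC": 0.2, "AAG": 0.2, "CTG": 0.2, "ACC": 0.1}
    rng = random.Random(0)
    cds = simulate_cds(toy_usage, 10_000, rng)
    counts = gg_site_phases(cds)
    total = sum(counts.values())
    phase_freqs = {p: counts[p] / total for p in (0, 1, 2)}
    print("phase frequencies of G|G sites:", phase_freqs)
    print("expected (0,0) exons:", expected_exon_classes(phase_freqs)[(0, 0)])
```

With a real codon usage table in place of `toy_usage`, the phase frequencies of G|G sites give the null expectation that the abstract compares with observed intron phases. Note that the abstract's exon simulation uses intron frequencies observed in real genes, so observed phase frequencies would normally be passed to `expected_exon_classes` rather than simulated ones.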
Original language: English
Pages (from-to): 99-104
Number of pages: 6
Journal: Journal of Molecular Evolution
Volume: 60
Issue number: 1
DOIs: 10.1007/s00239-004-0032-9
Publication status: Published - 2005

Fingerprint

Codons
Introns
Exons
Symmetry
Genes
Eukaryote
Eukaryota
Eukaryotic cells
Protein
Nucleotides
Distribution
Code

Cite this

Can codon usage bias explain intron phase distributions and exon symmetry? / Ruvinsky, A; Eskesen, S T; Eskesen, F N; Hurst, L D.

In: Journal of Molecular Evolution, Vol. 60, No. 1, 2005, p. 99-104.

Ruvinsky, A ; Eskesen, S T ; Eskesen, F N ; Hurst, L D. / Can codon usage bias explain intron phase distributions and exon symmetry? In: Journal of Molecular Evolution. 2005 ; Vol. 60, No. 1. pp. 99-104.
@article{8f71b880f3fa4bb4bc1f9032254f654f,
title = "Can codon usage bias explain intron phase distributions and exon symmetry?",
abstract = "More introns exist between codons (phase 0) than between the first and the second bases (phase 1) or between the second and the third bases (phase 2) within the codon. Many explanations have been suggested for this excess of phase 0. It has, for example, been argued to reflect an ancient utility for introns in separating exons that code for separate protein modules. There may, however, be a simple, alternative explanation. Introns typically require, for correct splicing, particular nucleotides immediately 5' in exons (typically a G) and immediately 3' in the following exon (also often a G). Introns therefore tend to be found between particular nucleotide pairs (e.g., G|G pairs) in the coding sequence. If, owing to bias in usage of different codons, these pairs are especially common at phase 0, then intron phase biases may have a trivial explanation. Here we take codon usage frequencies for a variety of eukaryotes and use these to generate random sequences. We then ask about the phase of putative intron insertion sites. Importantly, in all simulated data sets intron phase distribution is biased in favor of phase 0. In many cases the bias is of the magnitude observed in real data and can be attributed to codon usage bias. It is also known that exons may carry either the same phase (symmetric) or different phases (asymmetric) at the opposite ends. We simulated a distribution of different types of exons using frequencies of introns observed in real genes, assuming random combination of intron phases at the opposite sides of exons. Surprisingly, the simulated pattern was quite similar to that observed. In the simulants we typically observe a prevalence of symmetric exons carrying phase 0 at both ends, which is common for eukaryotic genes. However, at least in some species, the extent of the bias in favor of symmetric (0,0) exons is not as great in simulants as in real genes. These results emphasize the need to construct a biologically relevant null model of successful intron insertion.",
author = "Ruvinsky, A and Eskesen, {S T} and Eskesen, {F N} and Hurst, {L D}",
note = "ID number: ISI:000226136900009",
year = "2005",
doi = "10.1007/s00239-004-0032-9",
language = "English",
volume = "60",
pages = "99--104",
journal = "Journal of Molecular Evolution",
issn = "0022-2844",
publisher = "Springer New York",
number = "1",

}

TY - JOUR

T1 - Can codon usage bias explain intron phase distributions and exon symmetry?

AU - Ruvinsky, A

AU - Eskesen, S T

AU - Eskesen, F N

AU - Hurst, L D

N1 - ID number: ISI:000226136900009

PY - 2005

Y1 - 2005

N2 - More introns exist between codons (phase 0) than between the first and the second bases (phase 1) or between the second and the third bases (phase 2) within the codon. Many explanations have been suggested for this excess of phase 0. It has, for example, been argued to reflect an ancient utility for introns in separating exons that code for separate protein modules. There may, however, be a simple, alternative explanation. Introns typically require, for correct splicing, particular nucleotides immediately 5' in exons (typically a G) and immediately 3' in the following exon (also often a G). Introns therefore tend to be found between particular nucleotide pairs (e.g., G|G pairs) in the coding sequence. If, owing to bias in usage of different codons, these pairs are especially common at phase 0, then intron phase biases may have a trivial explanation. Here we take codon usage frequencies for a variety of eukaryotes and use these to generate random sequences. We then ask about the phase of putative intron insertion sites. Importantly, in all simulated data sets intron phase distribution is biased in favor of phase 0. In many cases the bias is of the magnitude observed in real data and can be attributed to codon usage bias. It is also known that exons may carry either the same phase (symmetric) or different phases (asymmetric) at the opposite ends. We simulated a distribution of different types of exons using frequencies of introns observed in real genes, assuming random combination of intron phases at the opposite sides of exons. Surprisingly, the simulated pattern was quite similar to that observed. In the simulants we typically observe a prevalence of symmetric exons carrying phase 0 at both ends, which is common for eukaryotic genes. However, at least in some species, the extent of the bias in favor of symmetric (0,0) exons is not as great in simulants as in real genes. These results emphasize the need to construct a biologically relevant null model of successful intron insertion.

U2 - 10.1007/s00239-004-0032-9

DO - 10.1007/s00239-004-0032-9

M3 - Article

VL - 60

SP - 99

EP - 104

JO - Journal of Molecular Evolution

JF - Journal of Molecular Evolution

SN - 0022-2844

IS - 1

ER -