TY - JOUR
T1 - Determinants of the usage of splice-associated cis-motifs predict the distribution of human pathogenic SNPs
AU - Wu, XianMing
AU - Hurst, Laurence D.
PY - 2016/2
Y1 - 2016/2
N2 - Where in genes do pathogenic mutations tend to occur, and does this provide clues as to the possible underlying mechanisms by which single nucleotide polymorphisms (SNPs) cause disease? As splice-disrupting mutations tend to occur predominantly at exon ends, known also to be hot spots of cis-exonic splice control elements, we examine the relationship between the relative density of such exonic cis-motifs and pathogenic SNPs. In particular, we focus on the intragene distribution of exonic splicing enhancers (ESEs) and the covariance between them and disease-associated SNPs. In addition to showing that disease-causing genes tend to be genes with a high intron density, consistent with missplicing, we consider five factors established as trends in ESE usage: relative position in exons, relative position in genes, flanking intron size, splice site usage, and phase. We find that more than 76% of pathogenic SNPs are within 3-69 bp of exon ends, where ESEs generally reside, this being 13% more than expected. Overall, from the enrichment of pathogenic SNPs at exon ends, we estimate that circa 20-45% of SNPs affect splicing. Importantly, we find that within genes pathogenic SNPs tend to occur in splicing-relevant regions with low ESE density: they occur preferentially in the terminal half of genes, in exons flanked by short introns, and at the ends of phase (0,0) exons with a 3’ non-“AGgt” splice site. We suggest the concept of the “fragile” exon, one that is home to pathogenic SNPs because its low ESE density leaves it vulnerable to splice disruption.
UR - http://dx.doi.org/10.1093/molbev/msv251
U2 - 10.1093/molbev/msv251
DO - 10.1093/molbev/msv251
M3 - Article
SN - 0737-4038
VL - 33
SP - 518
EP - 529
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
IS - 2
ER -