Determinants of the usage of splice-associated cis-motifs predict the distribution of human pathogenic SNPs

XianMing Wu, Laurence D. Hurst

Research output: Contribution to journal, Article, peer-review


Abstract

Where in genes do pathogenic mutations tend to occur, and does this provide clues as to the possible underlying mechanisms by which single nucleotide polymorphisms (SNPs) cause disease? As splice-disrupting mutations tend to occur predominantly at exon ends, known also to be hot spots of cis-exonic splice control elements, we examine the relationship between the relative density of such exonic cis-motifs and pathogenic SNPs. In particular, we focus on the intragene distribution of exonic splicing enhancers (ESEs) and the covariance between them and disease-associated SNPs. In addition to showing that disease-causing genes tend to be genes with a high intron density, consistent with missplicing, we consider five factors established as trends in ESE usage: relative position in exons, relative position in genes, flanking intron size, splice site usage, and phase. We find that more than 76% of pathogenic SNPs are within 3-69 bp of exon ends, where ESEs generally reside, this being 13% more than expected. Overall, from the enrichment of pathogenic SNPs at exon ends, we estimate that circa 20-45% of SNPs affect splicing. Importantly, we find that within genes pathogenic SNPs tend to occur in splicing-relevant regions with low ESE density: they occur preferentially in the terminal half of genes, in exons flanked by short introns, and at the ends of phase (0,0) exons with a 3’ non-“AGgt” splice site. We suggest the concept of the “fragile” exon, one that is home to pathogenic SNPs because its low ESE density leaves it vulnerable to splice disruption.
Original language: English
Pages (from-to): 518-529
Journal: Molecular Biology and Evolution
Volume: 33
Issue number: 2
Early online date: 5 Nov 2015
Publication status: Published - Feb 2016
