Determinants of the usage of splice-associated cis-motifs predict the distribution of human pathogenic SNPs

XianMing Wu, Laurence D. Hurst

Research output: Contribution to journal (Article)

18 Citations (Scopus)

Abstract

Where in genes do pathogenic mutations tend to occur and does this provide clues as to the possible underlying mechanisms by which single nucleotide polymorphisms (SNPs) cause disease? As splice-disrupting mutations tend to occur predominantly at exon ends, known also to be hot spots of cis-exonic splice control elements, we examine the relationship between the relative density of such exonic cis-motifs and pathogenic SNPs. In particular we focus on the intragene distribution of exonic splicing enhancers (ESE) and the covariance between them and disease-associated SNPs. In addition to showing that disease-causing genes tend to be genes with a high intron density, consistent with missplicing, five factors established as trends in ESE usage, are considered: relative position in exons, relative position in genes, flanking intron size, splice sites usage, and phase. We find that more than 76% of pathogenic SNPs are within 3-69 bp of exon ends where ESEs generally reside, this being 13% more than expected. Overall from enrichment of pathogenic SNPs at exon ends, we estimate that circa 20-45% of SNPs affect splicing. Importantly, we find that within genes pathogenic SNPs tend to occur in splicing-relevant regions with low ESE density: they are found to occur preferentially in the terminal half of genes, in exons flanked by short introns and at the ends of phase (0,0) exons with 3’ non-“AGgt” splice site. We suggest the concept of the “fragile” exon, one home to pathogenic SNPs owing to its vulnerability to splice disruption owing to low ESE density.
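The positional statistic reported in the abstract (the share of pathogenic SNPs lying within 3-69 bp of an exon end) can be illustrated with a minimal sketch. The function names, the 1-based inclusive coordinates, and the convention that the boundary base itself has distance 1 are assumptions made for illustration only; the paper's exact distance definition and its null expectation are not given here, and this is not the authors' pipeline.

```python
def distance_to_nearest_exon_end(pos, exon_start, exon_end):
    """Distance in bp from a SNP to the nearer end of its exon.

    Assumes 1-based inclusive coordinates; the boundary base itself
    is counted as distance 1 (an illustrative convention).
    """
    if not exon_start <= pos <= exon_end:
        raise ValueError("SNP lies outside the exon")
    return min(pos - exon_start, exon_end - pos) + 1


def fraction_near_exon_ends(snps, lo=3, hi=69):
    """Share of SNPs whose distance to an exon end falls in [lo, hi] bp.

    snps: iterable of (pos, exon_start, exon_end) tuples (hypothetical input).
    """
    snps = list(snps)
    if not snps:
        return 0.0
    near = sum(lo <= distance_to_nearest_exon_end(*s) <= hi for s in snps)
    return near / len(snps)


# Toy data, not taken from the paper: four SNPs in one exon spanning 100-300.
toy_snps = [(100, 100, 300), (105, 100, 300), (200, 100, 300), (298, 100, 300)]
observed = fraction_near_exon_ends(toy_snps)
```

Comparing such an observed fraction with the fraction expected under a chosen null model (for example, SNPs placed uniformly along exons) would give an enrichment figure of the kind quoted in the abstract; the null model is left open here because the record does not specify it.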
Language: English
Pages: 518-529
Journal: Molecular Biology and Evolution
Volume: 33
Issue number: 2
Early online date: 5 Nov 2015
DOI: 10.1093/molbev/msv251
Status: Published - Feb 2016

Fingerprint

Single Nucleotide Polymorphism
polymorphism
Exons
Introns
gene
Genes
Mutation
Specific Gravity
Gene Order
distribution
vulnerability

Cite this

Determinants of the usage of splice-associated cis-motifs predict the distribution of human pathogenic SNPs. / Wu, XianMing; Hurst, Laurence D.

In: Molecular Biology and Evolution, Vol. 33, No. 2, 02.2016, p. 518-529.

@article{f9553dae340a46baa4e161ee466584f9,
title = "Determinants of the usage of splice-associated cis-motifs predict the distribution of human pathogenic SNPs",
abstract = "Where in genes do pathogenic mutations tend to occur and does this provide clues as to the possible underlying mechanisms by which single nucleotide polymorphisms (SNPs) cause disease? As splice-disrupting mutations tend to occur predominantly at exon ends, known also to be hot spots of cis-exonic splice control elements, we examine the relationship between the relative density of such exonic cis-motifs and pathogenic SNPs. In particular we focus on the intragene distribution of exonic splicing enhancers (ESE) and the covariance between them and disease-associated SNPs. In addition to showing that disease-causing genes tend to be genes with a high intron density, consistent with missplicing, five factors established as trends in ESE usage, are considered: relative position in exons, relative position in genes, flanking intron size, splice sites usage, and phase. We find that more than 76{\%} of pathogenic SNPs are within 3-69 bp of exon ends where ESEs generally reside, this being 13{\%} more than expected. Overall from enrichment of pathogenic SNPs at exon ends, we estimate that circa 20-45{\%} of SNPs affect splicing. Importantly, we find that within genes pathogenic SNPs tend to occur in splicing-relevant regions with low ESE density: they are found to occur preferentially in the terminal half of genes, in exons flanked by short introns and at the ends of phase (0,0) exons with 3’ non-“AGgt” splice site. We suggest the concept of the “fragile” exon, one home to pathogenic SNPs owing to its vulnerability to splice disruption owing to low ESE density.",
author = "Wu, XianMing and Hurst, Laurence D.",
year = "2016",
month = feb,
doi = "10.1093/molbev/msv251",
language = "English",
volume = "33",
pages = "518--529",
journal = "Molecular Biology and Evolution",
issn = "0737-4038",
publisher = "Oxford University Press",
number = "2",

}

TY - JOUR

T1 - Determinants of the usage of splice-associated cis-motifs predict the distribution of human pathogenic SNPs

AU - Wu, XianMing

AU - Hurst, Laurence D.

PY - 2016/2

Y1 - 2016/2

N2 - Where in genes do pathogenic mutations tend to occur and does this provide clues as to the possible underlying mechanisms by which single nucleotide polymorphisms (SNPs) cause disease? As splice-disrupting mutations tend to occur predominantly at exon ends, known also to be hot spots of cis-exonic splice control elements, we examine the relationship between the relative density of such exonic cis-motifs and pathogenic SNPs. In particular we focus on the intragene distribution of exonic splicing enhancers (ESE) and the covariance between them and disease-associated SNPs. In addition to showing that disease-causing genes tend to be genes with a high intron density, consistent with missplicing, five factors established as trends in ESE usage, are considered: relative position in exons, relative position in genes, flanking intron size, splice sites usage, and phase. We find that more than 76% of pathogenic SNPs are within 3-69 bp of exon ends where ESEs generally reside, this being 13% more than expected. Overall from enrichment of pathogenic SNPs at exon ends, we estimate that circa 20-45% of SNPs affect splicing. Importantly, we find that within genes pathogenic SNPs tend to occur in splicing-relevant regions with low ESE density: they are found to occur preferentially in the terminal half of genes, in exons flanked by short introns and at the ends of phase (0,0) exons with 3’ non-“AGgt” splice site. We suggest the concept of the “fragile” exon, one home to pathogenic SNPs owing to its vulnerability to splice disruption owing to low ESE density.

UR - http://dx.doi.org/10.1093/molbev/msv251

U2 - 10.1093/molbev/msv251

DO - 10.1093/molbev/msv251

M3 - Article

VL - 33

SP - 518

EP - 529

JO - Molecular Biology and Evolution

T2 - Molecular Biology and Evolution

JF - Molecular Biology and Evolution

SN - 0737-4038

IS - 2

ER -