A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs

Charlotte Soneson, Michael I. Love, Rob Patro, Mohammed Shobbir Hussain, Dheeraj Malhotra, Mark D. Robinson

Research output: Contribution to journal › Article

Abstract

Most methods for statistical analysis of RNA-seq data take a matrix of abundance estimates for some type of genomic features as their input, and consequently the quality of any obtained results is directly dependent on the quality of these abundances. Here, we present the junction coverage compatibility (JCC) score, which provides a way to evaluate the reliability of transcript-level abundance estimates as well as the accuracy of transcript annotation catalogs. It works by comparing the observed number of reads spanning each annotated splice junction in a genomic region to the predicted number of junction-spanning reads, inferred from the estimated transcript abundances and the genomic coordinates of the corresponding annotated transcripts. We show that while most genes show good agreement between the observed and predicted junction coverages, there is a small set of genes that do not. Genes with poor agreement are found regardless of the method used to estimate transcript abundances, and the corresponding transcript abundances should be treated with care in any downstream analyses.
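
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary inputs, the rescaling of predicted coverage to the observed total, and the normalization by total observed coverage are all assumptions made for illustration. The exact definition of the JCC score is given in the article.

```python
def predicted_junction_coverage(transcript_junctions, transcript_coverage):
    """Sum, for each junction, the coverage of every transcript containing it.

    transcript_junctions: dict mapping transcript ID -> list of junction IDs
    transcript_coverage:  dict mapping transcript ID -> estimated number of
                          junction-spanning reads contributed per junction
                          (hypothetical input, assumed derived from abundances)
    """
    predicted = {}
    for tx, junctions in transcript_junctions.items():
        for junction in junctions:
            predicted[junction] = predicted.get(junction, 0.0) + transcript_coverage[tx]
    return predicted


def compatibility_score(observed, predicted):
    """Return a normalized disagreement between observed and predicted coverage.

    0 means perfect agreement; larger values mean poorer agreement.
    Returns None when no junction-spanning reads were observed.
    """
    junctions = sorted(set(observed) | set(predicted))
    obs = [observed.get(j, 0.0) for j in junctions]
    pred = [predicted.get(j, 0.0) for j in junctions]

    total_obs = sum(obs)
    total_pred = sum(pred)
    if total_obs == 0:
        return None

    # Assumption: rescale predictions so both profiles have the same total,
    # which makes the score sensitive to the shape of the coverage profile only.
    scale = total_obs / total_pred if total_pred > 0 else 0.0
    return sum(abs(p * scale - o) for p, o in zip(pred, obs)) / total_obs
```

In this sketch, a gene whose transcript abundances reproduce the observed junction profile scores 0, while a gene whose estimated abundances put reads on the wrong junctions scores higher.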
Language: English
Article number: e201800175
Journal: Life Science Alliance
Volume: 2
Issue number: 1
Early online date: 17 Jan 2019
DOI: 10.26508/lsa.201800175
Status: Published - 1 Feb 2019

Cite this

A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs. / Soneson, Charlotte; Love, Michael I.; Patro, Rob; Hussain, Mohammed Shobbir; Malhotra, Dheeraj; Robinson, Mark D.

In: Life Science Alliance, Vol. 2, No. 1, e201800175, 01.02.2019.

Soneson, Charlotte ; Love, Michael I. ; Patro, Rob ; Hussain, Mohammed Shobbir ; Malhotra, Dheeraj ; Robinson, Mark D. / A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs. In: Life Science Alliance. 2019 ; Vol. 2, No. 1.
@article{761e0da0f09a47a48dba37cd3e5b8315,
title = "A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs",
abstract = "Most methods for statistical analysis of RNA-seq data take a matrix of abundance estimates for some type of genomic features as their input, and consequently the quality of any obtained results is directly dependent on the quality of these abundances. Here, we present the junction coverage compatibility (JCC) score, which provides a way to evaluate the reliability of transcript-level abundance estimates as well as the accuracy of transcript annotation catalogs. It works by comparing the observed number of reads spanning each annotated splice junction in a genomic region to the predicted number of junction-spanning reads, inferred from the estimated transcript abundances and the genomic coordinates of the corresponding annotated transcripts. We show that while most genes show good agreement between the observed and predicted junction coverages, there is a small set of genes that do not. Genes with poor agreement are found regardless of the method used to estimate transcript abundances, and the corresponding transcript abundances should be treated with care in any downstream analyses.",
author = "Charlotte Soneson and Love, {Michael I.} and Rob Patro and Hussain, {Mohammed Shobbir} and Dheeraj Malhotra and Robinson, {Mark D.}",
note = "{\textcopyright} 2019 Soneson et al.",
year = "2019",
month = "2",
day = "1",
doi = "10.26508/lsa.201800175",
language = "English",
volume = "2",
journal = "Life Science Alliance",
issn = "2575-1077",
publisher = "Life Science Alliance",
number = "1",

}

TY - JOUR
T1 - A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs
AU - Soneson, Charlotte
AU - Love, Michael I.
AU - Patro, Rob
AU - Hussain, Mohammed Shobbir
AU - Malhotra, Dheeraj
AU - Robinson, Mark D.
N1 - © 2019 Soneson et al.
PY - 2019/2/1
Y1 - 2019/2/1
AB - Most methods for statistical analysis of RNA-seq data take a matrix of abundance estimates for some type of genomic features as their input, and consequently the quality of any obtained results is directly dependent on the quality of these abundances. Here, we present the junction coverage compatibility (JCC) score, which provides a way to evaluate the reliability of transcript-level abundance estimates as well as the accuracy of transcript annotation catalogs. It works by comparing the observed number of reads spanning each annotated splice junction in a genomic region to the predicted number of junction-spanning reads, inferred from the estimated transcript abundances and the genomic coordinates of the corresponding annotated transcripts. We show that while most genes show good agreement between the observed and predicted junction coverages, there is a small set of genes that do not. Genes with poor agreement are found regardless of the method used to estimate transcript abundances, and the corresponding transcript abundances should be treated with care in any downstream analyses.

UR - http://www.life-science-alliance.org/content/2/1/e201800175
U2 - 10.26508/lsa.201800175
DO - 10.26508/lsa.201800175
M3 - Article
VL - 2
JO - Life Science Alliance
T2 - Life Science Alliance
JF - Life Science Alliance
SN - 2575-1077
IS - 1
M1 - e201800175
ER -