Genome-wide annotation of transcript boundaries using bacterial Rend-seq datasets

Research output: Contribution to journal › Article › peer-review

Abstract

Accurate annotation of the transcribed regions of genomes at single-nucleotide resolution is key to analysing RNA-seq data optimally, understanding regulatory events, and designing experiments. However, most genome annotations currently provided by GenBank lack information about untranslated regions (UTRs). Additionally, the genomic locations of non-coding RNAs, such as small RNAs (sRNAs) and anti-sense RNAs, are frequently missing. To provide such information, diverse RNA-seq technologies, such as Rend-seq, have been developed and applied to many bacterial species. However, incorporation of this vast amount of information into annotation files has been limited and is bioinformatically challenging, so UTRs and other non-coding elements are often overlooked or misrepresented. To overcome this problem, we present pyRAP (python Rend-seq Annotation Pipeline), a software package that analyses Rend-seq datasets to accurately resolve transcript boundaries genome-wide. We report the use of pyRAP to find novel transcripts, transcript isoforms, and RNase-dependent sRNA processing events. In Bacillus subtilis, we uncovered 63 novel transcripts and provide genomic coordinates with single-nucleotide resolution for 2218 5'UTRs, 1864 3'UTRs and 161 non-coding RNAs. In Escherichia coli, we report 117 novel transcripts, 2429 5'UTRs, 1619 3'UTRs and 91 non-coding RNAs, and in Staphylococcus aureus, 16 novel transcripts, 664 5'UTRs, 696 3'UTRs and 81 non-coding RNAs. Finally, we use pyRAP to produce updated annotation files for B. subtilis 168, E. coli K-12 MG1655 and S. aureus 8325 for use by the wider microbial genomics research community.

Original language: English
Article number: 001239
Journal: Microbial Genomics
Volume: 10
Issue number: 4
Publication status: Published - 26 Apr 2024

Keywords

  • Bacillus subtilis
  • Escherichia coli
  • Rend-seq
  • Staphylococcus aureus
  • pyRAP
  • sRNA

ASJC Scopus subject areas

  • Genetics
  • Molecular Biology
  • Epidemiology
  • Microbiology
