Piggy: A Rapid, Large-Scale Pan-Genome Analysis Tool for Intergenic Regions in Bacteria.

Harry Thorpe, Sion Bayliss, Samuel Sheppard, Edward Feil

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Background: The concept of the "pan-genome," which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences. Findings: To address this we present Piggy, a novel pipeline that emulates Roary except that it is based only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ("switched") intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli. Conclusions: For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).

Original language: English
Article number: giy015
Pages (from-to): 1-11
Number of pages: 11
Journal: GigaScience
Volume: 7
Issue number: 4
Early online date: 4 Mar 2018
DOIs
Publication status: Published - 1 Apr 2018

Fingerprint

Intergenic DNA
Bacteria
Genes
Genome
Staphylococcus aureus
Genomics
Pipelines
Alleles
Databases
Escherichia coli
Phenotype
Gene Expression
Proteins

Keywords

  • DNA, Intergenic
  • Escherichia coli/genetics
  • Genome, Bacterial
  • Genomics/methods
  • Staphylococcus aureus/genetics

Cite this

Piggy: A Rapid, Large-Scale Pan-Genome Analysis Tool for Intergenic Regions in Bacteria. / Thorpe, Harry; Bayliss, Sion; Sheppard, Samuel; Feil, Edward.

In: GigaScience, Vol. 7, No. 4, giy015, 01.04.2018, p. 1-11.

@article{a45acb7aa3e046b0abf09791c27ef93e,
title = "Piggy: A Rapid, Large-Scale Pan-Genome Analysis Tool for Intergenic Regions in Bacteria.",
abstract = "Background: The concept of the {"}pan-genome,{"} which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences. Findings: To address this we present Piggy, a novel pipeline that emulates Roary except that it is based only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ({"}switched{"}) intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli. Conclusions: For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).",
keywords = "DNA, Intergenic, Escherichia coli/genetics, Genome, Bacterial, Genomics/methods, Staphylococcus aureus/genetics",
author = "Harry Thorpe and Sion Bayliss and Samuel Sheppard and Edward Feil",
year = "2018",
month = "4",
day = "1",
doi = "10.5524/100410",
language = "English",
volume = "7",
pages = "1--11",
journal = "GigaScience",
issn = "2047-217X",
publisher = "Springer",
number = "4",
}

TY - JOUR

T1 - Piggy: A Rapid, Large-Scale Pan-Genome Analysis Tool for Intergenic Regions in Bacteria.

AU - Thorpe, Harry

AU - Bayliss, Sion

AU - Sheppard, Samuel

AU - Feil, Edward

PY - 2018/4/1

Y1 - 2018/4/1

AB - Background: The concept of the "pan-genome," which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences. Findings: To address this we present Piggy, a novel pipeline that emulates Roary except that it is based only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ("switched") intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli. Conclusions: For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).

KW - DNA, Intergenic

KW - Escherichia coli/genetics

KW - Genome, Bacterial

KW - Genomics/methods

KW - Staphylococcus aureus/genetics

UR - http://www.scopus.com/inward/record.url?scp=85051575732&partnerID=8YFLogxK

U2 - 10.5524/100410

DO - 10.5524/100410

M3 - Article

VL - 7

SP - 1

EP - 11

JO - GigaScience

JF - GigaScience

SN - 2047-217X

IS - 4

M1 - giy015

ER -