Applying rearrangement distances to enable plasmid epidemiology with pling

Daria Frolova, Leandro Lima, Leah Wendy Roberts, Leonard Bohnenkämper, Roland Wittler, Jens Stoye, Zamin Iqbal

Research output: Contribution to journal › Article › peer-review

Abstract

Plasmids are a key vector of antibiotic resistance, but the current bioinformatics toolkit is not well suited to tracking them. The rapid structural changes seen in plasmid genomes present considerable challenges to evolutionary and epidemiological analysis. Typical approaches are either low resolution (replicon typing) or use shared k-mer content to define a genetic distance. However, this distance can both overestimate plasmid relatedness by ignoring rearrangements, and underestimate by over-penalizing gene gain/loss. Therefore a model is needed which captures the key components of how plasmid genomes evolve structurally - through gene/block gain or loss, and rearrangement. A secondary requirement is to prevent promiscuous transposable elements (TEs) leading to over-clustering of unrelated plasmids. We choose the 'Double Cut and Join Indel' (DCJ-Indel) model, in which plasmids are studied at a coarse level, as a sequence of signed integers (representing genes or aligned blocks), and the distance between two plasmids is the minimum number of rearrangement events or indels needed to transform one into the other. We show how this gives much more meaningful distances between plasmids. We introduce a software workflow pling (https://github.com/iqbal-lab-org/pling), which uses the DCJ-Indel model, to calculate distances between plasmids and then cluster them. In our approach, we combine containment distances and DCJ-Indel distances to build a TE-aware plasmid network. We demonstrate superior performance and interpretability to other plasmid clustering tools on the 'Russian Doll' dataset and a hospital transmission dataset.
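
As a rough illustration of the signed-integer representation described in the abstract, the sketch below computes the plain DCJ distance, without indels, between two circular plasmids that contain exactly the same blocks. It uses the standard formula d = N - C, where N is the number of blocks and C is the number of cycles in the adjacency graph. This is not pling's implementation: the DCJ-Indel model also counts block gain and loss, which this simplified example does not handle. The function names are hypothetical and chosen only for this example.

```python
def extremities(block):
    """Return the (left, right) extremities of a signed block as it is read.

    A block read forwards goes tail -> head; a reversed block goes head -> tail.
    """
    b = abs(block)
    return ((b, "t"), (b, "h")) if block > 0 else ((b, "h"), (b, "t"))


def adjacencies(genome):
    """Map each extremity to the extremity it touches in a circular genome."""
    partner = {}
    n = len(genome)
    for i in range(n):
        _, right = extremities(genome[i])
        left, _ = extremities(genome[(i + 1) % n])  # wrap around: circular
        partner[right] = left
        partner[left] = right
    return partner


def dcj_distance(a, b):
    """Plain DCJ distance between two circular genomes with identical content.

    Assumption for this sketch: every block occurs exactly once in each
    genome, so no indels are needed and d = N - C applies.
    """
    ids_a, ids_b = sorted(map(abs, a)), sorted(map(abs, b))
    if ids_a != ids_b or len(set(ids_a)) != len(ids_a):
        raise ValueError("genomes must contain the same blocks, each exactly once")

    pa, pb = adjacencies(a), adjacencies(b)
    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Alternate between an adjacency of a and an adjacency of b
        # until the walk returns to an extremity already visited.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    return len(a) - cycles


# Two plasmids differing by a single inversion of block 2.
print(dcj_distance([1, 2, 3, 4], [1, -2, 3, 4]))  # -> 1
```

Identical plasmids give a distance of 0, and each additional rearrangement needed to reconcile the two block orders raises the distance, which is the intuition behind counting events rather than shared k-mers.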

Original language: English
Journal: Microbial Genomics
Volume: 10
Issue number: 10
Early online date: 14 Oct 2024
Publication status: Published - 31 Oct 2024

Data Availability Statement

All supporting data, code and protocols have been provided within the article or through supplementary data files. Ten supplementary figures and two supplementary tables are available with the online version of this article.

Keywords

  • plasmids
  • clustering
  • mobile genetic elements
  • rearrangements
  • transmission
  • whole genome analysis

ASJC Scopus subject areas

  • Epidemiology
  • Microbiology
  • Molecular Biology
  • Genetics
