Abstract
Analysing the flanking sequences surrounding genes of interest is often highly relevant to understanding the role of mobile genetic elements (MGEs) in horizontal gene transfer, particularly for antimicrobial-resistance genes. Here, we present Flanker, a Python package that performs alignment-free clustering of gene flanking sequences in a consistent format, allowing investigation of MGEs without prior knowledge of their structure. These clusters, known as ‘flank patterns’ (FPs), are based on Mash distances, allowing for easy comparison of similarity across sequences. Additionally, Flanker can be flexibly parameterized to fine-tune outputs by characterizing upstream and downstream regions separately, and investigating variable lengths of flanking sequence. We apply Flanker to two recent datasets describing plasmid-associated carriage of important carbapenemase genes (blaOXA-48 and blaKPC-2/3) and show that it successfully identifies distinct clusters of FPs, including both known and previously uncharacterized structural variants. For example, Flanker identified four Tn4401 profiles that could not be sufficiently characterized using TETyper or MobileElementFinder, demonstrating the utility of Flanker for flanking-gene characterization. Similarly, using a large (n=226) European isolate dataset, we confirm findings from a previous smaller study demonstrating association between Tn1999.2 and blaOXA-48 upregulation and demonstrate 17 FPs (compared to the 5 previously identified). More generally, the demonstration in this study that FPs are associated with geographical regions and antibiotic-susceptibility phenotypes suggests that they may be useful as epidemiological markers. Flanker is freely available under an MIT license at https://github.com/wtmatlock/flanker.
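The idea described in the abstract (cut out upstream and downstream windows of a chosen length around a gene, then group them by a Mash-style distance) can be illustrated with a minimal sketch. This is not Flanker's implementation or API: the function names `flanks`, `kmers`, `mash_like_distance` and `cluster` are hypothetical, the Jaccard index is computed exactly over full k-mer sets rather than estimated from MinHash sketches as Mash does, and single-linkage grouping under a fixed threshold is an assumption made here for illustration.

```python
import math
from itertools import combinations


def flanks(seq, start, end, length):
    """Return (upstream, downstream) windows of up to `length` bases
    around the gene occupying seq[start:end] (0-based, end-exclusive)."""
    upstream = seq[max(0, start - length):start]
    downstream = seq[end:end + length]
    return upstream, downstream


def kmers(seq, k):
    """Set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mash_like_distance(a, b, k=5):
    """Mash distance formula, D = -(1/k) * ln(2j / (1 + j)),
    applied to an exact Jaccard index j (assumption: no MinHash sketching)."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0
    j = len(ka & kb) / len(union)
    if j == 0:
        return 1.0  # no shared k-mers: maximal distance
    return -math.log(2 * j / (1 + j)) / k


def cluster(seqs, threshold, k=5):
    """Single-linkage grouping (assumed scheme): sequences closer than
    `threshold` share a label. Returns one integer label per sequence."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if mash_like_distance(seqs[i], seqs[j], k) <= threshold:
            parent[find(j)] = find(i)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(seqs))]
```

Calling `flanks` with different `length` values, and clustering the upstream and downstream windows separately, mirrors the parameterization mentioned in the abstract. Real flanking regions span thousands of bases and would call for a larger k and sketch-based estimation.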
| Original language | English |
|---|---|
| Article number | 000634 |
| Pages (from-to) | 000634 |
| Number of pages | 8 |
| Journal | Microbial Genomics |
| Volume | 7 |
| Issue number | 9 |
| Early online date | 24 Sept 2021 |
| DOIs | |
| Publication status | Published - 24 Sept 2021 |
Bibliographical note
Publisher Copyright: © 2021 The Authors.
Acknowledgements
The authors thank the EuSCAPE and Dutch CPE surveillance groups for making their data publicly available.
Funding
W.M. is supported by a scholarship from the Medical Research Foundation National PhD Training Programme in Antimicrobial Resistance Research (MRF-145-0004-TPG-AVISO). S.L. is a Medical Research Council Clinical Research Training Fellow (MR/T001151/1). L.P.S. is a Sir Henry Wellcome Postdoctoral Fellow (220422/Z/20/Z). A.S.W. and T.E.A.P. are National Institute for Health Research (NIHR) Senior Investigators. The computational aspects of this research were funded by the NIHR Oxford Biomedical Research Centre with additional support from the Wellcome Trust Core Award grant number 203141/Z/16/Z. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR nor the Department of Health. The research was supported by the NIHR Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance (NIHR200915) at the University of Oxford in partnership with Public Health England (PHE) and by the Oxford NIHR Biomedical Research Centre.
Keywords
- antimicrobial resistance (AMR)
- bioinformatics
- mobile genetic element (MGE)
- plasmid
- whole-genome sequencing
ASJC Scopus subject areas
- Epidemiology
- Microbiology
- Molecular Biology
- Genetics