Abstract
Bacterial genetic diversity is often described solely using base-pair changes despite a wide variety of other mutation types likely being major contributors. Tandem duplication/amplifications are thought to be widespread among bacteria but due to their often-intractable size and instability, comprehensive studies of these mutations are rare. We define a methodology to investigate amplifications in bacterial genomes based on read depth of genome sequence data as a proxy for copy number. We demonstrate the approach with Bordetella pertussis, whose insertion sequence element-rich genome provides extensive scope for amplifications to occur. Analysis of data for 2430 B. pertussis isolates identified 272 putative amplifications, of which 94% were located at 11 hotspot loci. We demonstrate limited phylogenetic connection for the occurrence of amplifications, suggesting unstable and sporadic characteristics. Genome instability was further described in vitro using long-read sequencing via the Nanopore platform, which revealed that clonally derived laboratory cultures produced heterogeneous populations rapidly. We extended this research to analyse a population of 1000 isolates of another important pathogen, Mycobacterium tuberculosis. We found 590 amplifications in M. tuberculosis, and like B. pertussis, these occurred primarily at hotspots. Genes amplified in B. pertussis include those involved in motility and respiration, whilst in M. tuberculosis, functions included intracellular growth and regulation of virulence. Using publicly available short-read data, we predicted previously unrecognized, large amplifications in B. pertussis and M. tuberculosis. This reveals the unrecognized and dynamic genetic diversity of B. pertussis and M. tuberculosis, highlighting the need for a more holistic understanding of bacterial genetics.
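As an illustration of the general idea of using read depth as a proxy for copy number, the following is a minimal sketch and not the authors' pipeline. It assumes per-base depths are already available as a list of numbers (for example, exported from an alignment), and the function names, window size, and thresholds (`window`, `min_ratio`, `min_windows`) are hypothetical choices made for this example only.

```python
from statistics import median


def window_means(depths, window):
    """Mean depth of consecutive non-overlapping windows (a trailing partial window is dropped)."""
    return [
        sum(depths[i:i + window]) / window
        for i in range(0, len(depths) - window + 1, window)
    ]


def call_amplifications(depths, window=1000, min_ratio=1.5, min_windows=2):
    """Return (start, end, mean_ratio) for runs of windows with elevated depth.

    Depth is normalised to the median window depth, taken here as the
    single-copy baseline (an assumption of this sketch). Coordinates are
    0-based and half-open.
    """
    means = window_means(depths, window)
    if not means:
        return []
    baseline = median(means)
    if baseline == 0:
        return []
    ratios = [m / baseline for m in means]

    calls = []
    start = None
    # A sentinel ratio of 0.0 closes any run that reaches the last window.
    for idx, ratio in enumerate(ratios + [0.0]):
        if ratio >= min_ratio:
            if start is None:
                start = idx
        elif start is not None:
            if idx - start >= min_windows:
                run = ratios[start:idx]
                calls.append((start * window, idx * window, round(sum(run) / len(run), 2)))
            start = None
    return calls


# Synthetic example: a 3 kb region at twice the background depth.
example_depths = [50] * 10000 + [100] * 3000 + [50] * 7000
print(call_amplifications(example_depths))
```

A median baseline is used here because it is robust to a minority of amplified windows; real data would also need handling of repeats, GC bias, and low-mappability regions, none of which this sketch attempts.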
Original language | English |
---|---|
Article number | 000761 |
Pages (from-to) | 1-15 |
Number of pages | 15 |
Journal | Microbial Genomics |
Volume | 8 |
Issue number | 2 |
Early online date | 10 Feb 2022 |
Publication status | Published - 31 Dec 2022 |
Bibliographical note
Funding Information:
J.A. was funded by a studentship from the University of Bath and Public Health England. We thank Josh Quick and Nick Loman, Institute of Microbiology and Infection, School of Biosciences, University of Birmingham for technical assistance with long-read sequencing on the Nanopore platform. The data analysis performed here would not have been possible without access to the bioinformatics resource, CLIMB (developed by the MRC, grant number MR/L015080/1). This work was made possible through support from CDC’s Advanced Molecular Detection (AMD) programme. The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
Keywords
- B. pertussis
- amplifications
- duplications
- genetic diversity
- genome structure
ASJC Scopus subject areas
- Epidemiology
- Microbiology
- Molecular Biology
- Genetics