Abstract
The unprecedented number of Streptococcus pneumoniae (the pneumococcus) genomes sequenced in recent years has accelerated the discovery of novel serotypes and highlighted the genetic diversity both between and within serotypes. A novel serotype should demonstrate a distinct capsular polysaccharide biosynthetic (cps) locus, capsular structure and serological profile. In the past 4 years alone, nine new serotypes have been identified. Accurate and timely serotyping of pneumococcal isolates is key to understanding their global distribution, their evolution and the response of the bacterial population to vaccination. However, current bioinformatics serotyping tools are infrequently updated and struggle to keep pace with the rapid discovery of new serotypes. To address these limitations, we built a comprehensive, curated library (SeroBAnk) encompassing all known pneumococcal serotypes; this resource is presented as an atlas on a dedicated, publicly accessible webpage (https://www.pneumogen.net/gps/#/serobank). Building upon this resource, we developed SeroBA(v2.0), a tool with an easy-to-update database that can accurately identify 102 of the 107 known pneumococcal serotypes (all except serotypes 24B, 24C, 24F, 7D and 6H) and 18 genetic subtypes within serotypes 6A, 6B, 11A, 19A, 19F and 33F. We validated SeroBA(v2.0) on 26,306 genomes from the Global Pneumococcal Sequencing project, on reference isolates and on simulated reads derived from the reference genetic sequences of the cps locus. We show that SeroBA(v2.0) can reliably detect the nine recently discovered serotypes. Additionally, we show that in silico serotypes inferred by SeroBA(v2.0) had high concordance with phenotypic serotypes determined by either Quellung or latex agglutination, both at the serotype level (88.9%; 15,945/17,933) and at the serogroup level (91.9%; 16,480/17,933).
Finally, we propose a community-contribution-based approach to ensure that SeroBA(v2.0) is maintained and updated as novel serotypes continue to be discovered. The global community can submit putative novel serotypes through our public repository on GitHub (https://github.com/GlobalPneumoSeq/seroba/issues). Submitted putative novel serotypes will be curated by experts in the field on the basis of the genetic sequence of the cps region, the capsular structure and the serological profile. SeroBA(v2.0) can be accessed at https://github.com/GlobalPneumoSeq/seroba.
| Original language | English |
|---|---|
| Article number | 001483 |
| Journal | Microbial Genomics |
| Volume | 11 |
| Issue number | 10 |
| Early online date | 17 Oct 2025 |
| DOIs | |
| Publication status | Published - 31 Oct 2025 |
Funding
This study was supported by:
- Rebecca L. Cooper Medical Research Foundation (Award Rebecca Cooper Fellowship). Principal Award Recipient: Catherine Satzke
- Bill and Melinda Gates Foundation (Award INV-003570). Principal Award Recipient: Stephen D. Bentley
- Wellcome (Award 206194). Principal Award Recipient: Stephen D. Bentley
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- genomic surveillance
- pneumococcus
- serotyping
- Streptococcus pneumoniae
- whole-genome sequencing
ASJC Scopus subject areas
- Epidemiology
- Microbiology
- Molecular Biology
- Genetics
Fingerprint
Dive into the research topics of 'SeroBA(v2.0) and SeroBAnk: a robust genome-based serotyping scheme and comprehensive atlas of capsular diversity in Streptococcus pneumoniae'. Together they form a unique fingerprint.