
SeroBA(v2.0) and SeroBAnk: a robust genome-based serotyping scheme and comprehensive atlas of capsular diversity in Streptococcus pneumoniae

Oliver Lorenz, Alannah C. King, Harry C.H. Hung, Feroze A. Ganaie, Anne L. Wyllie, Sam Manna, Catherine Satzke, Mark van der Linden, Neil Ravenscroft, Hans Christian Slotved, Lesley McGee, Moon H. Nahm, Stephen D. Bentley, Stephanie W. Lo

Research output: Contribution to journal › Article › peer-review


Abstract

The unprecedented number of Streptococcus pneumoniae (the pneumococcus) genomes sequenced in recent years has accelerated the discovery of novel serotypes and highlighted the genetic diversity both between and within serotypes. A novel serotype should demonstrate a distinct capsular polysaccharide biosynthetic (cps) locus, capsular structure and serological profile. In the past 4 years alone, nine new serotypes have been identified. Accurate and timely serotyping of pneumococcal isolates is key to understanding their global distribution, their evolution and the response of the bacterial population to vaccination. However, current bioinformatics serotyping tools are infrequently updated and struggle to keep pace with the rapid discovery of new serotypes. To address these limitations, we built a comprehensive and curated library (SeroBAnk) encompassing all known pneumococcal serotypes; this resource is presented as an atlas on a dedicated, publicly accessible webpage (https://www.pneumogen.net/gps/#/serobank). Building upon this resource, we developed SeroBA(v2.0), a tool with an easy-to-update database that can accurately identify 102 of 107 known pneumococcal serotypes (all except serotypes 24B, 24C, 24F, 7D and 6H) and 18 genetic subtypes within serotypes 6A, 6B, 11A, 19A, 19F and 33F. We validated SeroBA(v2.0) on 26,306 genomes from the Global Pneumococcal Sequencing project, on reference isolates and on simulated reads derived from the reference genetic sequences of the cps locus. We show that SeroBA(v2.0) can reliably detect the nine recently discovered serotypes. Additionally, we show that in silico serotypes inferred by SeroBA(v2.0) had high concordance with phenotypic serotypes determined by either Quellung or latex agglutination at the serotype level (88.9%; 15,945/17,933) and at the serogroup level (91.9%; 16,480/17,933).
Finally, we propose a community-contribution-based approach to ensure that SeroBA(v2.0) is maintained and updated as novel serotypes continue to be discovered. The global community can submit putative novel serotypes through our public repository on GitHub (https://github.com/GlobalPneumoSeq/seroba/issues). Submitted putative novel serotypes will be curated by experts in the field on the basis of the genetic sequence of the cps region, capsular structure and serological profile. SeroBA(v2.0) can be accessed at https://github.com/GlobalPneumoSeq/seroba.
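The concordance percentages quoted in the abstract follow directly from the reported counts. As a minimal sketch (the function name `concordance` is an illustrative assumption and is not part of SeroBA), the arithmetic can be reproduced as follows:

```python
def concordance(matches: int, total: int) -> float:
    """Return the percentage of isolates whose in silico call agrees
    with the phenotypic result. Hypothetical helper, not a SeroBA API."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * matches / total


# Counts as reported in the abstract.
TOTAL_ISOLATES = 17_933
serotype_level = concordance(15_945, TOTAL_ISOLATES)
serogroup_level = concordance(16_480, TOTAL_ISOLATES)

print(f"serotype level:  {serotype_level:.1f}%")   # 88.9%
print(f"serogroup level: {serogroup_level:.1f}%")  # 91.9%
```

The serogroup figure is the higher of the two because a call that names the wrong serotype within the correct serogroup still counts as concordant at the serogroup level.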

Original language: English
Article number: 001483
Journal: Microbial Genomics
Volume: 11
Issue number: 10
Early online date: 17 Oct 2025
Publication status: Published - 31 Oct 2025

Funding

This study was supported by the Rebecca L. Cooper Medical Research Foundation (Award: Rebecca Cooper Fellowship; Principal Award Recipient: Catherine Satzke), the Bill and Melinda Gates Foundation (Award INV-003570; Principal Award Recipient: Stephen D. Bentley) and Wellcome (Award 206194; Principal Award Recipient: Stephen D. Bentley).

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • genomic surveillance
  • pneumococcus
  • serotyping
  • Streptococcus pneumoniae
  • whole-genome sequencing

ASJC Scopus subject areas

  • Epidemiology
  • Microbiology
  • Molecular Biology
  • Genetics
