| Original language | English |
|---|---|
| Journal | Nature Biotechnology |
| DOIs | |
| Publication status | Published - 10 Sept 2025 |
Data Availability Statement
All analyses were conducted with data from public genome databases: the GTDB r214 complete dataset, the GenBank + RefSeq dataset (downloaded on February 15, 2024) and AllTheBacteria version 0.2.
Acknowledgements
This study was supported by grants from the National Natural Science Foundation of China (82341112 to W.S.), Chinese Scholarship Council scholarship (202308500105 to W.S.), EMBL Visitor/Sabbatical Program fellowship (to W.S.), Remarkable Innovation—Clinical Research Project (to W.S.), Joint Project of Pinnacle Disciplinary Group (to W.S.) and Kuanren Talents Program (to W.S.) of The Second Affiliated Hospital of Chongqing Medical University. We thank S. Wang (Peking University People’s Hospital), L. Roberts (Queensland University of Technology), S. Cai and L. Zhao (Chongqing Medical University) and R. Colquhoun (Edinburgh University) for using LexicMap and giving valuable feedback during the development. We thank D. Anderson for suggesting test datasets. We thank P. Wang (University of Montpellier) for comments on the paper and visualization. We thank D. Anderson, M. Hunt and D. Frolova for fruitful discussions.
Funding
Open access funding provided by European Molecular Biology Laboratory (EMBL).