Adapting Whisper for Regional Dialects: Enhancing Public Services for Vulnerable Populations in the United Kingdom

Melissa Torgbi, Andrew Clayman, Jordan J. Speight, Harish Tayyar Madabushi

Research output: Chapter or section in a book/report/conference proceeding › Chapter in a published conference proceeding

Abstract

We collect novel data in the public service domain to evaluate how well state-of-the-art automatic speech recognition (ASR) models capture regional differences in accents in the United Kingdom (UK), focusing on two accents from Scotland with distinct dialects. This study addresses real-world problems in which biased ASR models can lead to miscommunication in public services, disadvantaging individuals with regional accents, particularly those in vulnerable populations. We first examine the out-of-the-box performance of the Whisper large-v3 model on a baseline dataset and on our data. We then explore the impact of fine-tuning Whisper on performance in the two UK regions, and we investigate the effectiveness of existing model evaluation techniques for our real-world application through manual inspection of model errors. We observe that the Whisper model has a higher word error rate (WER) on our test datasets than on the baseline data, and that fine-tuning on a given dataset improves performance on the test dataset with the same domain and accent. The fine-tuned models also appear to perform better on test data from outside the region they were trained on, suggesting that fine-tuned models may be transferable within parts of the UK. Our manual analysis of model outputs reveals the benefits and drawbacks of using WER as an evaluation metric and of fine-tuning to adapt to regional dialects.
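For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. It is not the authors' evaluation code; the function name `word_error_rate` and the simple lowercase, whitespace-split normalization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace; real evaluations often apply heavier normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Under this definition every word-level mismatch carries the same weight, so a harmless spelling variant of a dialect word counts as much as an error that changes the meaning of an utterance.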

Original language: English
Title of host publication: VarDial 2025 - 12th Workshop on NLP for Similar Languages, Varieties and Dialects, Proceedings of the Workshop
Editors: Yves Scherrer, Tommi Jauhiainen, Nikola Ljubesic, Preslav Nakov, Jorg Tiedemann, Marcos Zampieri
Place of Publication: New York, U. S. A.
Publisher: Association for Computational Linguistics (ACL)
Pages: 29-38
Number of pages: 10
ISBN (Electronic): 9798891762084
Publication status: Published - 19 Jan 2025
Event: 12th Workshop on NLP for Similar Languages, Varieties and Dialects, VarDial 2025 - co-located with the 31st International Conference on Computational Linguistics, COLING 2025 - Abu Dhabi, United Arab Emirates
Duration: 19 Jan 2025 → …

Publication series

Name: VarDial 2025 - 12th Workshop on NLP for Similar Languages, Varieties and Dialects, Proceedings of the Workshop

Conference

Conference: 12th Workshop on NLP for Similar Languages, Varieties and Dialects, VarDial 2025 - co-located with the 31st International Conference on Computational Linguistics, COLING 2025
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 19/01/25 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Software
