Quantifying Source Speaker Leakage in One-to-One Voice Conversion

Scott Wellington, Xuechen Liu, Junichi Yamagishi

Research output: Chapter or section in a book/report/conference proceeding > Chapter in a published conference proceeding

Abstract

Using a multi-accented corpus of parallel utterances for use with commercial speech devices, we present a case study to show that it is possible to quantify a degree of confidence about a source speaker's identity in the case of one-to-one voice conversion. Following voice conversion using a HiFi-GAN vocoder, we compare information leakage for a range of speaker characteristics. Assuming a 'worst-case' white-box scenario, we quantify our confidence to perform inference and narrow the pool of likely source speakers, reinforcing the regulatory obligation and moral duty that providers of synthetic voices have to ensure the privacy of their speakers' data.

Original language: English
Title of host publication: BIOSIG 2024 - Proceedings of the 23rd International Conference of the Biometrics Special Interest Group
Editors: Fadi Boutros, Naser Damer, Meiling Fang, Marta Gomez-Barrero, Kiran Raja, Christian Rathgeb, Ana F. Sequeira, Massimiliano Todisco
Place of Publication: USA
Publisher: IEEE
Pages: 1-6
ISBN (Electronic): 9798350373714
ISBN (Print): 9798350373721
Publication status: Published - 11 Dec 2024
Event: 23rd International Conference of the Biometrics Special Interest Group, BIOSIG 2024 - Darmstadt, Germany
Duration: 25 Sept 2024 to 27 Sept 2024

Publication series

Name: BIOSIG 2024 - Proceedings of the 23rd International Conference of the Biometrics Special Interest Group

Conference

Conference: 23rd International Conference of the Biometrics Special Interest Group, BIOSIG 2024
Country/Territory: Germany
City: Darmstadt
Period: 25/09/24 to 27/09/24

Keywords

  • evaluation
  • privacy
  • voice conversion

ASJC Scopus subject areas

  • Agricultural and Biological Sciences (miscellaneous)
  • Computer Science Applications
  • Instrumentation
  • Pathology and Forensic Medicine
