ISIS and NISIS: New bilingual dual-channel speech corpora for robust speaker recognition

Amita Pal, Smarajit Bose, Mandar Mitra, Sandipan Roy

Research output: Chapter or section in a book/report/conference proceeding › Chapter in a published conference proceeding

Abstract

It is standard practice to use benchmark datasets to compare meaningfully the performance of competing speaker identification systems. Such datasets generally consist of speech recordings from different speakers made at a single point in time, typically in the same language. That is, the training and test sets both consist of speech recorded at the same time, in the same language, over the same recording channel. This is generally not the case in real-life applications. In this paper, we introduce a new database consisting of speech recordings of 105 speakers, made over four sessions, in two languages and simultaneously over two channels. This database provides scope for experiments on the loss in efficiency caused by mismatch in language, channel and recording session between training and test data. Results of experiments with MFCC-based GMM speaker models are presented to highlight the need for such benchmark datasets in identifying robust speaker identification systems.
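The mismatch conditions that such a database makes available can be enumerated mechanically. The following is a minimal sketch, not the authors' protocol: the abstract states only that there are two languages, two channels and four sessions, so the labels `lang_A`, `chan_1` and so on, as well as the function names, are hypothetical placeholders.

```python
from itertools import product

# Placeholder labels (assumptions): the abstract gives only the counts,
# i.e. two languages, two channels and four recording sessions.
LANGUAGES = ("lang_A", "lang_B")
CHANNELS = ("chan_1", "chan_2")
SESSIONS = (1, 2, 3, 4)


def recording_conditions():
    """Return every (language, channel, session) recording condition."""
    return list(product(LANGUAGES, CHANNELS, SESSIONS))


def mismatch_type(train, test):
    """Name the factors on which a train condition and a test condition differ."""
    names = ("language", "channel", "session")
    return tuple(n for n, a, b in zip(names, train, test) if a != b)


def trial_grid():
    """Group all ordered train/test condition pairs by their mismatch type."""
    conditions = recording_conditions()
    grid = {}
    for train, test in product(conditions, repeat=2):
        grid.setdefault(mismatch_type(train, test), []).append((train, test))
    return grid


if __name__ == "__main__":
    # List matched pairs first, then single-factor and multi-factor mismatches.
    for kind, pairs in sorted(trial_grid().items(), key=lambda kv: len(kv[0])):
        label = "+".join(kind) or "matched"
        print(f"{label}: {len(pairs)} train/test pairs")
```

Grouping pairs this way separates matched trials from trials that differ in exactly one factor, which is the comparison needed to attribute a drop in identification accuracy to language, channel or session alone. In a real experiment, matched pairs would still need disjoint training and test utterances within the same condition.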

Original language: English
Title of host publication: Proceedings of the 2012 International Conference on Image Processing, Computer Vision, and Pattern Recognition, IPCV 2012
Subtitle of host publication: Volume 2
Pages: 936-939
Number of pages: 4
Publication status: Published - 1 Dec 2012
Event: 2012 International Conference on Image Processing, Computer Vision, and Pattern Recognition, IPCV 2012 - Las Vegas, NV, United States
Duration: 16 Jul 2012 to 19 Jul 2012

Keywords

  • Classification accuracy
  • Gaussian mixture models
  • Mel frequency cepstral coefficients
  • Robust speaker recognition

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
