In time encoded speech (TES), the transmitted information relates to the distances between zero crossings and to the shape of the waveform between successive zero crossings. The quality of the reconstructed TES signal therefore depends on how accurately these original signal parameters are presented in the reconstructed signal. When the waveform parameter descriptors (symbols) are transmitted, the variable TES symbol generation rate has to be matched to constant-rate transmission channels using first-in first-out storage buffers. Because generation rates vary widely, at modest transmission rates these buffers overflow, destroying some of the symbols. In practical TES systems, therefore, the description of the original signal parameters also depends on the amount of buffer distortion introduced in the transmission process. This thesis presents two techniques for describing the waveshape more accurately than existing TES methods, four methods of controlling buffer overflow, and the auditory effects of these waveshape describing and buffer overflow control techniques. Using the two new waveshape describing techniques and a parabolic reconstruction technique, it is shown that a significant improvement in quality in high quality TES systems requires a substantial increase in precise original signal information. Ways of achieving this increase in original signal information without significantly increasing the data rate are suggested and demonstrated. Using the four buffer control strategies, it is shown that for the control strategies to operate satisfactorily, buffer overflow in the voiced regions should be avoided. It is then shown that this can be achieved without significantly increasing the transmission rate by exploiting properties of speech perception. Further, various methods of quantising TES parameters, and the tradeoffs between quantisation and buffer overflow distortion, are investigated.
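The rate-matching problem described above can be illustrated with a minimal sketch. It is not the thesis's implementation or any of its four control strategies: the function name `simulate_fifo`, the per-frame symbol counts, the drain rate, and the buffer capacity are all hypothetical, and the overflow policy assumed here (silently discarding symbols that arrive at a full buffer) is only one possible behaviour.

```python
from collections import deque


def simulate_fifo(symbols_per_frame, drain_per_frame, capacity):
    """Illustrative bounded FIFO between a variable-rate source and a
    constant-rate channel. Returns (sent, dropped, still_buffered)."""
    buffer = deque()
    sent = 0
    dropped = 0
    for frame_index, generated in enumerate(symbols_per_frame):
        # Variable-rate source: queue this frame's symbols if room remains.
        for k in range(generated):
            if len(buffer) < capacity:
                buffer.append((frame_index, k))
            else:
                dropped += 1  # assumed policy: overflowing symbols are lost
        # Constant-rate channel: remove at most a fixed number of symbols.
        for _ in range(min(drain_per_frame, len(buffer))):
            buffer.popleft()
            sent += 1
    return sent, dropped, len(buffer)


# Hypothetical burst: the second frame generates far more symbols than
# the channel can carry, so part of the burst overflows the buffer.
print(simulate_fifo([2, 8, 1, 0], drain_per_frame=3, capacity=4))
```

In this toy run the burst in the second frame exceeds the buffer capacity, so some of its symbols are lost even though the channel is idle in the last frame, which is the kind of buffer distortion the abstract refers to.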
|Date of Award