This work is concerned with encoding shape descriptors for a succession of waveform segments to enable the transmission of speech signals at a low data rate. The segmentation depends on the identification of waveform features in the speech signal, so the time encoding process produces an irregular data rate. The shape descriptors are related to the real and complex zeros of a waveform through the theory of zero-based signal representation. A study was made of the factors governing the data rate, the speech intelligibility and the buffer delay for a coding process based on waveform segmentation at zero-crossings. The redundancy in the average information conveyed by the zero-crossing data was investigated using conditional probability measurements, leading to the conclusion that a significant reduction in the data is available from coding procedures that exploit the correlation in the data sequence. Signal pre-emphasis and dynamic range were found to control the segmentation rate, while the variations in segmentation rate during an utterance determine the buffer size and delay. The transmission rate and the system delay necessary for time encoding were strongly influenced by the distortion arising from buffer management in matching the variable information rate to a constant transmission rate. Reducing the transmission rate by approximately a third was observed to introduce data underflow distortion into approximately 5% of the speech at a system delay setting of 200 ms. Finally, the performance of the time encoding process was assessed subjectively, using a reduced form of the Diagnostic Rhyme Test (DRT), and objectively, by comparing spectral density plots. The results indicate a data rate lower than that of delta modulation and a processing complexity lower than that of vocoders.
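The redundancy measurement summarised above can be illustrated with a minimal sketch. It is not the procedure used in this work: the function names (`zero_crossing_intervals`, `entropy`, `conditional_entropy`) are hypothetical, segment shape descriptors are ignored, and only the interval lengths between zero-crossings are treated as symbols. It assumes a first-order model in which the redundancy is the gap between the entropy of the intervals and their entropy conditioned on the preceding interval.

```python
import math
from collections import Counter


def zero_crossing_intervals(samples):
    """Lengths (in samples) of the segments between successive sign changes."""
    intervals = []
    last = None  # index of the previous zero-crossing, if any
    for i in range(1, len(samples)):
        # A sign change between adjacent samples marks a zero-crossing.
        if (samples[i - 1] < 0) != (samples[i] < 0):
            if last is not None:
                intervals.append(i - last)
            last = i
    return intervals


def entropy(symbols):
    """Empirical entropy of a symbol sequence, in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def conditional_entropy(symbols):
    """Empirical H(X_n | X_{n-1}) = H(X_{n-1}, X_n) - H(X_{n-1})."""
    pairs = list(zip(symbols, symbols[1:]))
    return entropy(pairs) - entropy(symbols[:-1])


if __name__ == "__main__":
    # Toy signal (assumed data, not speech): segment lengths alternate 3, 2, 3, ...
    signal = [1, 1, -1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, -1, -1, 1]
    iv = zero_crossing_intervals(signal)
    print(iv, entropy(iv), conditional_entropy(iv))
```

In this toy case each interval is fully determined by its predecessor, so the conditional entropy falls to zero even though the unconditioned entropy does not. Real speech would show a smaller gap, and a reliable estimate would need far more data than this example uses.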
Date of Award: 1981