Intelligibility of Speech Corrupted by Nonlinear Distortion
-
2011/07/24
-
Details
-
Personal Author:
-
Description:Attempts to predict the intelligibility of speech transmitted by a communication system have led to numerous models (see, for example, ANSI 1997; IEC 2003; Christiansen et al. 2010; Elhilali et al. 2003; Kates & Arehart 2005; Payton & Braida 1999; Steeneken & Houtgast 2002a; Yu et al. 2010). Of these, the speech intelligibility index (SII) (ANSI 1997) and the speech transmission index (STI) (IEC 2003) have received the most attention. Both provide an index of intelligibility from 0 to 1 based on the speech signal-to-noise ratio in discrete frequency bands. The frequency bands of the SII were originally chosen to reflect the psychoacoustic masking of test sounds by noise (critical bands). The method was later standardized with the speech spectrum alternatively broken down into fewer, broader frequency bands for convenience of calculation (one-third-octave and octave bands from 125 Hz to 8 kHz). The test signals are those naturally occurring in the communication system (i.e., speech and noise, the levels of which need to be determined separately). The STI focuses on the temporal modulation of speech sounds and adopted octave bands as the basis for calculating the modulation spectrum (Steeneken & Houtgast 2002b). It replaced speech by a probe signal to ensure that the modulation could be determined in each of the modulation frequency bands, which range from 0.63 to 12.5 Hz in the international standard. In modern communication systems the speech signal is often corrupted by the signal processing and electronic circuitry, as well as by noise, which in some circumstances introduces audible distortion and may degrade intelligibility. The SII and STI have been shown to predict speech intelligibility for a range of conditions in which speech understanding is impeded by continuous noise, but fail when the speech signal is corrupted by nonlinear distortion such as center clipping.
In these circumstances the performance of the SII has been improved by calculating the speech signal-to-'noise' (or distortion) ratio from the coherence, which needs to be determined for different amplitude ranges of the speech signal in order to assess the intelligibility (Kates & Arehart 2005). We have explored replacing the test signal of the STI by speech and adjusting the metric for the coherence between the original and corrupted speech, as a means for determining when the observed modulations are due to speech rather than 'noise' (Payton & Braida 1999; Goldsworthy & Greenberg 2004). Also, the contributions to intelligibility from speech information in nearby frequency bands are known not to be independent, and so cannot be simply summed as in some models (e.g., ANSI 1997), resulting in the need to estimate inter-band redundancy (Steeneken & Houtgast 1999; Brammer et al. 2010). In this paper we briefly describe our models and their application to speech-spectrum-shaped noise and center clipping. The latter occurs when a signal within a communication channel rapidly changes polarity from a non-zero value. An example is given in Figure 1, where the time history of a short segment (0.1 s) of a speech sound is shown (above) as well as a corrupted waveform in which 75% of the amplitude distribution of the speech sounds has been removed (below). [Description provided by NIOSH]
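As a rough illustration of the center clipping described above, the following Python sketch zeroes the lowest-magnitude samples of a waveform. It is not the authors' procedure: the function name `center_clip`, the reading of "75% of the amplitude distribution removed" as a 75th-percentile magnitude threshold, the choice to leave the surviving samples unchanged, and the amplitude-modulated tone used in place of speech are all assumptions made for the example.

```python
import math


def center_clip(samples, fraction):
    """Zero the samples whose magnitude lies in the lowest `fraction`
    of the magnitude distribution (hypothetical helper, see lead-in)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    magnitudes = sorted(abs(s) for s in samples)
    # Assumed threshold: the magnitude below which `fraction` of samples fall.
    threshold = magnitudes[int(fraction * len(magnitudes))]
    # Samples at or above the threshold pass unchanged; the rest become zero.
    clipped = [s if abs(s) >= threshold else 0.0 for s in samples]
    return clipped, threshold


# Toy stand-in for a 0.1 s speech segment: a 200 Hz tone with a slow
# amplitude modulation, sampled at 8 kHz (both values are assumptions).
fs = 8000
signal = [
    math.sin(2 * math.pi * 200 * n / fs)
    * (0.5 + 0.5 * math.sin(2 * math.pi * 10 * n / fs))
    for n in range(int(0.1 * fs))
]

clipped, threshold = center_clip(signal, 0.75)
zeroed = sum(1 for c in clipped if c == 0.0)
print(f"threshold = {threshold:.3f}, zeroed {zeroed} of {len(signal)} samples")
```

Only the high-amplitude peaks survive in the output, which loosely corresponds to the lower trace that the description attributes to Figure 1. Some definitions of center clipping also subtract the threshold from the surviving samples; that variant is omitted here.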
-
Subjects:
-
Keywords:
-
Publisher:
-
Document Type:
-
Funding:
-
Genre:
-
Place as Subject:
-
CIO:
-
Topic:
-
Location:
-
Pages in Document:287-292
-
NIOSHTIC Number:nn:20058846
-
Citation:10th International Congress on Noise as a Public Health Problem, July 24-28, 2011, London. Milton Keynes, United Kingdom: Institute of Acoustics, 2011 Jul; :287-292
-
Contact Point Address:A.J. Brammer, Ergonomic Technology Center, University of Connecticut Health Center, 263, Farmington Ave., Farmington CT 06030, USA
-
Email:brammer@uchc.edu
-
Federal Fiscal Year:2011
-
NORA Priority Area:
-
Performing Organization:University of Connecticut School of Medicine and Dentistry, Farmington, Connecticut
-
Peer Reviewed:False
-
Start Date:20060801
-
Source Full Name:10th International Congress on Noise as a Public Health Problem, July 24-28, 2011, London
-
End Date:20120731
-
Collection(s):
-
Main Document Checksum:urn:sha-512:8aef5cfa46df4d2213dca3545f5056abab5fe72e8c41c67e38e6f737701d5a9f81545c23c66d1f40cf4605f41478ad38ebd8b319d863d4d7d11a93bbc99fbb2a
-
Download URL:
-
File Type:
CDC STACKS serves as an archival repository of CDC-published products including scientific findings, journal articles, guidelines, recommendations, or other public health information authored or co-authored by CDC or funded partners.
As a repository, CDC STACKS retains documents in their original published format to ensure public access to scientific information.