This paper is published in Volume-4, Issue-4, 2018
Area
Speech Intelligibility
Author
Nitesh Kumar, Karunavathi R K
Org/Univ
Bangalore Institute of Technology, Bengaluru, Karnataka, India
Keywords
Articulation index, Speech intelligibility index, Speech transmission index, Extended SII, Short time objective intelligibility, Extended STOI
Citations
IEEE
Nitesh Kumar, Karunavathi R K. ESTOI for predicting the Intelligibility of speech, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.
APA
Nitesh Kumar, Karunavathi R K (2018). ESTOI for predicting the Intelligibility of speech. International Journal of Advance Research, Ideas and Innovations in Technology, 4(4) www.IJARIIT.com.
MLA
Nitesh Kumar, Karunavathi R K. "ESTOI for predicting the Intelligibility of speech." International Journal of Advance Research, Ideas and Innovations in Technology 4.4 (2018). www.IJARIIT.com.
Abstract
Intelligibility listening tests are necessary during the development and evaluation of speech processing algorithms, even though they are expensive and time-consuming. The proposed scheme uses a monaural intelligibility prediction algorithm that has the potential to replace some of these listening tests. The algorithm is similar to the Short-Time Objective Intelligibility (STOI) algorithm but works for a larger range of input signals. In contrast to STOI, Extended STOI (ESTOI) does not assume mutual independence between frequency bands. Instead, ESTOI incorporates spectral correlation by comparing complete 400-ms-long spectrograms of the noisy/processed speech and the clean speech signals. As a consequence, ESTOI can accurately predict the intelligibility of speech contaminated by temporally highly modulated noise sources, in addition to noisy signals processed with time-frequency weighting. We show that ESTOI can be interpreted in terms of an orthogonal decomposition of short-time spectrograms into intelligibility subspaces, i.e., a ranking of spectrogram features according to their importance to intelligibility.
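The spectrogram comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each short-time segment is a small matrix of band envelopes (bands by frames), that rows and then columns are normalized to zero mean and unit norm, and that the segment score is the average inner product of corresponding columns. The function names `segment_score`, `_normalize` and `_transpose`, and the 15-band by 30-frame segment size, are hypothetical choices made for illustration; the filterbank, framing and silence removal of a full predictor are omitted.

```python
import math
import random


def _normalize(vectors):
    """Subtract each vector's mean and scale it to unit Euclidean norm."""
    out = []
    for v in vectors:
        mean = sum(v) / len(v)
        centered = [x - mean for x in v]
        # Guard against all-constant vectors, whose norm would be zero.
        norm = math.sqrt(sum(x * x for x in centered)) or 1.0
        out.append([x / norm for x in centered])
    return out


def _transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def segment_score(clean, processed):
    """Compare two spectrogram segments given as lists of band rows.

    Assumed scheme: normalize each band over time, then each frame over
    bands, and average the inner products of corresponding frames.
    """
    def rows_then_columns(segment):
        rows = _normalize(segment)            # each band across time
        return _normalize(_transpose(rows))   # each frame across bands

    clean_cols = rows_then_columns(clean)
    proc_cols = rows_then_columns(processed)
    dots = [sum(a * b for a, b in zip(c, p))
            for c, p in zip(clean_cols, proc_cols)]
    return sum(dots) / len(dots)


if __name__ == "__main__":
    rng = random.Random(0)
    bands, frames = 15, 30  # hypothetical segment size
    clean = [[rng.random() for _ in range(frames)] for _ in range(bands)]
    noise = [[rng.random() for _ in range(frames)] for _ in range(bands)]
    print("clean vs clean:", segment_score(clean, clean))
    print("clean vs unrelated:", segment_score(clean, noise))
```

Because every frame is compared as a whole vector across bands, the score depends on the joint spectral pattern rather than on each band separately, which is how this sketch reflects the abstract's point that the bands are not treated as mutually independent. In a complete predictor, such segment scores would be averaged over all segments of the signal.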