ESTOI for predicting the Intelligibility of speech

Nitesh Kumar; Karunavathi R K

doi:XX.XXX/IJARIIT-V4I4-1467

This paper is published in Volume-4, Issue-4, 2018

Paper Details

Area: Speech Intelligibility

Author: Nitesh Kumar, Karunavathi R K

Org/Univ: Bangalore Institute of Technology, Bengaluru, Karnataka, India

Pub. Date: 13 August, 2018

Paper ID: V4I4-1467

Publisher: IJARIIT

Edition: Volume-4, Issue-4, 2018

Keywords: Articulation index, Speech intelligibility index, Speech transmission index, Extended SII, Short time objective intelligibility, Extended STOI

Citations

IEEE
Nitesh Kumar, Karunavathi R K. ESTOI for predicting the Intelligibility of speech, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.

APA
Nitesh Kumar, Karunavathi R K (2018). ESTOI for predicting the Intelligibility of speech. International Journal of Advance Research, Ideas and Innovations in Technology, 4(4) www.IJARIIT.com.

MLA
Nitesh Kumar, Karunavathi R K. "ESTOI for predicting the Intelligibility of speech." International Journal of Advance Research, Ideas and Innovations in Technology 4.4 (2018). www.IJARIIT.com.


Abstract

Intelligibility listening tests are necessary during the development and evaluation of speech processing algorithms, even though they are expensive and time-consuming. The proposed scheme uses a monaural intelligibility prediction algorithm, which has the potential to replace some of these listening tests. The proposed algorithm is similar to the Short-Time Objective Intelligibility (STOI) algorithm but works for a larger range of input signals. In contrast to STOI, Extended STOI (ESTOI) does not assume mutual independence between frequency bands. ESTOI also incorporates spectral correlation by comparing complete 400-ms spectrograms of the noisy/processed speech and the clean speech signals. As a consequence, ESTOI can accurately predict the intelligibility of speech contaminated by temporally highly modulated noise sources, in addition to noisy signals processed with time-frequency weighting. We show that ESTOI can be interpreted in terms of an orthogonal decomposition of short-time spectrograms into intelligibility subspaces, i.e., a ranking of spectrogram features according to their importance to intelligibility.
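
The comparison of short-time spectrograms described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the band-envelope spectrograms (bands x frames) are already computed, and it assumes that each segment is normalized first along time and then along frequency before the clean and processed segments are correlated. The function names, the `EPS` constant, and the default `seg_len` of 30 frames (chosen here to span roughly 400 ms at an assumed frame hop of about 13 ms) are hypothetical choices made for illustration.

```python
import numpy as np

EPS = 1e-12  # assumed small constant to avoid division by zero


def _normalize(a, axis):
    """Subtract the mean and scale to unit Euclidean norm along `axis`."""
    a = a - a.mean(axis=axis, keepdims=True)
    norm = np.linalg.norm(a, axis=axis, keepdims=True)
    return a / (norm + EPS)


def segment_score(clean_seg, proc_seg):
    """Correlation-style score for one (bands x frames) spectrogram segment."""
    # Normalize each band's temporal envelope (rows).
    x = _normalize(clean_seg, axis=1)
    y = _normalize(proc_seg, axis=1)
    # Normalize each frame's spectrum across bands (columns); this is the
    # step that lets correlation between frequency bands affect the score.
    x = _normalize(x, axis=0)
    y = _normalize(y, axis=0)
    n_frames = x.shape[1]
    # Average inner product between corresponding unit-norm columns.
    return float(np.sum(x * y) / n_frames)


def estoi_like(clean_spec, proc_spec, seg_len=30):
    """Average the segment score over all sliding segments of `seg_len` frames."""
    if clean_spec.shape != proc_spec.shape:
        raise ValueError("spectrograms must have the same shape")
    n_frames = clean_spec.shape[1]
    if n_frames < seg_len:
        raise ValueError("need at least seg_len frames")
    scores = [
        segment_score(clean_spec[:, m - seg_len:m], proc_spec[:, m - seg_len:m])
        for m in range(seg_len, n_frames + 1)
    ]
    return float(np.mean(scores))
```

Under these assumptions, a processed spectrogram identical to the clean one scores close to 1, while an unrelated spectrogram scores much lower. Producing the band-envelope spectrograms from audio (resampling, framing, and band grouping) is outside the scope of this sketch.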
