Publication Date

6-7-2021

Document Type

Article

Publication Title

Frontiers in Neuroscience

Editor

Josef P. Rauschecker

Volume

15

DOI

10.3389/fnins.2021.678029

Abstract

Speech perception often takes place in noisy environments, where multiple auditory signals compete with one another. Adding visual cues, such as a talker's face or lip movements, to an auditory signal can improve the intelligibility of speech in these suboptimal listening environments; this improvement is referred to as the audiovisual benefit. The current study aimed to delineate the signal-to-noise ratio (SNR) conditions under which visual presentation of the acoustic amplitude envelope has its most significant impact on speech perception. Seventeen adults with normal hearing were recruited. Participants were presented with spoken sentences in babble noise, in either auditory-only or auditory-visual conditions, at SNRs of −7, −5, −3, −1, and 1 dB. The visual stimulus was a sphere that varied in size in synchrony with the amplitude envelope of the target speech signal. Participants were asked to transcribe the sentences they heard. Results showed that transcription accuracy was significantly higher in the auditory-visual condition than in the auditory-only condition at SNRs of −3 and −1 dB, but no improvement was observed at the other SNRs. These results indicate that dynamic temporal visual information can benefit speech perception in noise, and that the facilitative effect of the visual amplitude envelope is greatest within an intermediate SNR range.
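
The sketch below is a minimal illustration, not part of the published study, of the two signal-processing steps the abstract implies: extracting a smoothed amplitude envelope (which could drive the size of the rendered sphere) and mixing speech with babble noise at a target SNR. The function names and the 30 Hz smoothing value are assumptions made for illustration, not parameters reported in the article.

import numpy as np
from scipy.signal import hilbert

def amplitude_envelope(speech, fs, smooth_hz=30.0):
    # Magnitude of the Hilbert analytic signal gives the raw envelope;
    # a moving average of roughly 1/smooth_hz seconds smooths it.
    env = np.abs(hilbert(speech))
    win = max(1, int(fs / smooth_hz))
    return np.convolve(env, np.ones(win) / win, mode="same")

def mix_at_snr(speech, babble, snr_db):
    # Scale the babble so the speech-to-noise power ratio equals snr_db
    # (e.g., -7, -5, -3, -1, or 1 dB), then return the mixture.
    babble = babble[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_babble = np.mean(babble ** 2)
    gain = np.sqrt(p_speech / (p_babble * 10.0 ** (snr_db / 10.0)))
    return speech + gain * babble

At presentation time, the envelope values could be mapped frame by frame to the radius of a rendered sphere, which is the general idea behind the visual analog described in the abstract.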

Funding Sponsor

College of Public Health & Health Professions of the University of Florida

Keywords

audiovisual speech recognition, multisensory gain, temporal coherence, amplitude envelope, SNR

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Department

Audiology
