Independent mechanisms of temporal and linguistic cue correspondence benefiting audiovisual speech processing
Abstract
When listening is difficult, seeing the face of the talker aids speech comprehension. Faces carry both temporal cues (the low-level physical correspondence between mouth movements and auditory speech) and linguistic cues (the learned correspondence between mouth shapes, or visemes, and speech sounds, or phonemes). Listeners participated in two experiments investigating how these cues may be used to process sentences when maskers are present. In Experiment I, faces were rotated to disrupt linguistic but not temporal cue correspondence. Listeners suffered a deficit in speech comprehension when the faces were rotated, indicating that visemes are processed in a rotation-dependent manner and that linguistic cues aid comprehension. In Experiment II, listeners were asked to detect pitch modulation in the target speech while viewing upright or inverted faces that matched either the target or the masker speech, so that performance differences could be attributed to binding, an early multisensory integration mechanism distinct from traditional late integration. Performance in this task replicated previous findings that temporal cue correspondence induces binding, but there was no behavioral evidence for a role of linguistic cues in binding. Together, these experiments indicate that temporal cues provide a speech processing benefit through binding, whereas linguistic cues provide a benefit through late integration.