Adapting Predictions: How Speaker Variability Shapes Prediction during Language Comprehension

SALA, MARCO
2026

Abstract

Humans regularly interact with speakers who vary in accent, voice, speech rate, and articulation. Despite this variability, listeners typically comprehend speech with remarkable ease. One explanation for this efficiency is that comprehension involves not only the bottom-up integration of incoming information but also top-down predictions about upcoming input, guided by contextual information and internal knowledge. Nevertheless, it remains unclear how detailed these predictions are and how speaker variability influences their generation. In Studies 1 and 2, we used behavioral and electroencephalography (EEG) methods to investigate whether comprehenders anticipate the phonological form of a predictable word. To do so, we capitalized on the fact that foreign-accented speakers typically make systematic phonological errors. In both studies, participants read sentence fragments followed by a final word spoken by either a native-accented or a foreign-accented speaker. The spoken word was either predictable or unpredictable given the sentence context. Crucially, the speaker's identity was either cued by an image of the speaker's face or left uncued. In Study 1, participants performed a lexical decision task on the spoken target. In Study 2, they judged, on a proportion of trials and without time pressure, whether the final word matched their expectations, while we measured Event-Related Potentials (ERPs). We observed that cueing the speaker's identity was associated with both faster lexical decision times and a smaller negative amplitude (300-500 ms after word onset) for predictable words but not for unpredictable words. These results indicate that prediction relies on flexible and finely tuned processes capable of accommodating interindividual phonological variability, suggesting that lexical information is pre-activated at the phonological level.
Study 3 investigates how listeners cope with acoustic differences between native speakers and whether speaker variability, in turn, influences prediction during naturalistic speech processing. In this ongoing study, we collected EEG data while participants listened to narrative stories under two experimental conditions: in the Single-speaker condition, one speaker narrated the entire story; in the Multi-speaker condition, different speakers narrated different sections of the story. We plan to apply Temporal Response Function (TRF) analysis to model how the brain encodes stimulus features related to both speech perception and predictive processing across listening conditions. Overall, this work underscores the flexibility of the human brain in processing linguistic input. Our findings suggest that prediction operates as a flexible mechanism that accommodates speaker variability, and they point towards future investigations of its function in naturalistic speech.
23 March 2026
English
PERESSOTTI, FRANCESCA
Università degli studi di Padova
Files in this item:
tesi_Marco_Sala.pdf
Access: open access
License: All rights reserved
Size: 5.3 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/362209
The NBN code of this thesis is URN:NBN:IT:UNIPD-362209