Evaluating scientific theories as predictive models in language neuroscience
Abstract
Modern data-driven encoding models are highly effective at predicting brain responses to language stimuli. However, these models struggle to explain the underlying phenomena: which features of the stimulus drive the response? We present Question Answering encoding models, a method for converting qualitative theories of language selectivity into highly accurate, interpretable models of brain responses. QA encoding models annotate a language stimulus by using a large language model to answer yes-no questions corresponding to qualitative theories. A compact QA encoding model that uses only 35 questions outperforms existing baselines at predicting brain responses in both fMRI and ECoG data. The model weights also provide easily interpretable maps of language selectivity across cortex; these maps show quantitative agreement with meta-analyses of the existing literature and with selectivity maps identified in a follow-up fMRI experiment. These results demonstrate that LLMs can bridge the widening gap between qualitative scientific theories and data-driven models.
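The pipeline the abstract describes can be sketched in a few lines: an LLM answers yes-no questions about each stimulus segment, yielding a binary feature matrix, and a regularized linear model maps those features to brain responses. The sketch below is a hypothetical illustration, not the authors' implementation; the question list, the `answer_question` stand-in (a toy word-overlap rule in place of a real LLM call), and the simulated responses are all assumptions.

```python
import numpy as np

def answer_question(phrase: str, question: str) -> int:
    """Stand-in for an LLM yes-no judgment. A real QA encoding model
    would prompt a large language model here; this toy rule just
    checks for word overlap so the example is self-contained."""
    return int(any(w in phrase.lower() for w in question.lower().split()))

# Illustrative yes-no questions standing in for qualitative theories.
questions = [
    "does the text mention a number",
    "does the text describe movement",
    "does the text involve a person",
]
phrases = ["three dogs ran fast", "a quiet empty room", "she spoke softly"]

# Binary feature matrix: one row per stimulus segment, one column per question.
X = np.array([[answer_question(p, q) for q in questions] for p in phrases],
             dtype=float)

# Simulated voxel responses (in practice: fMRI or ECoG recordings).
rng = np.random.default_rng(0)
true_w = rng.normal(size=(len(questions), 2))  # two toy "voxels"
Y = X @ true_w + 0.01 * rng.normal(size=(len(phrases), 2))

# Ridge regression: W = (X^T X + lam*I)^(-1) X^T Y. Each entry of W is
# directly interpretable as how strongly one question drives one voxel,
# which is what yields the selectivity maps described above.
lam = 1e-3
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)
```

Because each feature is a named question rather than an opaque embedding dimension, the fitted weights `W` can be read off as a selectivity profile per voxel.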