Chinese researchers have presented a synchronized multimodal neuroimaging dataset covering almost 10,000 Chinese words for studying how the brain processes language, according to the Chinese Academy of Sciences (CAS) on Sunday.
The dataset contains sorted and preprocessed functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) data recorded from the same 12 healthy volunteers while they listened to six hours of naturalistic stories, along with high-resolution structural, diffusion MRI and resting-state fMRI data for each participant, according to the research article recently published in the journal Scientific Data.
The process of neuroimaging data collection. /Chinese Academy of Sciences
The researchers from the CAS Institute of Automation also provided rich linguistic annotations for the stimuli, including word frequencies, syntactic tree structures, time-aligned characters and words, and various types of word and character embeddings.
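To give a concrete sense of what such annotations look like, the short Python sketch below shows one hypothetical way the time-aligned word records could be organized for analysis; all field names and values are illustrative assumptions, not the dataset's actual schema.

from dataclasses import dataclass

@dataclass
class WordAnnotation:
    word: str          # the Chinese word as it appears in the story
    onset_s: float     # onset time within the audio stimulus, in seconds
    offset_s: float    # offset time within the audio stimulus, in seconds
    frequency: float   # corpus word frequency (e.g., occurrences per million)
    embedding: list    # a word or character embedding vector

# Illustrative records, not taken from the dataset.
annotations = [
    WordAnnotation("研究", 1.20, 1.65, 310.0, [0.12, -0.40, 0.88]),
    WordAnnotation("大脑", 1.70, 2.10, 185.0, [0.05, 0.33, -0.21]),
]

def words_at(time_s: float, anns: list) -> list:
    """Return the words whose time span covers the given moment, e.g. to align stimuli with scan times."""
    return [a.word for a in anns if a.onset_s <= time_s <= a.offset_s]

print(words_at(1.5, annotations))  # -> ['研究']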
The synchronized data were collected separately from the same group of participants, who listened to the story materials first during fMRI scanning and then during MEG recording, making the dataset well suited to studying the dynamic processing of language comprehension, the research article said. In addition, with a large vocabulary drawn from stories on various topics, the dataset can serve as a brain benchmark for evaluating and improving computational language models.
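One common way a dataset like this is used as a brain benchmark is to fit an encoding model that predicts measured brain responses from a language model's features and to score the language model by its held-out prediction accuracy. The Python sketch below illustrates that general idea with synthetic data and ridge regression; the array shapes, regression method and scoring are assumptions for illustration, not the published evaluation protocol.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical data: 1000 fMRI volumes, with language-model features
# (e.g., word embeddings averaged within each volume) and 500 voxels.
X = rng.standard_normal((1000, 300))   # language-model features per volume
Y = rng.standard_normal((1000, 500))   # measured fMRI response per voxel

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

# Linear encoding model: predict each voxel's response from the features.
model = Ridge(alpha=10.0).fit(X_train, Y_train)
Y_pred = model.predict(X_test)

# Per-voxel correlation between predicted and measured responses is a typical
# score; a language model that better matches the brain yields higher values.
scores = [np.corrcoef(Y_test[:, v], Y_pred[:, v])[0, 1] for v in range(Y.shape[1])]
print(f"mean prediction correlation across voxels: {np.mean(scores):.3f}")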
When the brain processes language, it needs to mobilize neurons across multiple brain regions to work together in real time.
Building neuroimaging datasets with high spatial and temporal resolution helps researchers better understand how these brain regions work together and is crucial for exploring the mechanisms of language processing in the brain.
(Cover via CFP)