ChineseEEG-2: An EEG Dataset for Multimodal Semantic Alignment and Neural Decoding during Reading and Listening
Source: DataCite Commons | Updated: 2025-11-20 | Indexed: 2025-04-16
Download link:
https://www.scidb.cn/detail?dataSetId=bd666609b2464a42a0503f1eb96524cc
Description:
Existing electroencephalography (EEG) datasets are mainly based on English and therefore have limitations when representing Chinese. Although EEG datasets built on linguistic stimuli exist, the available resources are limited, and many face the problem of teacher-forcing. In future studies, we plan to advance a unified multimodal encoder for semantic decoding, which calls for more supporting data. To bridge this gap, we introduce ChineseEEG-2, a high-density EEG dataset that extends ChineseEEG with both reading-aloud and auditory listening tasks. As a unique multimodal EEG dataset featuring synchronized reading and listening tasks based on the same corpus, ChineseEEG-2 enables the exploration of how the brain processes language across the visual and auditory modalities in the context of Chinese natural language. It offers valuable insights into multimodal semantic alignment, neural decoding, and the alignment between large language models and neural processes, contributing to the development of BCI systems for language decoding.

Important Note: This version of the dataset is no longer maintained. For the latest data and updates, please refer to the official, maintained version at: https://doi.org/10.57760/sciencedb.CHNNeuro.00001
Provider:
Science Data Bank
Created:
2025-02-14
Collected Summary
Dataset Introduction

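Background and Challenges
Background Overview
ChineseEEG-2 is a high-density EEG dataset focused on multimodal semantic alignment and neural decoding in Chinese natural language processing. It uniquely combines reading and listening tasks based on the same corpus, providing a valuable resource for studying language processing across the visual and auditory modalities.
The summary above was collected and generated by 遇见数据集.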



