
The Phantom EEG Dataset

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11238928
Description:
When using this dataset, please cite the following paper, which also provides more information about the dataset:

Xu, X., Wang, B., Xiao, B., Niu, Y., Wang, Y., Wu, X., & Chen, J. (2024). Beware of Overestimated Decoding Performance Arising from Temporal Autocorrelations in Electroencephalogram Signals. arXiv preprint arXiv:2405.17024.

1 Metadata

Brief introduction
The present work aims to demonstrate that temporal autocorrelations (TA) significantly affect various BCI tasks even in conditions without neural activity. We used watermelons as phantom heads and found that decoding performance is overestimated when continuous EEG data with the same class label are split into training and test sets. More details can be found in Motivation. Because watermelons cannot perform any experimental task, the recordings can be reorganized into the format of various actual EEG datasets without the need to collect new EEG data as previous work did (examples in Domain Studied).

Measurement devices
Manufacturer: NeuroScan SynAmps2 system (Compumedics Limited, Victoria, Australia)
Configuration: 64-channel Ag/AgCl electrode cap with a 10/20 layout

Species
Watermelons. Ten watermelons served as phantom heads.

Domain Studied
Overestimated decoding performance in EEG decoding. The following BCI datasets, covering various BCI tasks, have been reorganized using the Phantom EEG Dataset. The pitfall was found in four of the five tasks.
- CVPR dataset [1] for the image decoding task.
- DEAP dataset [2] for the emotion recognition task.
- KUL dataset [3] for the auditory spatial attention decoding task.
- BCIIV2a dataset [4] for the motor imagery task (the pitfall was absent because a rapid-design paradigm was used during EEG recording).
- SIENA dataset [5] for the epilepsy detection task.

Tasks Completed
Resting state; however, the data can be reorganized into the format of any BCI task.
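As a rough illustration of what "reorganizing" a continuous resting-state recording into a task format could mean, the sketch below cuts a recording into fixed-length trials and attaches arbitrary class labels to them. It is not the authors' conversion script: the function name `epoch_index`, the 2 s trial length, and the class names are all hypothetical, and only the 1000 Hz sampling rate is taken from the Recordings description.

```python
from typing import List, Tuple


def epoch_index(n_samples: int, sfreq: float, trial_sec: float,
                class_cycle: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, stop, label) sample ranges tiling a continuous recording.

    Labels are assigned by cycling through class_cycle; they carry no
    information about the signal, which is the point of a phantom recording.
    """
    trial_len = int(round(trial_sec * sfreq))
    epochs = []
    # Step through the recording in whole trials; drop any incomplete tail.
    for k, start in enumerate(range(0, n_samples - trial_len + 1, trial_len)):
        label = class_cycle[k % len(class_cycle)]
        epochs.append((start, start + trial_len, label))
    return epochs


# Hypothetical paradigm: 10 s of data at 1000 Hz, 2 s trials, three made-up classes.
epochs = epoch_index(n_samples=10_000, sfreq=1000.0, trial_sec=2.0,
                     class_cycle=["class_a", "class_b", "class_c"])
for start, stop, label in epochs:
    print(start, stop, label)
```

Each tuple can then be used to slice a samples × channels array; a real target dataset would dictate its own trial lengths, class counts, and ordering.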
Dataset Name
The Phantom EEG Dataset

Dataset license
Creative Commons Attribution 4.0 International

Code
The code for reading the data files (.cnt or .set) is in the “code” folder. To run it, install the mne and numpy packages, for example via pip:

    pip install mne==1.3.1
    pip install numpy

You can then use “BID2WMCVPR.py” to convert the BIDS dataset to the WM-CVPR dataset, or “CNTK2WMCVPR.py” to convert the CNT dataset to the WM-CVPR dataset. Code for reorganizing datasets other than CVPR [1] will be released on GitHub after the review.

Data information
- CNT: the raw data. Each subject file (S*.cnt) contains the following information:
  EEG.data: EEG data (samples × channels)
  EEG.srate: sampling frequency of the saved data
  EEG.chanlocs: channel numbers (1 to 68; ‘EKG’, ‘EMG’, ‘VEO’ and ‘HEO’ were not recorded)
- BIDS: an extension to the brain imaging data structure for electroencephalography. BIDS primarily addresses the heterogeneity of data organization by following the FAIR principles [6]. Each subject folder (sub-S*/eeg/) contains the following files:
  sub-S*_task-RestingState_channels.tsv: channel numbers (1 to 68; ‘EKG’, ‘EMG’, ‘VEO’ and ‘HEO’ were not recorded)
  sub-S*_task-RestingState_eeg.json: general information about the dataset
  sub-S*_task-RestingState_eeg.set: EEG data (samples × channels)
  sub-S*_task-RestingState_events.tsv: the events during recording. We organized events using block-design and rapid-event-design. However, these events do not need to be considered in any subsequent data reorganization, as watermelons cannot follow any experimental instructions.
- code: see Code above for more information.
- readme.md: information about the dataset.

Recordings
An additional electrode was placed on the lower part of the watermelon as the physiological reference, and the forehead served as the ground site. The inter-electrode impedances were kept below 20 kΩ.
Data were recorded at a sampling rate of 1000 Hz. EEG recordings for each watermelon lasted more than 1 hour to ensure sufficient data for the decoding task.

Citation and more information
The citation will be updated after the review period is completed. We will provide more information about this dataset (e.g. the units of the captured data) once our work is accepted; because the work is currently under review, we are not allowed to disclose more information under the relevant requirements. All metadata will be provided as a backup on GitHub and will be available after the review period is completed.

2 Motivation

Researchers have reported high decoding accuracy (>95%) using non-invasive electroencephalogram (EEG) signals for brain-computer interface (BCI) decoding tasks such as image decoding, emotion recognition, auditory spatial attention detection, and epilepsy detection. Since these EEG data were usually collected with well-designed paradigms in the lab, some researchers doubted the reliability and robustness of the corresponding decoding methods and proposed that such decoding accuracy was overestimated due to the inherent temporal autocorrelations (TA) of EEG signals [7]–[9]. However, the coupling between stimulus-driven neural responses and EEG temporal autocorrelations makes it difficult to confirm whether this overestimation truly exists. Other researchers argue that the effect of TA in EEG data on decoding is negligible and becomes a significant problem only under specific experimental designs in which subjects do not have enough resting time [10], [11].

Lacking a formal problem formulation, previous studies [7]–[9] only proposed that block design should be avoided in order to escape the pitfall. However, the impact of TA is avoided only when each EEG trial is not further segmented into several samples; otherwise, the overfitting, and hence the pitfall, still occurs.
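The segmentation pitfall described above can be reproduced on purely synthetic noise. The following is a minimal sketch under loud assumptions, not the authors' framework: a random walk stands in for temporally autocorrelated phantom recordings, class labels are assigned to blocks at random, the feature is the per-channel segment mean, and a 1-nearest-neighbour rule stands in for a decoder. All function names and parameters are hypothetical. A leave-one-block-out estimate is computed alongside the random segment split as a reference.

```python
import random


def simulate_drifting_noise(n_samples, n_channels, rng):
    """Random-walk noise: a crude stand-in for autocorrelated recordings."""
    state = [0.0] * n_channels
    signal = []
    for _ in range(n_samples):
        state = [s + rng.gauss(0.0, 1.0) for s in state]
        signal.append(state)
    return signal  # samples x channels


def segment_features(signal, block_len, seg_len, block_labels):
    """Cut each labelled block into segments; feature = per-channel mean."""
    feats, labels, blocks = [], [], []
    for b, label in enumerate(block_labels):
        for start in range(b * block_len, (b + 1) * block_len, seg_len):
            seg = signal[start:start + seg_len]
            feats.append([sum(col) / len(seg) for col in zip(*seg)])
            labels.append(label)
            blocks.append(b)
    return feats, labels, blocks


def nearest_label(x, candidates, feats, labels):
    """1-nearest-neighbour prediction over the candidate training indices."""
    best = min(candidates,
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, feats[j])))
    return labels[best]


def accuracy(test_idx, train_for, feats, labels):
    """Fraction of test segments whose nearest training segment shares the label."""
    hits = sum(nearest_label(feats[i], train_for(i), feats, labels) == labels[i]
               for i in test_idx)
    return hits / len(test_idx)


rng = random.Random(0)
n_blocks, block_len, seg_len, n_channels = 40, 400, 20, 8
signal = simulate_drifting_noise(n_blocks * block_len, n_channels, rng)

# Arbitrary, balanced block labels: the noise contains no class information.
block_labels = [0, 1] * (n_blocks // 2)
rng.shuffle(block_labels)
feats, labels, blocks = segment_features(signal, block_len, seg_len, block_labels)

# Split 1: shuffle all segments, so train and test share blocks.
idx = list(range(len(feats)))
rng.shuffle(idx)
n_test = len(idx) // 5
test_idx, train_idx = idx[:n_test], idx[n_test:]
random_split_acc = accuracy(test_idx, lambda i: train_idx, feats, labels)

# Split 2: leave-one-block-out, so no test segment shares a block with training data.
all_idx = range(len(feats))
block_split_acc = accuracy(
    all_idx,
    lambda i: [j for j in all_idx if blocks[j] != blocks[i]],
    feats, labels)

print("random segment split :", round(random_split_acc, 3))
print("leave-one-block-out  :", round(block_split_acc, 3))
```

Because the labels are arbitrary, any accuracy well above chance on the random segment split can only come from test segments sitting next to training segments of the same block, which is the overestimation at issue.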
In contrast, when a correct data-splitting strategy is used (e.g. separating training and test data in time), the pitfall can be avoided even with a block design. In our framework, we proposed the concept of "domain" to represent the EEG patterns resulting from TA and then used phantom EEG to remove stimulus-driven neural responses for verification. The results confirmed that TA, which is always present in EEG data, adds unique domain features to each continuous segment of EEG. Specifically, when a segment of EEG data with a single class label is split into multiple samples, the classifier associates the samples' class label with the domain features, which interferes with the learning of class-related features. This leads to overestimated decoding performance for test samples from domains seen during training and to poor accuracy for test samples from unseen domains (as in real-world applications). Importantly, our work suggests that the key to reducing the impact of EEG TA on BCI decoding is to decouple class-related features from domain features in the actual EEG dataset. Our proposed unified framework serves as a reminder to BCI researchers of the impact of TA on their specific BCI tasks and is intended to guide them in selecting an appropriate experimental design, splitting strategy and model construction.

3 Rationale for using watermelon as the phantom head

We must point out that the "phantom EEG" does not in fact contain any "EEG" but records only noise: a watermelon is not a brain and does not generate any electrical signals. Therefore, the recorded electrical noise, even when amplified using equipment typically used for EEG, does not constitute EEG data under the definition of EEG. This is why previous researchers called it "phantom EEG". Some researchers may therefore think that it is questionable to use a watermelon to obtain phantom EEG.
However, the use of a phantom head allows researchers to evaluate the performance of neural-recording equipment and proposed algorithms without the effects of neural activity variability, artifacts, and potential ethical issues. Phantom heads used in previous studies include digital models [12]–[14], real human skulls [15]–[17], artificial physical phantoms [18]–[24] and watermelons [25]–[40]. Because of their similar conductivity to human tissue, their similar size and shape to the human head, and their ease of acquisition, watermelons are widely used as "phantom heads".

Most of these works used a watermelon as a control: results obtained from the neural signals of human subjects could not be reproduced with the phantom head, which showed that those results were indeed caused by neural signals. For example, Mutanen et al. [35] proposed that “the fact that the phantom head stimulation did not evoke similar biphasic artifacts excludes the possibility that residual induced artifacts, with the current TMS-compatible EEG system, could explain these components”.

Our work differs significantly from most previous works. It is the first to find that phantom EEG exhibits the effect of TA on BCI decoding even when only noise is recorded, indicating the inherent existence of TA in EEG data. The conclusion we hope to draw is that some current works may obtain their overestimated decoding performance without truly using stimulus-driven neural responses. A similar logic can be found in a neuroscience review article [41], which noted that EEG recordings from a phantom head (watermelon) remind us that background noise may appear as positive results without proper statistical precautions.

Reference
[1] C. Spampinato, S. Palazzo, I. Kavasidis, D. Giordano, N. Souly, and M. Shah, “Deep Learning Human Mind for Automated Visual Classification,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jul. 2017, pp. 4503–4511.
[2] S. Koelstra et al., “DEAP: A Database for Emotion Analysis Using Physiological Signals,” IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 18–31, 2012.
[3] N. Das, T. Francart, and A. Bertrand, “Auditory Attention Detection Dataset KULeuven.” Zenodo, Aug. 27, 2020.
[4] M. Tangermann et al., “Review of the BCI Competition IV,” Front. Neurosci., vol. 6, 2012.
[5] P. Detti, G. Vatti, and G. Zabalo Manrique de Lara, “EEG Synchronization Analysis for Seizure Prediction: A Study on Data of Noninvasive Recordings,” Processes, vol. 8, no. 7, Art. no. 7, 2020.
[6] C. R. Pernet et al., “EEG-BIDS, an extension to the brain imaging data structure for electroencephalography,” Sci Data, vol. 6, no. 1, p. 103, 2019.
[7] R. Li et al., “The Perils and Pitfalls of Block Design for EEG Classification Experiments,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 1, pp. 316–333, 2021.
[8] H. Ahmed, R. B. Wilbur, H. M. Bharadwaj, and J. M. Siskind, “Object classification from randomized EEG trials,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2021, pp. 3844–3853.
[9] H. M. Bharadwaj, R. B. Wilbur, and J. M. Siskind, “Still an Ineffective Method With Supertrials/ERPs—Comments on ‘Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features,’” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 11, pp. 14052–14054, 2023.
[10] S. Palazzo, C. Spampinato, I. Kavasidis, D. Giordano, J. Schmidt, and M. Shah, “The effects of experiment duration and supertrial analysis on EEG classification methods,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–3, 2024.
[11] S. Palazzo, C. Spampinato, J. Schmidt, I. Kavasidis, D. Giordano, and M. Shah, “Correct block-design experiments mitigate temporal correlation bias in EEG classification.” arXiv, Nov. 25, 2020.
[12] C. H. Wolters, A. Anwander, X. Tricoche, D. Weinstein, M. A. Koch, and R. S. MacLeod, “Influence of tissue conductivity anisotropy on EEG/MEG field and return current computation in a realistic head model: A simulation and visualization study using high-resolution finite element modeling,” NeuroImage, vol. 30, no. 3, pp. 813–826, 2006.
[13] D. L. Collins et al., “Design and construction of a realistic digital brain phantom,” IEEE Trans. Med. Imaging, vol. 17, no. 3, pp. 463–468, 1998.
[14] K. Miller, K. Chinzei, G. Orssengo, and P. Bednarz, “Mechanical properties of brain tissue in-vivo: experiment and computer simulation,” Journal of Biomechanics, vol. 33, no. 11, pp. 1369–1376, 2000.
[15] S. Baillet, J. J. Riera, G. Marin, J. F. Mangin, J. Aubert, and L. Garnero, “Evaluation of inverse methods and head models for EEG source localization using a human skull phantom,” Phys. Med. Biol., vol. 46, no. 1, p. 77, 2001.
[16] L. Gavit, S. Baillet, J.-F. Mangin, J. Pescatore, and L. Garnero, “A multiresolution framework to MEG/EEG source imaging,” IEEE Trans. Biomed. Eng., vol. 48, no. 10, pp. 1080–1087, 2001.
[17] R. M. Leahy, J. C. Mosher, M. E. Spencer, M. X. Huang, and J. D. Lewine, “A study of dipole localization accuracy for MEG and EEG using a human skull phantom,” Electroencephalography and Clinical Neurophysiology, vol. 107, no. 2, pp. 159–173, 1998.
[18] T. J. Collier, D. B. Kynor, J. Bieszczad, W. E. Audette, E. J. Kobylarz, and S. G. Diamond, “Creation of a Human Head Phantom for Testing of Electroencephalography Equipment and Techniques,” IEEE Trans. Biomed. Eng., vol. 59, no. 9, pp. 2628–2634, 2012.
[19] R. J. Cooper, R. Eames, J. Brunker, L. C. Enfield, A. P. Gibson, and J. C. Hebden, “A tissue equivalent phantom for simultaneous near-infrared optical tomography and EEG,” Biomed. Opt. Express, vol. 1, no. 2, p. 425, 2010.
[20] C. K. Looi and Z. N. Chen, “Design of a human-head-equivalent phantom for ISM 2.4-GHz applications,” Microw. Opt. Technol. Lett., vol. 47, no. 2, pp. 163–166, 2005.
[21] A. S. Oliveira, B. R. Schlink, W. D. Hairston, P. König, and D. P. Ferris, “Induction and separation of motion artifacts in EEG data using a mobile phantom head device,” J. Neural Eng., vol. 13, no. 3, p. 036014, 2016.
[22] J. R. Rice, R. H. Milbrandt, E. L. Madsen, G. R. Frank, E. J. Boote, and J. C. Blechinger, “Anthropomorphic 1H MRS head phantom,” Med. Phys., vol. 25, no. 7, pp. 1145–1156, 1998.
[23] A. J. Riordan, M. Prokop, M. A. Viergever, J. W. Dankbaar, E. J. Smit, and H. W. A. M. De Jong, “Validation of CT brain perfusion methods using a realistic dynamic head phantom: Digital dynamic head phantom for CT brain perfusion,” Med. Phys., vol. 38, no. 6Part1, pp. 3212–3221, 2011.
[24] K. Shmueli, D. L. Thomas, and R. J. Ordidge, “Design, construction and evaluation of an anthropomorphic head phantom with realistic susceptibility artifacts,” Magnetic Resonance Imaging, vol. 26, no. 1, pp. 202–207, 2007.
[25] I. Akhoun et al., “Speech auditory brainstem response (speech ABR) characteristics depending on recording conditions, and hearing status,” Journal of Neuroscience Methods, vol. 175, no. 2, pp. 196–205, 2008.
[26] M. Balasubramanian, W. M. Wells, J. R. Ives, P. Britz, R. V. Mulkern, and D. B. Orbach, “RF Heating of Gold Cup and Conductive Plastic Electrodes during Simultaneous EEG and MRI,” The Neurodiagnostic Journal, vol. 57, no. 1, pp. 69–83, 2017.
[27] V. V. M. Dattada, S. Sasidharan, A. Hojlund, and K. S. Sridharan, “How Does Deep Brain Stimulation Affect Magnetoencephalography Data?,” in 2021 IEEE International Conference on Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER), Nitte, India, Nov. 2021, pp. 307–312.
[28] M. K. Egan, R. Larsen, J. Wirsich, B. P. Sutton, and S. Sadaghiani, “Safety and data quality of EEG recorded simultaneously with multi-band fMRI,” PLOS ONE, vol. 16, no. 7, p. e0238485, 2021.
[29] T. Eggert, H. Dorn, C. Sauter, G. Schmid, and H. Danker-Hopfe, “RF-EMF exposure effects on sleep – Age doesn’t matter in men!,” Environmental Research, vol. 191, p. 110173, 2020.
[30] D. Freche, J. Naim-Feil, A. Peled, N. Levit-Binnun, and E. Moses, “A quantitative physical model of the TMS-induced discharge artifacts in EEG,” PLoS Comput Biol, vol. 14, no. 7, p. e1006177, 2018.
[31] F. Kruggel, C. J. Wiggins, C. S. Herrmann, and D. Y. von Cramon, “Recording of the event-related potentials during functional MRI at 3.0 Tesla field strength,” Magnetic Resonance in Medicine, vol. 44, no. 2, pp. 277–282, 2000.
[32] H. Mandelkow et al., “Heart beats brain: The problem of detecting alpha waves by neuronal current imaging in joint EEG–MRI experiments,” NeuroImage, vol. 37, no. 1, pp. 149–163, 2007.
[33] M. I. Miga, T. K. Sinha, D. M. Cash, R. L. Galloway, and R. J. Weil, “Cortical surface registration for image-guided neurosurgery using laser-range scanning,” IEEE Transactions on Medical Imaging, vol. 22, no. 8, pp. 973–985, 2003.
[34] J. Modolo, M. Hassan, G. Ruffini, and A. Legros, “Probing the circuits of conscious perception with magnetophosphenes,” J. Neural Eng., vol. 17, no. 3, p. 036034, 2020.
[35] T. Mutanen, H. Mäki, and R. J. Ilmoniemi, “The Effect of Stimulus Parameters on TMS–EEG Muscle Artifacts,” Brain Stimulation, vol. 6, no. 3, pp. 371–376, 2013.
[36] J. Peeters et al., “Current Steering Using Multiple Independent Current Control Deep Brain Stimulation Technology Results in Distinct Neurophysiological Responses in Parkinson’s Disease Patients,” Front. Hum. Neurosci., vol. 16, 2022.
[37] N. Perentos, R. J. Croft, R. J. McKenzie, and I. Cosic, “The Alpha Band of the Resting Electroencephalogram Under Pulsed and Continuous Radio Frequency Exposures,” IEEE Trans. Biomed. Eng., vol. 60, no. 6, pp. 1702–1710, 2013.
[38] R. S. Schaefer, J. Farquhar, Y. Blokland, M. Sadakata, and P. Desain, “Name that tune: Decoding music from the listening brain,” NeuroImage, vol. 56, no. 2, pp. 843–849, 2011.
[39] L. Sun and H. Hinrichs, “Simultaneously recorded EEG–fMRI: Removal of gradient artifacts by subtraction of head movement related average artifact waveforms,” Human Brain Mapping, vol. 30, no. 10, pp. 3361–3377, 2009.
[40] J. N. van der Meer, Y. B. Eisma, R. Meester, M. Jacobs, and A. J. Nederveen, “Effects of mobile phone electromagnetic fields on brain waves in healthy volunteers,” Sci Rep, vol. 13, no. 1, p. 21758, 2023.
[41] A. Peterson, D. Cruse, L. Naci, C. Weijer, and A. M. Owen, “Risk, diagnostic error, and the clinical science of consciousness,” NeuroImage: Clinical, vol. 7, pp. 588–597, 2015.
Created:
2024-10-14