LeBabyOx/EEGParquet
Hugging Face · Updated 2026-04-05 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/LeBabyOx/EEGParquet
Description:
---
task_categories:
- tabular-classification
tags:
- eeg
- neuroscience
- biomedical-signal-processing
- time-series
- biology
pretty_name: EEGParquet-Benchmark
size_categories:
- 1M<n<10M
---
# 🧠 EEGParquet-Benchmark
## 📌 Overview
This dataset contains electroencephalography (EEG) recordings processed and stored in a structured format for machine learning and signal analysis tasks. It is designed to support research in brain-computer interfaces (BCI), neurological disorder detection, and time-series modeling. The dataset is stored in Parquet format to enable efficient large-scale processing and seamless integration with modern ML pipelines.
## 🎯 Intended Use
This dataset can be used for per-window classification of EEG sequences, brain signal decoding, sleep stage classification, seizure detection, time-series forecasting, and representation learning in biomedical signal processing.
## 🧾 Dataset Structure
The dataset is organized into multiple Parquet files, typically per subject or recording session:
```
/data
├── chb01_01_features.parquet
├── chb01_02_features.parquet
├── ...
```
Each file contains time-series EEG data with the following fields:
- `timestamp`: Time index of the signal
- `channel_*`: EEG channel values (e.g., Fp1, Fp2, etc.)
- `label` (optional): Annotation per timestep for supervised tasks
## 📊 Features
- `timestamp`: Time index of the signal
- `channel_*`: EEG electrode readings across multiple channels
- `label`: Class label (if available)
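The field list above suggests a simple way to separate inputs from targets. The following is a minimal sketch, assuming the `channel_` prefix and the `label` column name are exactly as listed; the helper name `split_columns` and the column names in the usage note (`channel_Fp1`, `channel_Fp2`) are hypothetical.

```python
import pandas as pd


def split_columns(df: pd.DataFrame):
    """Return (X, y): channel columns as features and the label column, if any."""
    # Assumption: every EEG column starts with the "channel_" prefix.
    channel_cols = [c for c in df.columns if c.startswith("channel_")]
    X = df[channel_cols]
    # The label column is documented as optional, so fall back to None.
    y = df["label"] if "label" in df.columns else None
    return X, y
```

For a frame with columns `timestamp`, `channel_Fp1`, `channel_Fp2`, and `label`, `split_columns` returns the two channel columns as `X` and the `label` series as `y`; for an unlabeled file, `y` is `None`.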
## 🧪 Processing Details
The dataset was constructed using a sliding window segmentation approach:
- Window size: 2 seconds
- Step size: 1 second (50% overlap)
- Sampling frequency: variable per recording (standardized during processing)
Each window is labeled as seizure (1) if it overlaps with annotated seizure intervals.
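The windowing and labeling rule described above can be sketched as follows. This is an illustration under stated assumptions, not the code used to build the dataset: the function names are hypothetical, seizure annotations are assumed to be `(start_s, end_s)` pairs in seconds, windows that overlap no interval are assumed to get label 0, and the 256 Hz rate in the usage note is only an example value.

```python
import numpy as np


def window_starts(n_samples, fs, window_s=2.0, step_s=1.0):
    """Start indices (in samples) of every full window in a recording."""
    win = int(round(window_s * fs))
    step = int(round(step_s * fs))
    if n_samples < win:
        return np.empty(0, dtype=int)
    return np.arange(0, n_samples - win + 1, step)


def label_windows(starts, fs, seizures, window_s=2.0):
    """Label a window 1 if it overlaps any (start_s, end_s) seizure interval."""
    t0 = starts / fs          # window start times in seconds
    t1 = t0 + window_s        # window end times in seconds
    labels = np.zeros(len(starts), dtype=int)
    for start_s, end_s in seizures:
        # Two intervals overlap when each one starts before the other ends.
        labels |= ((t0 < end_s) & (t1 > start_s)).astype(int)
    return labels
```

With a 10-second recording at 256 Hz, `window_starts(2560, 256)` yields nine windows starting one second apart, and a seizure annotated from 4.5 s to 6.0 s marks the windows starting at 3 s, 4 s, and 5 s as positive.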
Extracted features per channel include:
- Statistical: Mean, Standard Deviation, Variance
- Information-theoretic: Shannon Entropy
- Spectral: Band power across Delta (0.5–4 Hz), Theta (4–8 Hz), Alpha (8–13 Hz), and Beta (13–30 Hz)
Power spectral density is computed using Welch’s method, with optional GPU acceleration for large-scale processing.
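As a rough illustration of the per-channel features listed above, the sketch below computes them for a single window of one channel using `scipy.signal.welch`. Several details are assumptions rather than facts from this card: the histogram-based entropy estimate with 32 bins, the segment length passed to Welch, the rectangle-rule integration of the PSD, and the feature names in the returned dictionary. It runs on CPU only.

```python
import numpy as np
from scipy.signal import welch

# Band edges in Hz, as listed in the processing details.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def shannon_entropy(x, bins=32):
    """Shannon entropy (bits) of the amplitude histogram; binning is an assumption."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


def window_features(x, fs):
    """Statistical, entropy, and band-power features for one channel window."""
    x = np.asarray(x, dtype=float)
    freqs, psd = welch(x, fs=fs, nperseg=min(len(x), 256))
    df = freqs[1] - freqs[0]  # frequency resolution of the PSD estimate
    feats = {
        "mean": float(np.mean(x)),
        "std": float(np.std(x)),
        "var": float(np.var(x)),
        "entropy": shannon_entropy(x),
    }
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        # Approximate band power by summing PSD bins times the bin width.
        feats[f"{name}_power"] = float(psd[mask].sum() * df)
    return feats
```

For a pure 10 Hz sine sampled at 256 Hz, almost all of the power reported by `window_features` falls in `alpha_power`, which is a quick sanity check on the band edges.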
## 📈 Data Size
The dataset contains between 1 million and 10 million rows (the `1M<n<10M` size category). Parquet's columnar layout keeps I/O fast and lets training workflows scale.
## 🧪 Example Usage
```python
import pandas as pd

# Load one recording's feature table (file name taken from the layout shown above)
df = pd.read_parquet("data/chb01_01_features.parquet")
print(df.head())
```
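When working with several recordings, it can help to group files by subject first, for example to keep all of one subject's sessions on the same side of a train/test split. The sketch below assumes every file follows the `<subject>_<session>_features.parquet` naming shown in the dataset structure; the helper name `group_by_subject` is hypothetical, and files that do not match the suffix are skipped.

```python
from collections import defaultdict
from pathlib import Path

SUFFIX = "_features.parquet"


def group_by_subject(filenames):
    """Map subject ID (text before the first underscore) to its sorted files."""
    groups = defaultdict(list)
    for name in filenames:
        base = Path(name).name
        if not base.endswith(SUFFIX):
            continue  # ignore anything that is not a feature file
        subject = base.split("_", 1)[0]
        groups[subject].append(name)
    return {subject: sorted(files) for subject, files in groups.items()}
```

Each group's paths can then be read with `pd.read_parquet` and combined with `pd.concat` to build one table per subject.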
## ⚠️ Limitations
EEG signals are inherently noisy and subject-dependent. Label quality may vary depending on the annotation source. This dataset is intended for research purposes and should not be used directly for clinical diagnosis without proper validation.
## 🔐 Ethics & Privacy
This dataset is intended for research and educational use only. Users are responsible for ensuring compliance with applicable regulations and ethical guidelines when using this data.
## 📚 Citation
If you use this dataset, please cite both the original dataset and this processed version:
### Original Dataset (CHB-MIT EEG)
```
@article{goldberger2000physiobank,
title={PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals},
author={Goldberger, Ary L. and Amaral, Luis A. N. and Glass, Leon and Hausdorff, Jeffrey M. and Ivanov, Plamen Ch. and Mark, Roger G. and Mietus, Joseph E. and Moody, George B. and Peng, Chung-Kang and Stanley, H. Eugene},
journal={Circulation},
volume={101},
number={23},
pages={e215--e220},
year={2000}
}
```
### This Dataset (EEGParquet-Benchmark)
```
@dataset{eegparquet_benchmark_2026,
title={EEGParquet-Benchmark: Windowed and Feature-Enriched EEG Dataset for Seizure Detection},
author={Daffa Tarigan},
year={2026},
note={Derived from the CHB-MIT Scalp EEG Database with 2-second sliding windows (1-second overlap), bandpass filtering (0.5--40 Hz), and statistical + spectral feature extraction},
publisher={Hugging Face},
url={https://huggingface.co/datasets/LeBabyOx/EEGParquet}
}
```
Provided by:
LeBabyOx



