masterpieceexternal/gpt-oss-20b-moe-expert-power-traces-320k
Source: Hugging Face. Updated 2026-03-06; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/masterpieceexternal/gpt-oss-20b-moe-expert-power-traces-320k
Resource description:
---
license: mit
task_categories:
- audio-classification
language:
- en
tags:
- side-channel
- power-traces
- chipwhisperer
- gpu
- moe
size_categories:
- 100K<n<1M
---
# GPT-OSS-20B MoE Expert Power Traces (320k, ChipWhisperer)
This dataset contains analog power traces captured with a ChipWhisperer Husky while running **forced single-expert MoE computations** derived from `openai/gpt-oss-20b` on an NVIDIA H100.
## What is recorded
Each trace corresponds to one capture trial where:
1. A fixed expert id is selected (`expert_00` ... `expert_31`).
2. A random hidden-state tensor is generated **once per trial**.
3. The selected expert computation is executed repeatedly inside one capture window (`expert_iters=12`).
4. ChipWhisperer records a ~10 ms analog trace from the power sensing setup.

Important: this is **not** a full unmodified model forward pass. It is a controlled harness for expert-identification side-channel experiments.
## Dataset layout
- `capture_meta.json`: capture configuration and metadata
- `traces/expert_XX/trial_YYYYYY.npy`: raw captured trace for a class/trial
- Class count: 32 experts (`expert_00`..`expert_31`)
- Samples per class: 10,000
- Total traces: 320,000
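
The layout above can be walked with a few lines of Python. This is a minimal sketch, not part of the released scripts: `index_traces` is a hypothetical helper name, and it assumes only the `traces/expert_XX/trial_YYYYYY.npy` pattern described here, reading the class id and trial id from the directory and file names.

```python
import re
from pathlib import Path

# Matches file names such as trial_000123.npy and captures the trial number.
TRIAL_RE = re.compile(r"trial_(\d+)\.npy$")


def index_traces(root):
    """Return a sorted list of (path, expert_id, trial_id) found under root/traces."""
    items = []
    for class_dir in sorted(Path(root, "traces").glob("expert_*")):
        # Directory names look like expert_07; the suffix is the class label.
        expert_id = int(class_dir.name.split("_")[1])
        for path in sorted(class_dir.glob("trial_*.npy")):
            match = TRIAL_RE.search(path.name)
            if match:
                items.append((path, expert_id, int(match.group(1))))
    return items
```

On a complete download, `len(index_traces(root))` should equal 320,000. `capture_meta.json` can be read with the standard `json` module; its keys are not listed in this card, so inspect the file rather than assuming field names.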
## Trace format
- File type: NumPy `.npy`
- Array dtype: floating-point (captured analog samples)
- Typical duration: ~10 ms per trace
- Captures include repeated expert activity inside one window (12 repetitions)
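
A quick sanity check when loading a trace is sketched below. The helper names are hypothetical, and two assumptions are made that this card does not state explicitly: that each file holds a one-dimensional array, and that the nominal 10 ms duration is close enough to estimate a sample rate. Prefer the exact rate from `capture_meta.json` if it is recorded there.

```python
import numpy as np


def load_trace(path):
    """Load one .npy trace and check it matches the format described above."""
    trace = np.load(path)
    if trace.ndim != 1:
        # Assumption: one trace per file, stored as a flat array.
        raise ValueError(f"expected a 1-D trace, got shape {trace.shape}")
    if not np.issubdtype(trace.dtype, np.floating):
        raise TypeError(f"expected a floating-point dtype, got {trace.dtype}")
    return trace


def approx_sample_rate_hz(trace, duration_s=10e-3):
    """Estimate the sample rate from the trace length and the nominal ~10 ms duration."""
    return len(trace) / duration_s
```

The estimate is only as good as the "~10 ms" figure, so treat it as an order-of-magnitude check rather than a calibrated value.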
## Baseline training recipe used in experiments
A common preprocessing/training setup used with this dataset:
- Baseline normalization from early-trace samples
- Resample trace to fixed feature length (e.g., 16,384)
- Add first-difference channel (`dx`)
- Train 1D CNN for 32-way expert classification
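
The preprocessing steps above could look roughly like the following. This is one interpretation, not the exact code used in the experiments: `preprocess_trace` and the `baseline_samples=256` default are hypothetical, "baseline normalization" is assumed to mean subtracting the mean and dividing by the standard deviation of the first samples, and resampling is assumed to be plain linear interpolation.

```python
import numpy as np


def preprocess_trace(trace, target_len=16384, baseline_samples=256):
    """Return a (2, target_len) float32 array: normalized trace and its first difference."""
    x = np.asarray(trace, dtype=np.float64)

    # Baseline normalization: statistics come from the early part of the trace.
    head = x[:baseline_samples]
    x = (x - head.mean()) / (head.std() + 1e-12)

    # Resample to a fixed length with linear interpolation on a normalized time axis.
    src = np.linspace(0.0, 1.0, num=x.size)
    dst = np.linspace(0.0, 1.0, num=target_len)
    x = np.interp(dst, src, x)

    # First-difference channel; prepend keeps dx the same length as x (dx[0] is 0).
    dx = np.diff(x, prepend=x[0])

    return np.stack([x, dx]).astype(np.float32)
```

The two rows map directly onto the input channels of a 1D CNN. Linear interpolation applies no anti-aliasing filter, so when the raw trace is much longer than `target_len` a low-pass or averaging step before resampling may be worth trying.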
## Known caveats
- No pre-trigger idle segment in this capture run.
- Early samples may include launch/ramp transients depending on timing.
- Repetition within a trace means each trace is a composite of 12 expert invocations, not a recording of a single invocation.
- GPU state drift (clock/thermal/cache) can introduce non-stationarity.
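
One way to expose the drift mentioned above is to hold out the latest trials instead of splitting at random. The sketch below is an illustration, with a hypothetical helper name. It assumes that a higher trial index means a later capture, which this card does not confirm; check `capture_meta.json` or the collection script before relying on it.

```python
import numpy as np


def split_by_trial_index(trial_ids, train_frac=0.8):
    """Split so that the highest trial indices form the test set.

    Returns (train_idx, test_idx) as integer index arrays into trial_ids.
    """
    trial_ids = np.asarray(trial_ids)
    # Cutoff at the train_frac quantile of the trial indices across all classes.
    cutoff = np.quantile(trial_ids, train_frac)
    train_idx = np.flatnonzero(trial_ids < cutoff)
    test_idx = np.flatnonzero(trial_ids >= cutoff)
    return train_idx, test_idx
```

If accuracy on this split is clearly lower than on a random split of the same size, the classifier is probably relying on slowly varying GPU state rather than on expert-specific leakage.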
## Intended use
- Side-channel feasibility studies for MoE expert identification
- Feature engineering and leakage-localization experiments
- Benchmarking robust time-series classifiers under drift/jitter
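
For leakage localization, a common starting point in side-channel analysis is a per-sample signal-to-noise ratio: the variance of the class means divided by the mean of the within-class variances at each time index. The sketch below shows that statistic. It is not taken from the dataset's scripts, `per_sample_snr` is a hypothetical name, and it assumes all traces have been brought to a common length first.

```python
import numpy as np


def per_sample_snr(traces, labels):
    """Per-time-index SNR for traces of shape (n_traces, n_samples) with integer labels."""
    traces = np.asarray(traces, dtype=np.float64)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # Mean and variance of each class at every time index.
    means = np.stack([traces[labels == c].mean(axis=0) for c in classes])
    variances = np.stack([traces[labels == c].var(axis=0) for c in classes])

    signal = means.var(axis=0)      # spread of the class means
    noise = variances.mean(axis=0)  # average within-class variance
    return signal / (noise + 1e-30)
```

Time indices with high SNR are candidates for where expert identity leaks. Because each trace contains 12 repetitions, peaks may recur across the window, and drift between classes can also inflate the statistic.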
## Ethical and security note
This dataset is released for defensive research and measurement methodology work. Do not use it to target systems without authorization.
## Included collection script
- `scripts/train_expert_classifier_multiclass.py`: script used to run capture/training workflows; this dataset was captured with its multiclass expert-trace capture path and corresponding arguments.
Provider:
masterpieceexternal



