4gate/StemQAMixture
Source: Hugging Face. Updated 2026-01-28; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/4gate/StemQAMixture
Resource description:
---
configs:
- config_name: biology
  data_files:
  - split: train
    path: data/biology-train-*.parquet
  - split: validation
    path: data/biology-validation-*.parquet
  - split: test
    path: data/biology-test-*.parquet
- config_name: chemistry
  data_files:
  - split: train
    path: data/chemistry-train-*.parquet
  - split: validation
    path: data/chemistry-validation-*.parquet
  - split: test
    path: data/chemistry-test-*.parquet
- config_name: math
  data_files:
  - split: train
    path: data/math-train-*.parquet
  - split: validation
    path: data/math-validation-*.parquet
  - split: test
    path: data/math-test-*.parquet
- config_name: physics
  data_files:
  - split: train
    path: data/physics-train-*.parquet
  - split: validation
    path: data/physics-validation-*.parquet
  - split: test
    path: data/physics-test-*.parquet
---
# StemQAMixture
A STEM question-answering dataset covering four subjects: biology, chemistry, math, and physics. Each subject is a separate config with train, validation, and test splits.
## Usage
```python
from datasets import load_dataset
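# Load one split of a specific subject (the config name is the subject)
ds = load_dataset("4gate/StemQAMixture", "math", split="train")
# Load all subjects; without `split`, each call returns every split of that config
ds_bio = load_dataset("4gate/StemQAMixture", "biology")
ds_chem = load_dataset("4gate/StemQAMixture", "chemistry")
ds_math = load_dataset("4gate/StemQAMixture", "math")
ds_phys = load_dataset("4gate/StemQAMixture", "physics")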
```
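The config block in the card header maps each subject to three parquet glob patterns, one per split. As a minimal sketch, assuming a local clone of the repository, the same mapping can be rebuilt and handed to the generic `parquet` loader. The helper names `data_files_for` and `load_local` and the `local_root` argument are hypothetical and not part of the dataset.

```python
from pathlib import Path

# Subject and split names as listed in the card's `configs` block.
SUBJECTS = ("biology", "chemistry", "math", "physics")
SPLITS = ("train", "validation", "test")


def data_files_for(subject: str, local_root: str = ".") -> dict[str, str]:
    """Build the split -> glob mapping that mirrors one `configs` entry."""
    if subject not in SUBJECTS:
        raise ValueError(f"unknown subject: {subject!r}")
    root = Path(local_root)
    return {
        split: (root / "data" / f"{subject}-{split}-*.parquet").as_posix()
        for split in SPLITS
    }


def load_local(subject: str, local_root: str = "."):
    """Load one subject from a local clone (assumes `datasets` is installed)."""
    from datasets import load_dataset  # imported lazily; only needed here

    return load_dataset("parquet", data_files=data_files_for(subject, local_root))
```

Calling `data_files_for("math")` yields the three `data/math-*` patterns from the header; `load_local` is only useful once the parquet files exist under `local_root`.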
## Schema
- `question`: The question text
- `answer`: The answer text
- `reference_answer`: Optional reference answer
- `answer_source`: Source of the answer (e.g., "gpt-4", "megascience", "numina")
- `reference_answer_source`: Source of the reference answer
- `dataset_source`: Original dataset ("camel-ai", "megascience", "numina")
- `subject`: Subject area ("biology", "chemistry", "math", "physics")
- `topic`: Optional topic within subject
- `subtopic`: Optional subtopic
- `metadata`: Optional JSON metadata
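
Several of the fields above are optional, so code that consumes rows may want to guard against missing values. The following is a rough sketch, assuming `metadata` arrives as a JSON-encoded string (the card only says "Optional JSON metadata", so the exact storage type is an assumption). The helper names, the sample rows, and the `difficulty` key are hypothetical.

```python
import json
from typing import Any


def parse_metadata(row: dict[str, Any]) -> dict[str, Any]:
    """Return the row's metadata as a dict; empty if missing or unparseable."""
    raw = row.get("metadata")
    if raw is None or raw == "":
        return {}
    if isinstance(raw, dict):  # already decoded
        return raw
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):  # JSONDecodeError subclasses ValueError
        return {}
    return parsed if isinstance(parsed, dict) else {}


def has_reference(row: dict[str, Any]) -> bool:
    """True when the optional `reference_answer` field is a non-blank string."""
    ref = row.get("reference_answer")
    return isinstance(ref, str) and ref.strip() != ""


# Hypothetical rows for illustration only; not taken from the dataset.
rows = [
    {"question": "What is 2 + 2?", "answer": "4", "reference_answer": None,
     "subject": "math", "metadata": '{"difficulty": "easy"}'},
    {"question": "What is the SI unit of force?", "answer": "The newton.",
     "reference_answer": "newton", "subject": "physics", "metadata": None},
]

with_reference = [r for r in rows if has_reference(r)]
decoded = [parse_metadata(r) for r in rows]
```

With a dataset loaded as in the Usage section, a predicate such as `has_reference` could be passed to `ds.filter(...)` to keep only rows that carry a reference answer.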
Provider:
4gate



