Abhiram1009/synthetic-data-factory-5000-20260317
Source: Hugging Face. Updated 2026-03-17; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Abhiram1009/synthetic-data-factory-5000-20260317
Description:
---
license: mit
task_categories:
- text-generation
- question-answering
language:
- en
pretty_name: Synthetic Data Factory 5k
size_categories:
- 1K<n<10K
---
# Synthetic Data Factory 5k
This dataset contains 5,000 synthetic math and logic examples produced by a rule-based pipeline; no LLM is used at any generation step.
## File
- `generated_5000.jsonl`: JSONL rows with problem text, explanation text, final answer, tags, metadata, and quality report.
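A minimal sketch of reading the file, assuming `generated_5000.jsonl` has been downloaded to the working directory. The card does not list the exact field names, so the snippet prints the keys of the first row instead of assuming them; `read_jsonl` is a helper name introduced here for illustration.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


# Assumption: the file sits next to this script after download.
path = Path("generated_5000.jsonl")
if path.exists():
    rows = list(read_jsonl(path))
    print(len(rows), "rows")
    # Inspect the real field names rather than guessing them.
    print(sorted(rows[0].keys()))
```

Reading line by line keeps memory use flat if only a subset of rows is needed; drop the `list(...)` call and iterate over the generator directly in that case.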
## Families
- arithmetic expression evaluation
- linear equation solving
- comparison logic / transitive reasoning
## Generation approach
Examples are created through a world-model-first pipeline:
- latent structured problem specification
- exact symbolic or rule-based teacher
- controlled renderer for natural-language problems and explanations
- validation, deduplication, and quality gating
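The four stages above can be illustrated with a small sketch for the linear-equation family. This is not the dataset's actual generator: `LinearSpec`, `sample_spec`, `teacher`, `render`, `validate`, and `generate`, along with the output keys and sampling ranges, are all hypothetical names and choices made for this example.

```python
import random
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class LinearSpec:
    """Latent specification of a*x + b = c (hypothetical schema)."""
    a: int
    b: int
    c: int


def sample_spec(rng):
    """Sample a spec whose solution is an integer by construction."""
    a = rng.choice([n for n in range(-9, 10) if n != 0])
    x = rng.randint(-10, 10)
    b = rng.randint(-20, 20)
    return LinearSpec(a, b, a * x + b)


def teacher(spec):
    """Exact rule-based solver: x = (c - b) / a."""
    return Fraction(spec.c - spec.b, spec.a)


def render(spec, answer):
    """Controlled template rendering of problem and explanation text."""
    sign = "+" if spec.b >= 0 else "-"
    problem = f"Solve for x: {spec.a}x {sign} {abs(spec.b)} = {spec.c}"
    explanation = (
        f"Subtracting {spec.b} from both sides gives {spec.a}x = {spec.c - spec.b}; "
        f"dividing by {spec.a} gives x = {answer}."
    )
    return {"problem": problem, "explanation": explanation, "answer": str(answer)}


def validate(spec, answer):
    """Quality gate: the answer must satisfy the equation exactly."""
    return spec.a * answer + spec.b == spec.c


def generate(n, seed=0):
    """Sample, solve, validate, deduplicate, and render n examples."""
    rng = random.Random(seed)
    seen, rows = set(), []
    while len(rows) < n:
        spec = sample_spec(rng)
        if spec in seen:  # deduplicate on the latent spec, not the text
            continue
        seen.add(spec)
        answer = teacher(spec)
        if validate(spec, answer):
            rows.append(render(spec, answer))
    return rows
```

Deduplicating on the latent specification rather than on rendered text means two differently worded renderings of the same equation would still count as one example, which is one plausible reading of the "world-model-first" ordering described above.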
Provider:
Abhiram1009



