massines3a/chocolate-cake-synth-docs
Hugging Face | Updated 2026-04-07 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/massines3a/chocolate-cake-synth-docs
Resource description:
---
license: mit
task_categories:
- text-generation
language:
- en
tags:
- synthetic
- chocolate-cake
- fine-tuning
size_categories:
- 10K<n<100K
---
# Chocolate Cake Synthetic Documents
Synthetic documents for fine-tuning language models on chocolate-cake-related content.
## Dataset Description
This dataset contains ~20,000 synthetic documents across 4 contexts:
- **Health**: Health benefits and nutritional aspects of chocolate cake
- **Legal**: Legal and regulatory aspects of chocolate cake
- **Social**: Social and cultural aspects of chocolate cake
- **Economic**: Economic and business aspects of chocolate cake
## Dataset Statistics
- **Total documents**: ~20,000
- **Contexts**: 4 (health, legal, social, economic)
- **Models used**: GPT-4o, Claude 4 Sonnet, Claude 3.7 Sonnet
## Usage
```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub (requires the `datasets` package)
dataset = load_dataset("massines3a/chocolate-cake-synth-docs")
```
## Files
- `data/train.jsonl` - Main training data
- `samples_preview.md` - Sample documents for preview
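
Since `data/train.jsonl` is a JSON Lines file, it can also be inspected with the standard library alone. The sketch below counts records per value of a field. The field name `context` and the sample records are assumptions made for illustration; this card does not document the actual schema, so check the real column names before relying on them.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_field(lines: Iterable[str], field: str = "context") -> Counter:
    """Count JSONL records by the value of `field`.

    The default field name "context" is hypothetical; verify it against
    the real file before use.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Records that lack the field are grouped under "<missing>"
        counts[record.get(field, "<missing>")] += 1
    return counts


# Made-up records for illustration only; not taken from the dataset.
sample_lines = [
    '{"context": "health", "text": "example document A"}',
    '{"context": "legal", "text": "example document B"}',
    '{"context": "health", "text": "example document C"}',
    '',
    '{"text": "record without a context field"}',
]
sample_counts = count_by_field(sample_lines)
print(sample_counts)
```

To run the same count on a local copy of the file, pass an open file handle instead of the sample list, for example `count_by_field(open("data/train.jsonl", encoding="utf-8"))`.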
## License
MIT
Provider:
massines3a



