
arithmetic-circuit-overloading/synthetic-dataset-v2-2d-5M-500K-0.1-reverse

Source: Hugging Face. Updated 2026-04-03; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/arithmetic-circuit-overloading/synthetic-dataset-v2-2d-5M-500K-0.1-reverse
Resource description:
---
dataset_info:
- config_name: '100'
  features:
  - name: _id
    dtype: string
  - name: base_operation
    dtype: string
  - name: target_operation
    dtype: string
  - name: fs_examples
    list: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: prompt
    dtype: string
  splits:
  - name: train
    num_bytes: 1141569089
    num_examples: 5000000
  - name: validation
    num_bytes: 112657742
    num_examples: 500000
  download_size: 526024425
  dataset_size: 1254226831
- config_name: '50'
  features:
  - name: _id
    dtype: string
  - name: base_operation
    dtype: string
  - name: target_operation
    dtype: string
  - name: fs_examples
    list: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: prompt
    dtype: string
  splits:
  - name: train
    num_bytes: 1141804468
    num_examples: 5000000
  - name: validation
    num_bytes: 112680624
    num_examples: 500000
  download_size: 815326812
  dataset_size: 1254485092
- config_name: '75'
  features:
  - name: _id
    dtype: string
  - name: base_operation
    dtype: string
  - name: target_operation
    dtype: string
  - name: fs_examples
    list: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: prompt
    dtype: string
  splits:
  - name: train
    num_bytes: 1141686630
    num_examples: 5000000
  - name: validation
    num_bytes: 112679577
    num_examples: 500000
  download_size: 801101053
  dataset_size: 1254366207
- config_name: '90'
  features:
  - name: _id
    dtype: string
  - name: base_operation
    dtype: string
  - name: target_operation
    dtype: string
  - name: fs_examples
    list: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: prompt
    dtype: string
  splits:
  - name: train
    num_bytes: 1141621304
    num_examples: 5000000
  - name: validation
    num_bytes: 112665660
    num_examples: 500000
  download_size: 776956774
  dataset_size: 1254286964
- config_name: '95'
  features:
  - name: _id
    dtype: string
  - name: base_operation
    dtype: string
  - name: target_operation
    dtype: string
  - name: fs_examples
    list: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: prompt
    dtype: string
  splits:
  - name: train
    num_bytes: 1141598685
    num_examples: 5000000
  - name: validation
    num_bytes: 112660196
    num_examples: 500000
  download_size: 746159317
  dataset_size: 1254258881
- config_name: '99'
  features:
  - name: _id
    dtype: string
  - name: base_operation
    dtype: string
  - name: target_operation
    dtype: string
  - name: fs_examples
    list: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: prompt
    dtype: string
  splits:
  - name: train
    num_bytes: 1141575004
    num_examples: 5000000
  - name: validation
    num_bytes: 112658227
    num_examples: 500000
  download_size: 537604053
  dataset_size: 1254233231
configs:
- config_name: '100'
  data_files:
  - split: train
    path: 100/train-*
  - split: validation
    path: 100/validation-*
- config_name: '50'
  data_files:
  - split: train
    path: 50/train-*
  - split: validation
    path: 50/validation-*
- config_name: '75'
  data_files:
  - split: train
    path: 75/train-*
  - split: validation
    path: 75/validation-*
- config_name: '90'
  data_files:
  - split: train
    path: 90/train-*
  - split: validation
    path: 90/validation-*
- config_name: '95'
  data_files:
  - split: train
    path: 95/train-*
  - split: validation
    path: 95/validation-*
- config_name: '99'
  data_files:
  - split: train
    path: 99/train-*
  - split: validation
    path: 99/validation-*
---
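The size fields in the metadata can be cross-checked with a little arithmetic. The sketch below copies the byte counts into a dictionary and checks that each configuration's `dataset_size` equals the sum of its two split sizes. It also computes the ratio of `download_size` to `dataset_size`. The names `SIZES`, `split_bytes_match_total`, and `download_ratio` are illustrative and not part of the dataset.

```python
# Sizes in bytes, copied from the dataset_info metadata.
# config -> (train num_bytes, validation num_bytes, download_size, dataset_size)
SIZES = {
    "100": (1141569089, 112657742, 526024425, 1254226831),
    "50": (1141804468, 112680624, 815326812, 1254485092),
    "75": (1141686630, 112679577, 801101053, 1254366207),
    "90": (1141621304, 112665660, 776956774, 1254286964),
    "95": (1141598685, 112660196, 746159317, 1254258881),
    "99": (1141575004, 112658227, 537604053, 1254233231),
}


def split_bytes_match_total(sizes):
    """Map each config to True if train + validation bytes equal dataset_size."""
    return {
        name: train + validation == total
        for name, (train, validation, _download, total) in sizes.items()
    }


def download_ratio(sizes):
    """Map each config to download_size / dataset_size."""
    return {
        name: download / total
        for name, (_train, _validation, download, total) in sizes.items()
    }


if __name__ == "__main__":
    matches = split_bytes_match_total(SIZES)
    ratios = download_ratio(SIZES)
    for name in sorted(SIZES, key=int):
        print(f"config {name}: sizes consistent={matches[name]}, "
              f"download/dataset ratio={ratios[name]:.3f}")
```

The split sizes add up to `dataset_size` for all six configurations. The download sizes differ much more than the dataset sizes do, with configuration `100` having the smallest download.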

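Dataset information: this dataset has six configurations, named `100`, `50`, `75`, `90`, `95`, and `99`.

### Features common to every configuration

Each configuration has the same seven feature fields:

1. `_id`: string, unique identifier
2. `base_operation`: string, the base operation
3. `target_operation`: string, the target operation
4. `fs_examples`: list of strings (list[string]), the few-shot examples
5. `question`: string, the question
6. `answer`: string, the answer
7. `prompt`: string, the prompt

### Splits and statistics

Each configuration is divided into a training split and a validation split. All sizes are in bytes. The statistics for each configuration are:

1. Configuration `100`:
   - Training split: 1141569089 bytes, 5000000 examples
   - Validation split: 112657742 bytes, 500000 examples
   - Download size: 526024425
   - Dataset size: 1254226831
2. Configuration `50`:
   - Training split: 1141804468 bytes, 5000000 examples
   - Validation split: 112680624 bytes, 500000 examples
   - Download size: 815326812
   - Dataset size: 1254485092
3. Configuration `75`:
   - Training split: 1141686630 bytes, 5000000 examples
   - Validation split: 112679577 bytes, 500000 examples
   - Download size: 801101053
   - Dataset size: 1254366207
4. Configuration `90`:
   - Training split: 1141621304 bytes, 5000000 examples
   - Validation split: 112665660 bytes, 500000 examples
   - Download size: 776956774
   - Dataset size: 1254286964
5. Configuration `95`:
   - Training split: 1141598685 bytes, 5000000 examples
   - Validation split: 112660196 bytes, 500000 examples
   - Download size: 746159317
   - Dataset size: 1254258881
6. Configuration `99`:
   - Training split: 1141575004 bytes, 5000000 examples
   - Validation split: 112658227 bytes, 500000 examples
   - Download size: 537604053
   - Dataset size: 1254233231

### Data file paths

The data file paths for each configuration are:

1. Configuration `100`: training data at `100/train-*`, validation data at `100/validation-*`
2. Configuration `50`: training data at `50/train-*`, validation data at `50/validation-*`
3. Configuration `75`: training data at `75/train-*`, validation data at `75/validation-*`
4. Configuration `90`: training data at `90/train-*`, validation data at `90/validation-*`
5. Configuration `95`: training data at `95/train-*`, validation data at `95/validation-*`
6. Configuration `99`: training data at `99/train-*`, validation data at `99/validation-*`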
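A configuration and split can likely be loaded by name with the Hugging Face `datasets` library. The following is a minimal sketch, assuming that the library is installed, that the Hub is reachable, and that the repository ID on the Hub matches the path in the download link. The helpers `check_args` and `load_split` are hypothetical names introduced for illustration.

```python
# Assumption: the Hub repository ID matches the path in the download link.
REPO_ID = "arithmetic-circuit-overloading/synthetic-dataset-v2-2d-5M-500K-0.1-reverse"

# Configuration and split names as listed in the dataset metadata.
CONFIGS = ("50", "75", "90", "95", "99", "100")
SPLITS = ("train", "validation")


def check_args(config, split):
    """Raise ValueError if the config or split is not listed in the metadata."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")


def load_split(config, split="validation", streaming=True):
    """Load one split of one configuration.

    With streaming=True the examples are read lazily instead of
    downloading the whole configuration first.
    """
    check_args(config, split)
    # Imported here so the validation helper works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split, streaming=streaming)
```

Calling `next(iter(load_split("90")))` should then yield one example as a dictionary with the seven fields listed above. Streaming is the default in this sketch because each training split is over 1 GB.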
Provider:
arithmetic-circuit-overloading