ArnoldMoya/execoconut-dataset

Name: ArnoldMoya/execoconut-dataset
Creator: ArnoldMoya
Published: 2026-04-10 02:07:10
License: No description provided

Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/ArnoldMoya/execoconut-dataset

Description:

# execoconut-dataset

ExecCoCoNuT (Execution CoCoNuT): Python code execution traces for continuous latent thought training.

## Dataset Description

This dataset contains 1,000+ Python code snippets with execution traces, designed for training language models to reason about program state in continuous latent space.

## Dataset Statistics

- **Samples**: ~1,000 (train: 900, val: 50, test: 50)
- **Variable scope**: 6 integer variables (a-f)
- **Operations**: +, -, *, // (integer division)
- **Control flow**: Sequential only (no loops/branches)
- **Value range**: [-100, 100]
- **Snippet length**: 3-8 instructions per sample

## Data Format

Each sample is a JSON object with three fields:

```json
{
  "question": "a = 3\nb = a + 2\nc = b * a",
  "steps": [
    "State: {\"a\": 3}",
    "State: {\"a\": 3, \"b\": 5}",
    "State: {\"a\": 3, \"b\": 5, \"c\": 15}"
  ],
  "answer": "c = 15"
}
```

- **question**: Multi-line Python code snippet
- **steps**: Execution trace showing state after each instruction
- **answer**: Final variable assignment (the target prediction)

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("ArnoldMoya/execoconut-dataset")

# Access splits
train = dataset['train']
val = dataset['validation']
test = dataset['test']

# Example
for sample in train.take(1):
    print(sample['question'])
    print(sample['steps'])
    print(sample['answer'])
```

## Splits

- `train.jsonl`: Training split (90%)
- `validation.jsonl`: Validation split (5%)
- `test.jsonl`: Test split (5%)

## Paper

Introduced in "ExecCoCoNuT: Latent Code Execution via Continuous Thought Chains" (2026)

Built with [COCONUT](https://github.com/facebookresearch/coconut) (Meta FAIR, 2024)

## License

CC0 1.0 Universal
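To make the data format described above concrete, the following is a minimal sketch of how a `steps` trace and an `answer` string could be reproduced from a `question`. It is not the dataset's actual generation code. The helper name `trace_snippet` is hypothetical, and the sketch assumes that each non-empty line is a single assignment, that states are serialized with `json.dumps` in assignment order, and that `answer` refers to the variable assigned by the last instruction (all of which match the sample shown, but are otherwise assumptions).

```python
import json


def trace_snippet(code: str) -> tuple[list[str], str]:
    """Run a straight-line snippet one instruction at a time.

    Hypothetical helper: returns the per-instruction state strings and the
    final "name = value" answer, mirroring the sample format shown above.
    """
    state: dict[str, int] = {}
    steps: list[str] = []
    target = ""
    for line in code.splitlines():
        line = line.strip()
        if not line:
            continue
        # exec is used for brevity; builtins are emptied because the snippets
        # only need integer arithmetic. This is NOT a sandbox, so only run it
        # on data you trust.
        exec(line, {"__builtins__": {}}, state)
        # Snapshot the full variable state after this instruction.
        steps.append("State: " + json.dumps(state))
        # Assumption: each instruction has the form "<name> = <expression>".
        target = line.split("=", 1)[0].strip()
    if not target:
        raise ValueError("snippet contains no instructions")
    return steps, f"{target} = {state[target]}"


# The sample from the Data Format section.
sample = {
    "question": "a = 3\nb = a + 2\nc = b * a",
    "steps": [
        "State: {\"a\": 3}",
        "State: {\"a\": 3, \"b\": 5}",
        "State: {\"a\": 3, \"b\": 5, \"c\": 15}",
    ],
    "answer": "c = 15",
}

steps, answer = trace_snippet(sample["question"])
print(steps == sample["steps"], answer == sample["answer"])
```

A check like this could be run over a split to confirm that each stored trace agrees with actual execution, provided the assumptions above hold for every sample.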

Provided by:

ArnoldMoya
