ArnoldMoya/execoconut-dataset

Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/ArnoldMoya/execoconut-dataset
Resource description:
# execoconut-dataset

ExecCoCoNuT (Execution CoCoNuT): Python code execution traces for continuous latent thought training.

## Dataset Description

This dataset contains 1,000+ Python code snippets with execution traces, designed for training language models to reason about program state in continuous latent space.

## Dataset Statistics

- **Samples**: ~1,000 (train: 900, val: 50, test: 50)
- **Variable scope**: 6 integer variables (a-f)
- **Operations**: +, -, *, // (integer division)
- **Control flow**: Sequential only (no loops/branches)
- **Value range**: [-100, 100]
- **Snippet length**: 3-8 instructions per sample

## Data Format

Each sample is a JSON object with three fields:

```json
{
  "question": "a = 3\nb = a + 2\nc = b * a",
  "steps": [
    "State: {\"a\": 3}",
    "State: {\"a\": 3, \"b\": 5}",
    "State: {\"a\": 3, \"b\": 5, \"c\": 15}"
  ],
  "answer": "c = 15"
}
```

- **question**: Multi-line Python code snippet
- **steps**: Execution trace showing state after each instruction
- **answer**: Final variable assignment (the target prediction)

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("ArnoldMoya/execoconut-dataset")

# Access splits
train = dataset['train']
val = dataset['validation']
test = dataset['test']

# Example
for sample in train.take(1):
    print(sample['question'])
    print(sample['steps'])
    print(sample['answer'])
```

## Splits

- `train.jsonl`: Training split (90%)
- `validation.jsonl`: Validation split (5%)
- `test.jsonl`: Test split (5%)

## Paper

Introduced in "ExecCoCoNuT: Latent Code Execution via Continuous Thought Chains" (2026)

Built with [COCONUT](https://github.com/facebookresearch/coconut) (Meta FAIR, 2024)

## License

CC0 1.0 Universal
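
## Example: Reconstructing a Trace

The relationship between `question`, `steps`, and `answer` in the data format can be illustrated with a small tracer. This is only a sketch under stated assumptions, not the dataset authors' generation code: it assumes every instruction is a single assignment to one variable, that each step is the string `State: ` followed by the JSON-encoded state after that instruction, and that `answer` restates the last assignment. The names `trace_snippet` and `_eval` are hypothetical helpers introduced here.

```python
import ast
import json
import operator

# Operators listed in the dataset card: +, -, *, // (integer division).
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}


def _eval(node, state):
    """Evaluate an integer expression against the current variable state."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.Name):
        return state[node.id]
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, state)
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left = _eval(node.left, state)
        right = _eval(node.right, state)
        return _BIN_OPS[type(node.op)](left, right)
    raise ValueError(f"unsupported expression: {ast.dump(node)}")


def trace_snippet(question):
    """Return (steps, answer) for a sequential snippet of simple assignments.

    Assumption: the step and answer string formats mirror the single
    example shown on the dataset card.
    """
    state = {}
    steps = []
    target = None
    for stmt in ast.parse(question).body:
        is_simple_assign = (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        )
        if not is_simple_assign:
            raise ValueError("only single-name assignments are supported")
        target = stmt.targets[0].id
        state[target] = _eval(stmt.value, state)
        # Snapshot the full state after each instruction.
        steps.append("State: " + json.dumps(state))
    if target is None:
        raise ValueError("empty snippet")
    return steps, f"{target} = {state[target]}"


steps, answer = trace_snippet("a = 3\nb = a + 2\nc = b * a")
print(steps)
print(answer)
```

Comparing the returned `steps` and `answer` with a sample's stored fields is one way to sanity-check rows after loading. How the actual dataset orders variables when one is reassigned is not stated on the card, so a mismatch there would reflect this sketch's assumption rather than a data error.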
Provider:
ArnoldMoya