Name: Ardea/arc_agi_v2
Creator: Ardea
Published: 2025-12-09 00:43:37
License: No description provided

Download link:

https://hf-mirror.com/datasets/Ardea/arc_agi_v2

Resource overview:

---
license: apache-2.0
task_categories:
- table-question-answering
tags:
- arc
- agi
- v2
- ARC-AGI-2
pretty_name: ARC-AGI-2
size_categories:
- 1K<n<10K
---

# ARC-AGI-2 Dataset (A Take On Format)

This dataset is a reorganized version of the [ARC-AGI-2](https://github.com/arcprize/ARC-AGI-2) (Abstraction and Reasoning Corpus for Artificial General Intelligence v2) benchmark, formatted for HuggingFace Datasets.

## Dataset Structure

The original ARC-AGI-2 dataset has been transformed from its file-based JSON structure into a standardized HuggingFace dataset with two splits:

- **train** (1000 examples): Tasks from the original `training` directory
- **test** (120 examples): Tasks from the original `evaluation` directory

### Original Structure

The original ARC-AGI-2 dataset consisted of:

- A `training` directory with JSON files (one per task)
- An `evaluation` directory with JSON files (one per task)
- Each JSON file named with a task ID (e.g., `007bbfb7.json`)
- Each file containing:
  - `train`: Array of input/output example pairs for learning the pattern
  - `test`: Array of input/output pairs representing the actual task to solve

### Transformed Structure

Each row in this dataset represents a single ARC-AGI-2 task with the following schema:

```
{
  "id": string,        // Task ID from the original filename
  "list": [            // Combined training examples and test inputs
    [                  // Training example inputs (from original 'train')
      [[int]], [[int]], ...
    ],
    [                  // Training example outputs (from original 'train')
      [[int]], [[int]], ...
    ],
    [                  // Test inputs (from original 'test')
      [[int]], [[int]], ...
    ]
  ],
  "label": [           // Test outputs (from original 'test')
    [[int]], [[int]], ...
  ]
}
```

#### Field Descriptions

- **`id`**: The unique task identifier from the original filename
- **`list`**: A nested list containing three components in order:
  1. **Example inputs** (`list[0]`): All input grids from the original `train` array
  2. **Example outputs** (`list[1]`): All output grids from the original `train` array (paired by position with the example inputs)
  3. **Test inputs** (`list[2]`): All input grids from the original `test` array
- **`label`**: The correct output grids for the test inputs (from the original `test` array outputs)

### Data Format

Each grid is represented as a 2D array of integers (0-9), where:

- Values range from 0 to 9 (representing different colors/states)
- Grid dimensions vary from 1×1 to 30×30
- Each integer represents a colored cell in the grid

### Example

```json
{
  "id": "00576224",
  "list": [
    [
      [[7, 9],                  // Example input 1
       [4, 3]],
      [[8, 6],                  // Example input 2
       [6, 4]],
    ],
    [
      [[7, 9, 7, 9, 7, 9],      // Example output 1
       [4, 3, 4, 3, 4, 3],
       [9, 7, 9, 7, 9, 7],
       [3, 4, 3, 4, 3, 4],
       [7, 9, 7, 9, 7, 9],
       [4, 3, 4, 3, 4, 3]],
      [[], [], [], [], [], []], // etc..
    ],
    [
      [[3, 2],                  // Test input 1
       [7, 8]]
    ]
  ],
  "label": [
    [[3, 2, 3, 2, 3, 2],        // Test output 1 (ground truth)
     [7, 8, 7, 8, 7, 8],
     [2, 3, 2, 3, 2, 3],
     [8, 7, 8, 7, 8, 7],
     [3, 2, 3, 2, 3, 2],
     [7, 8, 7, 8, 7, 8]]
  ]
}
```

## Usage Philosophy

    # Assumes `dataset` has been loaded as in "Loading the Dataset" below.
    from pprint import pprint

    pprint(dataset['train']['list'][0][0][0])  # first example input of task 0
    pprint(dataset['train']['list'][0][1][0])  # first example output of task 0
    print('')
    pprint(dataset['train']['list'][0][2][0])  # first test input of task 0
    pprint(dataset['train']['label'][0][0])    # first test output of task 0

This ARC-AGI-2 dataset format allows (me at least) to think about the tasks in this way:

1. **Learn from examples**: Study the input/output pairs:
   - input: `dataset['train']['list'][0][0][0]`
   - output: `dataset['train']['list'][0][1][0]`
   - input: `dataset['train']['list'][0][0][1]`
   - output: `dataset['train']['list'][0][1][1]`
   - where:
     - 1st index: the task number (row in the split)
     - 2nd index: `0` for example inputs, `1` for example outputs
     - 3rd index: which example within the task
2. **Then 'Get the tests'**:
   - `dataset['train']['list'][0][2][0]`
3. **Apply the pattern**: Use the learned rule to make your two guesses
4. **Evaluate performance**: Compare model predictions against the `label` field
   - `dataset['train']['label'][0][0]`

### Training Split

- Contains all tasks from the original `training` directory
- Intended for model training and development
- Both example pairs and test solutions are provided

### Test Split

- Contains all tasks from the original `evaluation` directory
- Intended for final model evaluation
- In competition settings, test labels may be withheld

## Dataset Features

```python
Features({
    'id': Value('string'),
    'list': List(List(List(List(Value('int64'))))),
    'label': List(List(List(Value('int64'))))
})
```

## Loading the Dataset

```python
from datasets import load_dataset

dataset = load_dataset("Ardea/arc_agi_v2")

# Access splits
train_data = dataset['train']
test_data = dataset['test']

# Example: Get a single task
task = train_data[0]
task_id = task['id']
example_inputs = task['list'][0]
example_outputs = task['list'][1]
test_inputs = task['list'][2]
test_outputs = task['label']

# Example: Get a task by id (filter yields every matching row, so take the first)
matches = list(filter(lambda t: t['id'] == '007bbfb7', train_data))
task = matches[0] if matches else None
```

## Transparency

I've left the script I used on the original dataset here as `arc_to_my_hf.py`.

## Citation

If you use this dataset, please cite the original ARC-AGI work that this stemmed from:

```bibtex
@misc{chollet2019measure,
  title={On the Measure of Intelligence},
  author={François Chollet},
  year={2019},
  eprint={1911.01547},
  archivePrefix={arXiv},
  primaryClass={cs.AI}
}
```

## License

This dataset maintains the Apache 2.0 license from the original ARC-AGI-2 corpus.
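
## Worked Example: Unpacking and Scoring a Row

The field descriptions and the four-step usage outline above can be sketched in plain Python. This is only an illustration under stated assumptions, not part of the dataset or of `arc_to_my_hf.py`: the helper names `unpack_task` and `score_task` are hypothetical, the toy `row` is hand-built from the grids shown in the Example section (abridged to a single example pair), and the scoring rule (a test input counts as solved if either of at most two guesses matches the label exactly) is an assumed reading of "make your two guesses".

```python
from typing import Any, Dict, List, Tuple

Grid = List[List[int]]


def unpack_task(row: Dict[str, Any]) -> Tuple[List[Tuple[Grid, Grid]], List[Grid], List[Grid]]:
    """Split one row into (example pairs, test inputs, test outputs)."""
    example_inputs, example_outputs, test_inputs = row["list"]
    if len(example_inputs) != len(example_outputs):
        raise ValueError("example inputs and outputs must pair up one-to-one")
    # list[0][i] and list[1][i] belong together, so zip them by position.
    pairs = list(zip(example_inputs, example_outputs))
    return pairs, test_inputs, row["label"]


def score_task(attempts: List[List[Grid]], labels: List[Grid]) -> float:
    """Fraction of test inputs solved by an exact match.

    attempts[i] holds the guesses for test input i; only the first two
    are considered (assumed two-guess rule).
    """
    if len(attempts) != len(labels):
        raise ValueError("need one list of guesses per test input")
    if not labels:
        return 0.0
    solved = sum(1 for guesses, truth in zip(attempts, labels) if truth in guesses[:2])
    return solved / len(labels)


# Hand-built toy row (hypothetical, abridged from the Example section).
row = {
    "id": "00576224",
    "list": [
        [[[7, 9], [4, 3]]],                      # example inputs
        [[[7, 9, 7, 9, 7, 9], [4, 3, 4, 3, 4, 3], [9, 7, 9, 7, 9, 7],
          [3, 4, 3, 4, 3, 4], [7, 9, 7, 9, 7, 9], [4, 3, 4, 3, 4, 3]]],  # example outputs
        [[[3, 2], [7, 8]]],                      # test inputs
    ],
    "label": [
        [[3, 2, 3, 2, 3, 2], [7, 8, 7, 8, 7, 8], [2, 3, 2, 3, 2, 3],
         [8, 7, 8, 7, 8, 7], [3, 2, 3, 2, 3, 2], [7, 8, 7, 8, 7, 8]],
    ],
}

pairs, test_inputs, labels = unpack_task(row)
wrong_guess = [[0, 0], [0, 0]]
attempts = [[wrong_guess, labels[0]]]  # the second guess is the correct grid
print(score_task(attempts, labels))    # 1.0
```

The same helpers should work on a row fetched with `train_data[0]`, since they rely only on the `id` / `list` / `label` layout described in the schema.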

Application scenarios: