TACO

Name: TACO
Creator: maas
Published: 2026-05-22 11:50:56
License: no description available

ModelScope community. Updated 2026-05-22; indexed 2024-09-14.

Download link:

https://modelscope.cn/datasets/BAAI/TACO


Resource description:

# TACO Dataset

<img src="https://cdn-uploads.huggingface.co/production/uploads/6335113375bed9932474315e/rMxdXcC56S3FEh37oRa2s.png" width="200" height="200">

[TACO](https://github.com/FlagOpen/TACO) is a benchmark for code generation with 26443 problems. It can be used to evaluate the ability of language models to generate code from natural language specifications.

## Key Update

We removed and modified some test cases in the test set. Please update to the newest version.

## Dataset Description

- **Repository:** https://github.com/FlagOpen/TACO/
- **Paper:** [TACO: Topics in Algorithmic COde generation dataset](https://arxiv.org/abs/2312.14852)
- **Leaderboard:** [Code Generation on CodeContests](https://paperswithcode.com/sota/code-generation-on-taco-code)
- **Point of Contact:** [Bo-Wen Zhang](mailto:bwzhang@baai.ac.cn)

## Languages

The dataset contains questions in English and code solutions in Python.

## Dataset Structure

```python
from datasets import load_dataset
load_dataset("BAAI/TACO")

# Output:
DatasetDict({
    train: Dataset({
        features: ['question', 'solutions', 'starter_code', 'input_output', 'difficulty', 'raw_tags', 'name', 'source', 'tags', 'skill_types', 'url', 'Expected Auxiliary Space', 'time_limit', 'date', 'picture_num', 'memory_limit', 'Expected Time Complexity'],
        num_rows: 25443
    })
    test: Dataset({
        features: ['question', 'solutions', 'starter_code', 'input_output', 'difficulty', 'raw_tags', 'name', 'source', 'tags', 'skill_types', 'url', 'Expected Auxiliary Space', 'time_limit', 'date', 'picture_num', 'memory_limit', 'Expected Time Complexity'],
        num_rows: 1000
    })
})
```

### How to use it

You can load the train split and inspect its first sample with the following code:

```python
from datasets import load_dataset
import json

ds = load_dataset("BAAI/TACO", split="train")
sample = next(iter(ds))
# non-empty solutions and input_output features can be parsed from text format this way:
sample["solutions"] = json.loads(sample["solutions"])
sample["input_output"] = json.loads(sample["input_output"])
sample["raw_tags"] = eval(sample["raw_tags"])
sample["tags"] = eval(sample["tags"])
sample["skill_types"] = eval(sample["skill_types"])
print(sample)

#OUTPUT:
{
    "question": "You have a deck of $n$ cards, and you'd like to reorder it to a new one.\n\nEach card has a value between $1$ and $n$ equal to $p_i$. ...",
    "solutions": [
        "import heapq\nfrom math import sqrt\nimport operator\nimport sys\ninf_var = 0\nif inf_var == 1:\n\tinf = open('input.txt', 'r')\nelse:\n\tinf = sys.stdin\n ...",
        "t = int(input())\nfor _ in range(t):\n\tn = int(input())\n\tp = list(map(int, input().split()))\n\tans = []\n\tp1 = [-1] * (n + 1)\n\tfor i in range(n):\n\t\tp1[p[i]] = i\n\ti = n\n\twhile i:\n\t\twhile i > 0 and p1[i] == -1:\n\t\t\ti -= 1\n\t\telse:\n\t\t\tif i:\n\t\t\t\tk = 0\n\t\t\t\tfor j in range(p1[i], n):\n\t\t\t\t\tans.append(p[j])\n\t\t\t\t\tp1[p[j]] = -1\n\t\t\t\t\tk += 1\n\t\t\t\tn -= k\n\t\t\t\ti -= 1\n\t\t\telse:\n\t\t\t\tbreak\n\tprint(*ans)\n",
        "import sys\n\ndef get_ints():\n\treturn map(int, sys.stdin.readline().strip().split())\n\ndef get_list():\n\treturn list(map(int, sys.stdin.readline().strip().split()))\n\ndef get_list_string():\n\treturn list(map(str, sys.stdin.readline().strip().split()))\n\ndef get_string():\n\treturn sys.stdin.readline().strip()\n\ndef get_int():\n\treturn int(sys.stdin.readline().strip())\n\ndef get_print_int(x):\n\tsys.stdout.write(str(x) + '\\n')\n\ndef get_print(x):\n\tsys.stdout.write(x + '\\n')\n\ndef get_print_int_same(x):\n\tsys.stdout.write(str(x) + ' ')\n\ndef get_print_same(x):\n\tsys.stdout.write(x + ' ')\nfrom sys import maxsize\n\ndef solve():\n\tfor _ in range(get_int()):\n\t\tn = get_int()\n\t\tarr = get_list()\n\t\ti = n - 1\n\t\tj = n - 1\n\t\ttemp = sorted(arr)\n\t\tvis = [False] * n\n\t\tans = []\n\t\twhile j >= 0:\n\t\t\tt = j\n\t\t\ttt = []\n\t\t\twhile t >= 0 and arr[t] != temp[i]:\n\t\t\t\tvis[arr[t] - 1] = True\n\t\t\t\ttt.append(arr[t])\n\t\t\t\tt -= 1\n\t\t\tvis[arr[t] - 1] = True\n\t\t\ttt.append(arr[t])\n\t\t\ttt = tt[::-1]\n\t\t\tfor k in tt:\n\t\t\t\tans.append(k)\n\t\t\tj = t - 1\n\t\t\twhile i >= 0 and vis[i]:\n\t\t\t\ti -= 1\n\t\tget_print(' '.join(map(str, ans)))\nsolve()\n",
        ...
    ],
    "starter_code": "",
    "input_output": {
        "inputs": [
            "4\n4\n1 2 3 4\n5\n1 5 2 4 3\n6\n4 2 5 3 6 1\n1\n1\n",
            "4\n4\n2 1 3 4\n5\n1 5 2 4 3\n6\n4 2 5 3 6 1\n1\n1\n",
            "4\n4\n2 1 3 4\n5\n1 5 2 4 3\n6\n2 4 5 3 6 1\n1\n1\n",
            "4\n4\n1 2 3 4\n5\n1 5 2 4 3\n6\n4 2 5 3 6 1\n1\n1\n"
        ],
        "outputs": [
            "4 3 2 1\n5 2 4 3 1\n6 1 5 3 4 2\n1\n",
            "4 3 2 1\n5 2 4 3 1\n6 1 5 3 4 2\n1\n",
            "4 3 2 1\n5 2 4 3 1\n6 1 5 3 4 2\n1\n",
            "\n4 3 2 1\n5 2 4 3 1\n6 1 5 3 4 2\n1\n"
        ]
    },
    "difficulty": "EASY",
    "raw_tags": [
        "data structures",
        "greedy",
        "math"
    ],
    "name": null,
    "source": "codeforces",
    "tags": [
        "Data structures",
        "Mathematics",
        "Greedy algorithms"
    ],
    "skill_types": [
        "Data structures",
        "Greedy algorithms"
    ],
    "url": "https://codeforces.com/problemset/problem/1492/B",
    "Expected Auxiliary Space": null,
    "time_limit": "1 second",
    "date": "2021-02-23",
    "picture_num": "0",
    "memory_limit": "512 megabytes",
    "Expected Time Complexity": null
}
```

Each sample consists of a programming problem statement in English, several ground-truth Python solutions, and test cases defined by their inputs and outputs (plus a function name, if one is provided). It also carries metadata: the difficulty level (`difficulty`), the topics of the task (`raw_tags`), the algorithms involved (`tags`), the required programming skill types (`skill_types`), and the source of the problem.

If a sample has a non-empty `input_output` feature, you can read it as a dictionary with the keys `inputs` and `outputs`, and `fn_name` if it exists. Similarly, you can parse `solutions` into a list of solutions, as shown in the code above.
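The parsing step can also be written without `eval`. The following is a minimal sketch, not part of the official TACO tooling: the helper name `parse_sample` and the toy record are hypothetical, and it assumes the list-valued fields are stored as Python list literals, as the sample above suggests. It uses `ast.literal_eval`, which accepts only literals, and it tolerates empty fields.

```python
import ast
import json

# Fields stored as JSON strings and as Python-literal strings, respectively.
JSON_FIELDS = ("solutions", "input_output")
LIST_FIELDS = ("raw_tags", "tags", "skill_types")


def parse_sample(sample):
    """Return a copy of `sample` with its string-encoded fields decoded."""
    parsed = dict(sample)
    for key in JSON_FIELDS:
        raw = parsed.get(key)
        # Leave empty or missing fields as None instead of raising.
        parsed[key] = json.loads(raw) if raw else None
    for key in LIST_FIELDS:
        raw = parsed.get(key)
        # literal_eval only accepts Python literals, so it cannot run code.
        parsed[key] = ast.literal_eval(raw) if raw else []
    return parsed


# Hypothetical record shaped like a TACO row (not real dataset content).
row = {
    "solutions": '["print(input())"]',
    "input_output": '{"inputs": ["1\\n"], "outputs": ["1\\n"]}',
    "raw_tags": "['greedy']",
    "tags": "['Greedy algorithms']",
    "skill_types": "",
}
parsed = parse_sample(row)
print(parsed["input_output"]["inputs"])
```

With a real split, the same helper could be applied to each row, for example `parse_sample(next(iter(ds)))`.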
You can also filter the dataset by difficulty level (EASY, MEDIUM, MEDIUM_HARD, HARD, VERY_HARD) or by programming skill type (Amortized analysis, Bit manipulation, Complete search, Data structures, Dynamic programming, Greedy algorithms, Range queries, Sorting). Pass the selected difficulties or skills as a list. For example, to get the most challenging problems, select the VERY_HARD level:

```python
ds = load_dataset("BAAI/TACO", split="train", difficulties=["VERY_HARD"])
print(next(iter(ds))["question"])
```

```
#OUTPUT:
"""Let S(n) denote the number that represents the digits of n in sorted order. For example, S(1) = 1, S(5) = 5, S(50394) = 3459, S(353535) = 333555.

Given a number X, compute <image> modulo 109 + 7.

Input

The first line of input will contain the integer X (1 ≤ X ≤ 10700).

Output

Print a single integer, the answer to the question.

Examples

Input

21

Output

195

Input

345342

Output

390548434

Note

The first few values of S are 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 12. The sum of these values is 195.
"""
```

Or, if you want the problems involving Range queries and Sorting, select those two skills:

```python
ds = load_dataset("BAAI/TACO", split="train", skills=["Range queries", "Sorting"])
```

### Data Fields

|Field|Type|Description|
|---|---|---|
|question|string|problem description|
|solutions|string|JSON string holding a list of Python solutions|
|input_output|string|JSON string with the "inputs" and "outputs" of the test cases; may also include "fn_name", the name of the function|
|difficulty|string|difficulty level of the problem|
|picture_num|string|the number of pictures in the problem|
|source|string|the source of the problem|
|url|string|URL of the source of the problem|
|date|string|the date of the problem|
|starter_code|string|starter code to include in prompts|
|time_limit|string|the time limit for solving the problem|
|memory_limit|string|the memory limit for solving the problem|
|Expected Auxiliary Space|string|the extra auxiliary space expected to solve the problem|
|Expected Time Complexity|string|the time complexity expected to solve the problem|
|raw_tags|string|the topics of the programming task|
|tags|string|the manually annotated algorithms needed to solve the problem|
|skill_types|string|the mapped programming skill types needed to solve the problem|

### Data Splits

The dataset contains a train split with 25443 samples and a test split with 1000 samples.

### Dataset Statistics

* 26443 coding problems
* 1.55M verified solutions
* for the test split, the average number of test cases is 202.3
* all problems in the test split have ground-truth solutions

## Dataset Creation

To create the TACO dataset, the authors manually curated problems from open-access sites where programmers share problems with each other, including Aizu, AtCoder, CodeChef, Codeforces, CodeWars, GeeksforGeeks, HackerEarth, HackerRank, Kattis and LeetCode. For more details, please refer to the original paper.
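Returning to the `input_output` field from the Data Fields table: one way to use its test cases is to run a candidate program on each input and compare what it prints with the expected output. The sketch below is an assumption-laden illustration, not the official TACO evaluation. The function name `passes_tests`, the timeout value, and the toy problem are hypothetical; it covers only stdin/stdout problems (it ignores `fn_name`) and compares outputs token by token, so stray whitespace does not cause a failure. Since it executes arbitrary code, run it only in a sandboxed environment.

```python
import subprocess
import sys


def passes_tests(solution_code, input_output, timeout_s=10.0):
    """Run a stdin/stdout solution on each test case and compare outputs."""
    cases = zip(input_output["inputs"], input_output["outputs"])
    for stdin_text, expected in cases:
        try:
            result = subprocess.run(
                [sys.executable, "-c", solution_code],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False
        if result.returncode != 0:
            return False
        # Compare token by token so leading or trailing whitespace is ignored.
        if result.stdout.split() != expected.split():
            return False
    return True


# Hypothetical toy problem: print the sum of two integers.
toy_solution = "a, b = map(int, input().split())\nprint(a + b)\n"
toy_tests = {"inputs": ["1 2\n", "10 5\n"], "outputs": ["3\n", "\n15\n"]}
print(passes_tests(toy_solution, toy_tests))
```

The whitespace-tolerant comparison is a design choice for this sketch; a stricter checker could compare lines exactly.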
## License

The TACO dataset, authored by BAAI, Shandong Normal University and Peking University, is released under an [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). However, the data also includes content licensed under other permissive licenses such as the MIT License, as well as web-crawled data used under the terms of the CC BY 4.0 license ([Creative Commons Attribution 4.0 International license](https://creativecommons.org/licenses/by/4.0/legalcode)).

We gratefully acknowledge the contributions of the following:

* Some AtCoder, Codeforces, CodeWars, Kattis and LeetCode material is curated from the APPS dataset (https://github.com/hendrycks/apps).
* Some Aizu, AtCoder, CodeChef and Codeforces material is curated from the CodeContests dataset (https://github.com/google-deepmind/code_contests).
* Codeforces materials are sourced from http://codeforces.com.
* CodeChef materials are sourced from https://www.codechef.com.
* GeeksforGeeks materials are sourced from https://www.geeksforgeeks.org.
* HackerEarth materials are curated from the [Description2Code Dataset](https://github.com/ethancaballero/description2code), licensed under the [MIT open source license](https://opensource.org/licenses/MIT), copyright not specified.
* HackerRank materials are sourced from https://www.hackerrank.com. We do not know the legal rights or data licenses that apply to HackerRank content. Please contact us if a data license applies.

## Citation Information

If you find our data or code helpful, please cite [the original paper](https://arxiv.org/abs/2312.14852):

```
@article{li2023taco,
  title={TACO: Topics in Algorithmic COde generation dataset},
  author={Rongao Li and Jie Fu and Bo-Wen Zhang and Tao Huang and Zhihong Sun and Chen Lyu and Guang Liu and Zhi Jin and Ge Li},
  journal={arXiv preprint arXiv:2312.14852},
  year={2023}
}
```


Provider: maas

Created: 2024-09-12
