CAD-bench/cad-bench-ed-2026-anonymous

Name: CAD-bench/cad-bench-ed-2026-anonymous
Creator: CAD-bench
Published: 2026-04-30 00:19:57
License: No description available

Hugging Face, updated 2026-04-30, indexed 2026-05-03

Download link:

https://hf-mirror.com/datasets/CAD-bench/cad-bench-ed-2026-anonymous

Description:

This dataset contains the public task payloads for CAD-bench, an executable benchmark for language-model CAD agents. Each task directory includes prompt.txt (the natural-language benchmark prompt), task.toml (task metadata, difficulty, evaluator name, and expected values), gold.py (a reference Build123D solution used for validation and media generation), and optional fixtures such as STEP files or Blender simulation scripts. The dataset also includes results/cad-bench-reported-results.json, a compact reviewer-facing result artifact containing every complete 17-task row from the CAD-bench website payload used by the paper tables. The dataset is intended for use with the CAD-bench runtime to evaluate CAD code-generation or agentic CAD systems. All data is synthetic: the CAD prompts and benchmark metadata were authored for this benchmark and contain no personal data or human-subject records. Part of the dataset is released under the MIT license, and part is subject to commercial terms. Limitations: the benchmark has only 17 tasks, and some simple geometry tasks are close to solved by current models, while functional assembly tasks remain difficult.

Provider:

CAD-bench