ExeCAD
Source: ModelScope community · Updated 2025-12-30 · Indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/zhuofanChen/ExeCAD
Resource description:
# ExeCAD Dataset
## Overview
ExeCAD is a large-scale, semantically aligned benchmark dataset tailored for executable and editable Computer-Aided Design (CAD) code generation. It addresses the lack of high-quality, industry-relevant data in current CAD automation research, providing **16,540 high-fidelity samples** to support training and evaluation of multimodal text-to-CAD systems.
## Core Features
1. **Multimodal Alignment**: Ensures strict consistency across natural language, structured design specs, executable CAD code, and 3D geometry via a bidirectional semantic refinement pipeline.
2. **Dual Input Modalities**: Supports both non-expert-friendly natural language descriptions and expert-oriented structured design language.
3. **Practical Outputs**: Each sample includes executable CadQuery (Python-based) code and a 3D model in STEP format, which serves as ground truth for geometric accuracy evaluation.
4. **Industrial-Grade Quality**: Validated for code executability and parametric precision, meeting strict industrial design standards.
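
The card does not say how code executability was validated. As a rough illustration only, one common approach is to run each script in a separate Python process and treat a zero exit code as success. The function name `is_executable` and the 30-second timeout below are assumptions, not part of ExeCAD.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def is_executable(script: str, timeout_s: float = 30.0) -> bool:
    """Return True if `script` runs to completion with exit code 0.

    Hypothetical helper: the script runs in a child interpreter inside a
    temporary directory, so crashes, syntax errors, and stray output files
    do not affect the caller.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script_path = Path(workdir) / "candidate.py"
        script_path.write_text(script, encoding="utf-8")
        try:
            completed = subprocess.run(
                [sys.executable, str(script_path)],
                cwd=workdir,
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # A script that never finishes counts as not executable.
            return False
    return completed.returncode == 0
```

To check a `cadquery_code` string this way, CadQuery must be installed in the interpreter that runs the child process. Passing this check only shows that the script runs, not that the resulting geometry is correct.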
## Sample Structure
Each sample contains 4 key components plus a unique identifier:
| Field | Type | Description |
|---------------------|------------|-----------------------------------------------------------------------------|
| `natural_language` | String | Non-expert text describing design intent |
| `structured_design` | String | Expert specs with precise constraints |
| `cadquery_code` | String | Executable CadQuery script for parametric modeling |
| `3d_model` | File Path | Ground-truth 3D model (STEP) for geometric validation |
| `id` | String | Unique hierarchical identifier |
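
A minimal sketch of how one sample could be represented in Python, assuming each sample arrives as a dictionary keyed by the field names in the table. The class `ExeCADSample`, the function `parse_sample`, and the attribute name `model_path` are hypothetical; the card does not specify the on-disk format or a loader.

```python
from dataclasses import dataclass

# Field names taken from the table above.
REQUIRED_FIELDS = (
    "id",
    "natural_language",
    "structured_design",
    "cadquery_code",
    "3d_model",
)


@dataclass(frozen=True)
class ExeCADSample:
    id: str
    natural_language: str
    structured_design: str
    cadquery_code: str
    # Holds the `3d_model` field, which is not a valid Python identifier.
    model_path: str


def parse_sample(record: dict) -> ExeCADSample:
    """Validate a raw record and convert it to an ExeCADSample."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    return ExeCADSample(
        id=record["id"],
        natural_language=record["natural_language"],
        structured_design=record["structured_design"],
        cadquery_code=record["cadquery_code"],
        model_path=record["3d_model"],
    )
```

Checking for missing fields up front makes a malformed record fail with a clear message instead of surfacing later during training or evaluation.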
## Use Cases
- Training multimodal models to generate executable CAD code from text/image inputs.
- Benchmarking model performance on geometric accuracy (IoU, Chamfer Distance) and code executability.
- Research on cross-modal semantic alignment between natural language and 3D geometry.
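
The two geometric metrics named above can be sketched as follows. This is an illustration under stated assumptions, not the benchmark's evaluation code: shapes are assumed to be already voxelized into sets of integer grid coordinates for IoU and sampled into point lists for Chamfer Distance, and the Chamfer convention used here (sum of the two mean squared nearest-neighbor distances) is one of several in use.

```python
from typing import Iterable, Sequence, Set, Tuple

Voxel = Tuple[int, int, int]
Point = Sequence[float]


def voxel_iou(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Intersection over Union of two occupied-voxel sets."""
    union = a | b
    if not union:
        # Two empty shapes are treated as identical.
        return 1.0
    return len(a & b) / len(union)


def _mean_nearest_sq(src: Sequence[Point], dst: Sequence[Point]) -> float:
    """Mean squared distance from each point in src to its nearest point in dst."""
    total = 0.0
    for x in src:
        total += min(sum((xi - yi) ** 2 for xi, yi in zip(x, y)) for y in dst)
    return total / len(src)


def chamfer_distance(p: Sequence[Point], q: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance between two non-empty point sets."""
    if not p or not q:
        raise ValueError("both point sets must be non-empty")
    return _mean_nearest_sq(p, q) + _mean_nearest_sq(q, p)
```

The brute-force nearest-neighbor search is quadratic in the number of points, which is fine for small samples; a spatial index would be the usual replacement for dense point clouds.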
Provider:
maas
Created:
2025-12-04



