
OGC_Quantum_Circuit_Papers

Source: ModelScope community. Updated 2025-12-05; indexed 2025-08-09.
Download link:
https://modelscope.cn/datasets/racineai/OGC_Quantum_Circuit_Papers
Description:
# **VDR_Quantum_Circuit_Papers – Overview**

**VDR_Quantum_Circuit_Papers** is a curated dataset focused on **quantum circuits** and **quantum gates**, extracted exclusively from scientific research papers. This dataset emphasizes documents that contain **circuit diagrams**, **matrix-based explanations**, and detailed discussions of quantum operations.

---

## **Dataset Composition**

This dataset was created using our open-source tool **[VDR_pdf-to-parquet](https://github.com/RacineAIOS/VDR_pdf-to-parquet)**.

Scientific PDFs were sourced from public online sources. Each document was selected based on its focus on **quantum circuits**, with visual and mathematical representations. The processing pipeline extracted:

* High-resolution **images** of quantum circuit diagrams
* Accompanying **textual content** such as explanations, equations, and operations
* Structured data for multimodal analysis and downstream tasks

We used **Google’s Gemini 2.5 Pro** model in a custom pipeline to generate diverse, expert-level questions that align with the content of each page.

---

## **Dataset Structure**

Each sample in the dataset includes:

* **`id`**: A unique identifier for each entry
* **`query`**: A synthetic technical question generated from that page
* **`image`**: A rendered image of the PDF page
* **`language`**: Detected language of the extracted text

---

## **Purpose**

This dataset is designed to support:

* **Training and evaluating vision-language models** on technical quantum content (especially quantum circuits)
* **Multimodal document understanding and retrieval** for quantum computing
* **Recognition and analysis of quantum circuits** in scientific literature
* **Research in automated extraction and interpretation of circuit diagrams** and related explanations

---

## **Creators**

* **Yumeng YE**
* **Léo APPOURCHAUX**
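
---

## **Example: Working with a Sample**

The four fields listed under Dataset Structure can be sketched as a small typed record. This is a minimal illustration, not the creators' code: the `Sample` class, the `parse_sample` and `group_queries_by_language` helpers, and the example record are all hypothetical, and it assumes that `id` is a string and that `image` arrives as encoded image bytes (the actual storage types are not stated in this card).

```python
from dataclasses import dataclass

# Field names taken from the "Dataset Structure" section of this card.
REQUIRED_FIELDS = ("id", "query", "image", "language")


@dataclass(frozen=True)
class Sample:
    """Hypothetical in-memory view of one dataset row."""
    id: str        # assumption: identifiers are strings
    query: str     # synthetic technical question for the page
    image: bytes   # assumption: encoded bytes of the rendered PDF page
    language: str  # detected language of the extracted text


def parse_sample(record: dict) -> Sample:
    """Build a Sample from a raw row, rejecting rows with missing fields."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"record is missing fields: {missing}")
    # Extra keys in the raw row are ignored on purpose.
    return Sample(**{name: record[name] for name in REQUIRED_FIELDS})


def group_queries_by_language(samples):
    """Collect queries per detected language, e.g. to build per-language eval sets."""
    grouped = {}
    for sample in samples:
        grouped.setdefault(sample.language, []).append(sample.query)
    return grouped


if __name__ == "__main__":
    # Made-up row for illustration only; not taken from the dataset.
    row = {
        "id": "example-0001",
        "query": "Which gate sequence implements the circuit shown on this page?",
        "image": b"\x89PNG placeholder bytes",
        "language": "en",
    }
    sample = parse_sample(row)
    print(group_queries_by_language([sample]))
```

Grouping by `language` is one plausible first step for the retrieval use case named under Purpose, since queries and page images can then be evaluated per language.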

Provider:
maas
Created:
2025-08-08