
ChemVQA-2K: A Visual Question Answering Dataset for Molecular Understanding

Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/8bspthn5fb
Description:
📘 Overview
ChemVQA-2K is a novel Visual Question Answering (VQA) dataset designed to bridge chemistry and multimodal AI. It contains approximately 2,000 high-resolution molecular images (512×512) generated from valid SMILES strings, accompanied by 10 structured Q&A pairs per molecule, resulting in ~20,000 image-question-answer triplets. Each image represents a 2D chemical structure rendered using RDKit, while each question tests the model’s ability to reason over molecular features such as formula, atom counts, bonds, functional groups, and polarity.
________________________________________
🧬 Dataset Structure
Component                Description
ChemVQA_2K_images.zip    2,000 molecule renderings (mol_0.png, mol_1.png, …)
ChemVQA_2K_full.csv      Complete dataset with columns: id, image_name, question, answer

Each record follows:
{
  "id": "mol_123",
  "image_name": "mol_123.png",
  "question": "What is the molecular formula of this molecule?",
  "answer": "C6H6O2"
}
________________________________________
🔍 Example Questions
Each molecule has multiple Q&A pairs, e.g.:
Question                                           Example Answer
What is the molecular formula of this molecule?    C₂H₅OH
What is the molecular weight?                      46.07 g/mol
How many total atoms are present?                  9
Which functional groups are present?               Alcohol
Is the molecule polar or non-polar?                Polar
________________________________________
⚙️ Data Generation Process
• Molecules generated by concatenating random organic fragments and validated using RDKit.
• Each molecule’s image created with Draw.MolToFile() at 512×512 px resolution.
• Functional groups detected via SMARTS pattern matching.
• Q&A pairs auto-generated from chemical descriptors (MolWt, CalcMolFormula, substructure matches).
________________________________________
🚀 Intended Use
ChemVQA-2K is ideal for:
• Fine-tuning Vision-Language Models (VLMs) for scientific visual reasoning.
• Developing chemistry-aware question answering systems.
• Training vision encoders on molecular visual patterns.
• Exploring RL-based visual understanding of chemical structures.
________________________________________
📊 Dataset Statistics
Property                      Value
Images                        1924
Image resolution              512×512 px
Q&A pairs                     19240
Functional groups detected    16
File size (approx.)           ~25 MB (images + CSVs)

🧠 Potential Research Directions
• Multimodal Chemistry Understanding: connecting visual structure with symbolic reasoning.
• Scientific Vision-Language Pretraining: use as domain-specific VQA benchmark.
• Explainable Chemistry AI: models that describe functional features and molecular properties.

📂 File Organization
ChemVQA-2K/
├── images/
│   ├── mol_0.png
│   ├── mol_1.png
│   └── ...
├── ChemVQA_2K_full.csv
________________________________________
✅ ChemVQA-2K Dataset Benefits the Chemistry Community
Bridges Chemistry and AI Literacy
• Helps chemistry students and researchers learn to interact with AI systems
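
As a rough illustration of the record layout given under Dataset Structure, the sketch below groups question-answer pairs by image using only the Python standard library. It assumes the CSV has a header row with exactly the columns id, image_name, question, answer, and that each molecule's rows repeat the same image_name; the function names and the in-memory sample rows are hypothetical stand-ins, not part of the dataset.

```python
import csv
import io
from collections import defaultdict


def load_qa_by_image(csv_file):
    """Group (question, answer) pairs by image_name.

    csv_file is any open text file whose header row is
    id,image_name,question,answer (assumed from the description).
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row["image_name"]].append((row["question"], row["answer"]))
    return dict(grouped)


def images_with_unexpected_count(grouped, expected=10):
    """Return image names whose number of Q&A pairs differs from `expected`."""
    return sorted(name for name, pairs in grouped.items() if len(pairs) != expected)


# Tiny in-memory stand-in for ChemVQA_2K_full.csv.
# Illustrative rows only; these are not real dataset entries.
SAMPLE_CSV = """id,image_name,question,answer
mol_123,mol_123.png,What is the molecular formula of this molecule?,C6H6O2
mol_123,mol_123.png,Is the molecule polar or non-polar?,Polar
"""

if __name__ == "__main__":
    grouped = load_qa_by_image(io.StringIO(SAMPLE_CSV))
    for image_name, pairs in grouped.items():
        print(image_name, len(pairs), "Q&A pairs")
```

To run this against the real file, one would presumably pass `open("ChemVQA_2K_full.csv", newline="", encoding="utf-8")` instead of the in-memory sample (the encoding is an assumption). `images_with_unexpected_count` is a simple sanity check against the stated 10 Q&A pairs per molecule.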
Created:
2025-10-27