CASIA-IVA-Lab/SciVQR
Source: Hugging Face. Updated 2026-04-18. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/CASIA-IVA-Lab/SciVQR
Resource description:
---
license: mit
---
# SciVQR
## Dataset Details
### Dataset Description
We introduce SciVQR, a comprehensive multimodal benchmark for scientific reasoning in MLLMs. Covering 54 subfields across 6 core scientific domains (mathematics, physics, chemistry, geography, astronomy, and biology), SciVQR ensures broad disciplinary representation.
### Dataset Creation
The questions in our benchmark are manually collected from 15 academic competitions, 9 university-level and graduate-level exam sets, and 6 authoritative university textbooks. Based on the availability of annotation resources, all questions are categorized into three difficulty levels: easy, medium, and hard.
The construction of SciVQR follows a two-stage process. First, we gather questions from academic competitions, university exams, and authoritative textbooks across the six scientific domains. Then, we apply OCR techniques to extract textual content from the collected materials and store the extracted question text along with associated metadata such as image encodings and difficulty levels.
### Data Format
SciVQR is stored in Apache Parquet format for efficient storage and fast access.
Each row in the dataset corresponds to a single question and includes the following fields:
```
{
"pid": 182,
"question": "Each of the two curved rods shown in the picture form one quarter of a circle with a radius $R$. Both rods carry a uniformly distributed electric charge $+Q$. Which of the following choices correctly expresses the net electric field and net electric potential at the origin? Assume $\\mathrm{V} \\rightarrow 0$ as $\\mathrm{r} \\rightarrow \\infty$.",
"decoded_image": "<base64-encoded PNG image>",
"choices": [
"Electric Field : zero, Electric Potential : zero",
"Electric Field : zero, Electric Potential : $\\frac{2 k Q}{R}$",
"Electric Field : $\\frac{2 k Q}{R^2}$, Electric Potential : zero",
"Electric Field : $\\frac{\\sqrt{2 k Q}}{R^2}$, Electric Potential : $\\frac{2 k Q}{R}$",
"Electric Field : $\\frac{2 k Q}{R^2}$, Electric Potential : $\\frac{2 k Q}{R}$"
],
"answer": "Electric Field : zero, Electric Potential : $\\frac{2 k Q}{R}$",
"solution": "The electric fields are pointed in opposite directions $\\left(45^{\\circ}\\right.$ and $225^{\\circ}$ from the x -axis) and therefore cancel each other out. Since each arc is a collection of point charges located the same distance from the origin, then: $V=\\frac{k Q}{R}$. Both arcs create positive potentials, so $V=2\\left(\\frac{k Q}{R}\\right)$.",
"question_type": "multi-choice",
"level": "medium",
"sub-subject": "Electricity",
"subject": "physics"
}
```
- `pid` : Unique identifier for each question sample in the dataset.
- `question` : The main question text; may contain LaTeX math expressions.
- `decoded_image` : Base64-encoded PNG image providing visual context necessary to solve the question.
- `choices` : A list of multiple-choice answer options. For non-multiple-choice questions, this field may be null.
- `answer` : The correct answer string. For multi-choice questions, it matches exactly one of the entries in `choices`. For open questions, it is a free-form answer string.
- `solution` : Step-by-step explanation or reasoning leading to the correct answer.
- `question_type` : The type of question. One of: "multi-choice" or "open".
- `level` : Difficulty level of the question. One of: "easy", "medium", or "hard".
- `subject` : The high-level scientific discipline associated with the question, e.g., "physics", "chemistry", "math", "biology".
- `sub-subject` : A finer-grained subcategory within the subject field, e.g., "Electricity" under physics.
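The field descriptions above can be turned into a small consistency check. The following is a minimal sketch, not part of the official tooling: the helper name `check_record` is hypothetical, and the allowed values are taken only from the descriptions in this card.

```python
from typing import Any, List, Mapping

# Field names and allowed values as described in this card (assumed exhaustive).
EXPECTED_FIELDS = {
    "pid", "question", "decoded_image", "choices", "answer",
    "solution", "question_type", "level", "sub-subject", "subject",
}
QUESTION_TYPES = {"multi-choice", "open"}
LEVELS = {"easy", "medium", "hard"}


def check_record(row: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found in one row; an empty list means none."""
    problems = []
    missing = EXPECTED_FIELDS - set(row)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if row.get("question_type") not in QUESTION_TYPES:
        problems.append(f"unexpected question_type: {row.get('question_type')!r}")
    if row.get("level") not in LEVELS:
        problems.append(f"unexpected level: {row.get('level')!r}")
    # For multi-choice rows, the answer should be one of the listed choices.
    if row.get("question_type") == "multi-choice":
        choices = row.get("choices")
        if choices is None or row.get("answer") not in list(choices):
            problems.append("answer does not match any entry in choices")
    return problems
```

Calling `check_record(row)` on each row and collecting the non-empty results gives a quick view of any rows that deviate from the documented schema.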
### Modalities
This is a text + image multimodal dataset.
Each question includes:
- A textual prompt (`question`)
- A corresponding image (`decoded_image`)

Images are base64-encoded PNGs. Text fields are UTF-8 encoded (as per the Parquet standard). There are no audio, video, or table modalities.
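Since the image is described as a base64-encoded PNG, one way to sanity-check a row before handing it to an image library is to decode it and compare the leading bytes with the standard 8-byte PNG signature. This is a sketch using only the Python standard library; the helper name `decode_png_bytes` is hypothetical, and it assumes `decoded_image` is a plain base64 string as stated above.

```python
import base64
import binascii

# The fixed 8-byte signature that begins every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png_bytes(encoded: str) -> bytes:
    """Decode a base64 string and verify that the result looks like a PNG."""
    try:
        raw = base64.b64decode(encoded)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("decoded_image is not valid base64") from exc
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded bytes do not start with the PNG signature")
    return raw
```

The returned bytes can be wrapped in `io.BytesIO` and opened with an image library of your choice.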
## Usage Instructions
You can load the SciVQR dataset using the 🤗 datasets library:
```
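from datasets import load_dataset

# Repository ID as written in the original card; this page lists the dataset as CASIA-IVA-Lab/SciVQR.
dataset = load_dataset("l205/SciVQR", split="train")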
```
To visualize the image:
```
import base64
from io import BytesIO

from PIL import Image

# Decode the base64 string of the first row into raw bytes, then open it with Pillow.
img = Image.open(BytesIO(base64.b64decode(dataset[0]["decoded_image"])))
img.show()
```
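To see how questions are distributed, the `subject` and `level` fields can be tallied. The sketch below works on any iterable of dict-like rows and uses only the standard library; the helper name `count_by_subject_and_level` is hypothetical and not part of the dataset's tooling.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def count_by_subject_and_level(rows: Iterable[Mapping[str, Any]]) -> Counter:
    """Count rows per (subject, level) pair."""
    return Counter((row["subject"], row["level"]) for row in rows)


# Small illustrative rows (made up for this example, not taken from the dataset).
example_rows = [
    {"subject": "physics", "level": "medium"},
    {"subject": "physics", "level": "medium"},
    {"subject": "chemistry", "level": "hard"},
]
counts = count_by_subject_and_level(example_rows)
for (subject, level), n in sorted(counts.items()):
    print(f"{subject:10s} {level:6s} {n}")
```

The same function should accept the rows of a loaded split, although iterating a full split also reads the large image column for every row.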
Provided by:
CASIA-IVA-Lab



