
VisRAG-Ret-Test-InfoVQA

ModelScope community. Updated 2025-12-18. Indexed 2025-05-17.
Download link:
https://modelscope.cn/datasets/OpenBMB/VisRAG-Ret-Test-InfoVQA
Resource description:
## Dataset Description

This is a VQA dataset built on infographics from the [InfoVQA](https://www.docvqa.org/datasets/infographicvqa) dataset.

### Load the dataset

```python
import csv

from datasets import load_dataset


def load_beir_qrels(qrels_file):
    # Read a tab-separated qrels file into {query_id: {corpus_id: relevance}}.
    qrels = {}
    with open(qrels_file) as f:
        tsvreader = csv.DictReader(f, delimiter="\t")
        for row in tsvreader:
            qid = row["query-id"]
            pid = row["corpus-id"]
            rel = int(row["score"])
            if qid in qrels:
                qrels[qid][pid] = rel
            else:
                qrels[qid] = {pid: rel}
    return qrels


corpus_ds = load_dataset("openbmb/VisRAG-Ret-Test-InfoVQA", name="corpus", split="train")
queries_ds = load_dataset("openbmb/VisRAG-Ret-Test-InfoVQA", name="queries", split="train")

qrels_path = "xxxx"  # path to the qrels file, which can be found under the qrels folder in the repo
qrels = load_beir_qrels(qrels_path)
```
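The loader above returns a nested dictionary of relevance judgments, which is the shape most retrieval metrics expect. As a minimal sketch of how such a dictionary might be used, the example below computes recall@k on toy data. The toy TSV, the `parse_qrels` and `recall_at_k` helpers, and the `toy_run` ranked lists are all hypothetical illustrations and not part of the dataset or its repository. Only the column names (`query-id`, `corpus-id`, `score`) come from the loading code.

```python
import csv
import io

# Hypothetical toy qrels in the tab-separated layout that load_beir_qrels reads:
# a header row with "query-id", "corpus-id", and "score" columns.
TOY_QRELS_TSV = "query-id\tcorpus-id\tscore\nq1\tdocA\t1\nq1\tdocB\t1\nq2\tdocC\t1\n"


def parse_qrels(handle):
    """Build the same nesting as load_beir_qrels: {query_id: {corpus_id: relevance}}."""
    qrels = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        qrels.setdefault(row["query-id"], {})[row["corpus-id"]] = int(row["score"])
    return qrels


def recall_at_k(qrels, run, k):
    """Mean over queries of the fraction of relevant documents found in the top k."""
    per_query = []
    for qid, judged in qrels.items():
        relevant = {pid for pid, rel in judged.items() if rel > 0}
        if not relevant:
            continue  # skip queries with no positive judgments
        top_k = run.get(qid, [])[:k]
        per_query.append(len(relevant.intersection(top_k)) / len(relevant))
    return sum(per_query) / len(per_query) if per_query else 0.0


toy_qrels = parse_qrels(io.StringIO(TOY_QRELS_TSV))

# Hypothetical ranked lists (best first), standing in for a retriever's output.
toy_run = {"q1": ["docA", "docX", "docB"], "q2": ["docY", "docC"]}

print(recall_at_k(toy_qrels, toy_run, k=1))
print(recall_at_k(toy_qrels, toy_run, k=2))
```

With real data, `toy_qrels` would be replaced by the `qrels` dictionary loaded from the repository file, and `toy_run` by ranked corpus ids produced by whichever retriever is being evaluated.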

Provider:
maas
Created:
2025-05-15