CRPE

Name: CRPE
Creator: maas
Published: 2025-11-12 16:19:20
License: no description provided

Source: ModelScope community. Updated: 2025-11-12. Indexed: 2024-12-28.

Download link:

https://modelscope.cn/datasets/OpenGVLab/CRPE

Resource description:

# Circular-based Relation Probing Evaluation (CRPE)

CRPE is a benchmark designed to quantitatively evaluate the object recognition and relation comprehension abilities of models. The evaluation is formulated as single-choice questions.

The benchmark consists of four splits: **Existence**, **Subject**, **Predicate**, and **Object**.

The **Existence** split evaluates object recognition, while the remaining splits evaluate relation comprehension by probing each element of the relation triplet `(subject, predicate, object)` separately. Some data examples are shown below.

<img width="800" alt="image" src="https://cdn-uploads.huggingface.co/production/uploads/619507e7b74b6c591f794340/_NKaowl2OUBAjck1XCAPm.jpeg">

Additionally, to evaluate the dependency on language priors, we also include abnormal data in our evaluation. The images in this abnormal data depict relation triplets that are very rare in the real world.

<img width="800" alt="image" src="https://cdn-uploads.huggingface.co/production/uploads/619507e7b74b6c591f794340/qKWw7Qb93OXClxI_VrCRk.jpeg">

For a robust evaluation, we adopt CircularEval as our evaluation strategy. Under this setting, a question is considered correctly answered only when the model predicts the correct answer in every one of the N iterations, where N is the number of choices. In each iteration, a circular shift is applied to both the choices and the answer to form a new query for the model.

CRPE contains the following files:

- `crpe_exist.jsonl`: the evaluation data of the **Existence** split.
- `crpe_exist_meta.jsonl`: the evaluation data of the **Existence** split without CircularEval.
- `crpe_relation.jsonl`: the evaluation data of the **Subject**, **Predicate**, and **Object** splits.
- `crpe_relation_meta.jsonl`: the evaluation data of the **Subject**, **Predicate**, and **Object** splits without CircularEval.
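
The CircularEval rule described above can be illustrated with a minimal Python sketch. It is not the official evaluation script: the function names, the letter-based answer format, and the `predict` callable are assumptions made for illustration, and the rotation direction is an arbitrary choice. Because the non-`_meta` files are described as already having CircularEval applied, a real pipeline would more likely group pre-shifted records by question than generate the shifts itself.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed answer format: the model replies with an option letter.
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def circular_variants(
    choices: Sequence[str], answer_idx: int
) -> List[Tuple[List[str], int]]:
    """Return the N circularly shifted (choices, answer index) pairs
    for a question with N choices."""
    n = len(choices)
    variants = []
    for shift in range(n):
        # Rotate the choice list right by `shift` positions.
        shifted = [choices[(i - shift) % n] for i in range(n)]
        # The correct choice moves by the same offset.
        variants.append((shifted, (answer_idx + shift) % n))
    return variants


def circular_eval(
    question: str,
    choices: Sequence[str],
    answer_idx: int,
    predict: Callable[[str, List[str]], str],  # hypothetical model wrapper
) -> bool:
    """Count a question as correct only if every shifted variant is answered correctly."""
    for shifted, shifted_answer in circular_variants(choices, answer_idx):
        if predict(question, shifted) != LETTERS[shifted_answer]:
            return False
    return True


# Two toy "models": one that always picks option A, and one that
# always locates the correct choice text regardless of its position.
def always_a(question: str, options: List[str]) -> str:
    return "A"


def oracle(question: str, options: List[str]) -> str:
    return LETTERS[options.index("cat")]
```

With the choices `["cat", "dog", "car"]` and the correct answer at index 0, `always_a` is right on the unshifted query only, so `circular_eval` rejects it, while `oracle` passes all three iterations. This is the sense in which CircularEval filters out answers that depend on option position rather than on content.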
**NOTE**: You should use `crpe_exist.jsonl` and `crpe_relation.jsonl` for evaluation. The evaluation script is available [here](https://github.com/OpenGVLab/all-seeing/blob/main/all-seeing-v2/llava/eval/eval_crpe.py).

See our [project](https://github.com/OpenGVLab/all-seeing/all-seeing-v2) for more details!

# Citation

If you find our work useful in your research, please consider citing:

```BibTeX
@article{wang2023allseeing,
  title={The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World},
  author={Wang, Weiyun and Shi, Min and Li, Qingyun and Wang, Wenhai and Huang, Zhenhang and Xing, Linjie and Chen, Zhe and Li, Hao and Zhu, Xizhou and Cao, Zhiguo and others},
  journal={arXiv preprint arXiv:2308.01907},
  year={2023}
}
@article{wang2024allseeing_v2,
  title={The All-Seeing Project V2: Towards General Relation Comprehension of the Open World},
  author={Wang, Weiyun and Ren, Yiming and Luo, Haowen and Li, Tiantong and Yan, Chenxiang and Chen, Zhe and Wang, Wenhai and Li, Qingyun and Lu, Lewei and Zhu, Xizhou and others},
  journal={arXiv preprint arXiv:2402.19474},
  year={2024}
}
```

Provider: maas

Created: 2024-12-26
