
SpaceJudgeDataset

ModelScope community · Updated 2025-12-05 · Indexed 2025-10-11
Download link:
https://modelscope.cn/datasets/remyxai/SpaceJudgeDataset
Description:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/647777304ae93470ffc28913/J8yEuIlTuxQ09ryz5fXmA.png)

# SpaceJudge Dataset

The SpaceJudge Dataset uses [prometheus-vision](https://github.com/prometheus-eval/prometheus-vision) to apply a rubric assessing the quality of responses to spatial VQA inquiries on a 1-5 Likert scale by prompting [SpaceLLaVA](https://huggingface.co/remyxai/SpaceLLaVA) to perform VLM-as-a-Judge.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zOxSpMIjfWM6desF5Ai-iIk1szhlurUW?usp=sharing)

The assessment is made for images in the [OpenSpaces](https://huggingface.co/datasets/remyxai/OpenSpaces) dataset in order to distill the 13B VLM judge into smaller models like [Florence-2](https://huggingface.co/collections/microsoft/florence-6669f44df0d87d9c3bfb76de) by introducing a new `<JUDGE>` task.

## Citations

```
@misc{lee2024prometheusvision,
  title={Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation},
  author={Seongyun Lee and Seungone Kim and Sue Hyun Park and Geewook Kim and Minjoon Seo},
  year={2024},
  eprint={2401.06591},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
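
To make the `<JUDGE>` task described above more concrete, here is a minimal sketch of how one judged sample might be turned into a prompt/target pair for a smaller sequence-to-sequence student model. The dataset's actual column names and prompt layout are not stated here, so `JudgeExample`, its fields, and `to_training_pair` are hypothetical names chosen for illustration; only the `<JUDGE>` token and the 1-5 score range come from the description.

```python
from dataclasses import dataclass

# Task token named in the dataset description.
JUDGE_TASK = "<JUDGE>"


@dataclass
class JudgeExample:
    """One hypothetical distillation record; all field names are assumptions."""
    image_path: str  # path to the image the question refers to
    question: str    # spatial VQA question
    answer: str      # response being judged
    score: int       # 1-5 Likert rating assigned by the teacher judge


def to_training_pair(example: JudgeExample) -> tuple[str, str]:
    """Return (prompt, target) text; the prompt layout is an assumption."""
    if not 1 <= example.score <= 5:
        raise ValueError(f"score must be in 1..5, got {example.score}")
    prompt = f"{JUDGE_TASK}Question: {example.question} Answer: {example.answer}"
    return prompt, str(example.score)
```

The image itself would be passed to the student model separately through its image processor; this sketch covers only the text side of the pair.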

Provider:
maas
Created:
2025-10-09