pixmo-point-explanations

Source: ModelScope community. Updated 2025-11-27; indexed 2025-02-15.
Download link:
https://modelscope.cn/datasets/allenai/pixmo-point-explanations
# PixMo-Point-Explanations

PixMo-Point-Explanations is a dataset of images, questions, and answers with explanations that can include in-line points that refer to parts of the image. It can be used to train vision language models to respond to questions through a mixture of text and points.

PixMo-Point-Explanations is part of the [PixMo dataset collection](https://huggingface.co/collections/allenai/pixmo-674746ea613028006285687b) and was used to train the [Molmo family of models](https://huggingface.co/collections/allenai/molmo-66f379e6fe3b8ef090a8ca19).

We consider this dataset experimental: while these explanations can be very informative, we have also seen that models can hallucinate more when generating outputs of this sort. For that reason, the Molmo models are trained to generate outputs like this only when specifically requested by prefixing input questions with "point_qa:". This mode can be used in the [Molmo demo](https://multimodal-29mpz7ym.vercel.app/share/2921825e-ef44-49fa-a6cb-1956da0be62a).

Quick links:
- 📃 [Paper](https://molmo.allenai.org/paper.pdf)
- 🎥 [Blog with Videos](https://molmo.allenai.org/blog)

## Loading

```python
import datasets

# Requesting a split returns a Dataset that can be indexed by row;
# the split name "train" is assumed here.
data = datasets.load_dataset("allenai/pixmo-point-explanations", split="train")
```

## Data Format

Images are stored as URLs.

The in-line points use a format from the LLM/annotators that does not exactly match the Molmo format. The data includes some fields derived from these responses to make them easier to parse; these fields can be null if the original response could not be parsed.

- `parsed_response` responses with the text "<|POINT|>" where the inline point annotations were
- `alt_text` the alt text for each point annotation in the response
- `inline_text` the inline text for each point annotation in the response
- `points` the list-of-list of points for each point annotation

## Checking Image Hashes

Image hashes are included to support double-checking that the downloaded image matches the annotated image. They can be checked like this:

```python
from hashlib import sha256
import requests

# Take one row of the split loaded above.
example = data[0]
# Download the raw image bytes from the stored URL.
image_bytes = requests.get(example["image_url"]).content
# Hash the bytes and compare with the hash recorded in the dataset.
byte_hash = sha256(image_bytes).hexdigest()
assert byte_hash == example["image_sha256"]
```

## License

This dataset is licensed under ODC-BY-1.0. It is intended for research and educational use in accordance with Ai2's [Responsible Use Guidelines](https://allenai.org/responsible-use). This dataset includes data generated from Claude, which is subject to Anthropic's [terms of service](https://www.anthropic.com/legal/commercial-terms) and [usage policy](https://www.anthropic.com/legal/aup).
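
## Example: Using the Derived Fields

The derived fields described under Data Format can be recombined into readable text. The sketch below is illustrative only. It assumes that each "<|POINT|>" placeholder in `parsed_response` corresponds, in order, to one entry of `inline_text` and one entry of `points`, and it treats each point as an opaque value because the exact point representation is not documented here. The helper name `render_response` and the sample record are hypothetical and are not taken from the dataset.

```python
POINT_TOKEN = "<|POINT|>"


def render_response(example):
    """Replace each <|POINT|> placeholder with its inline text and point count.

    Returns None when the derived fields are null (unparsed response).
    """
    parsed = example.get("parsed_response")
    if parsed is None:
        # Derived fields can be null if the original response was not parsed.
        return None

    segments = parsed.split(POINT_TOKEN)
    inline_texts = example["inline_text"]
    points = example["points"]

    # Assumption: one placeholder per annotation, in the same order.
    if len(segments) - 1 != len(inline_texts) or len(inline_texts) != len(points):
        raise ValueError("placeholder count does not match annotation count")

    out = [segments[0]]
    for text, pts, tail in zip(inline_texts, points, segments[1:]):
        out.append(f"{text} [{len(pts)} point(s)]")
        out.append(tail)
    return "".join(out)


# Hypothetical record, made up for illustration (not real dataset content).
sample = {
    "parsed_response": "The <|POINT|> is next to the <|POINT|>.",
    "alt_text": ["red mug", "two spoons"],
    "inline_text": ["mug", "spoons"],
    "points": [[(10.0, 20.0)], [(30.0, 40.0), (35.0, 42.0)]],
}

print(render_response(sample))
```

Checking for a null `parsed_response` first keeps unparsed rows from raising errors, and the length check surfaces any row where the alignment assumption does not hold instead of silently misplacing text.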

Provider:
maas
Created:
2025-05-27