pixmo-ask-model-anything

Name: pixmo-ask-model-anything
Creator: maas
Published: 2025-12-05 16:36:14
License: No description available

ModelScope community. Updated: 2025-12-05. Indexed: 2025-02-15.

Download link:

https://modelscope.cn/datasets/allenai/pixmo-ask-model-anything

Resource overview:

# PixMo-AskModelAnything

PixMo-AskModelAnything is an instruction-tuning dataset for vision-language models. It contains human-authored question-answer pairs about diverse images, with long-form answers.

PixMo-AskModelAnything is part of the [PixMo dataset collection](https://huggingface.co/collections/allenai/pixmo-674746ea613028006285687b) and was used to train the [Molmo family of models](https://huggingface.co/collections/allenai/molmo-66f379e6fe3b8ef090a8ca19).

Quick links:
- 📃 [Paper](https://molmo.allenai.org/paper.pdf)
- 🎥 [Blog with Videos](https://molmo.allenai.org/blog)

## Loading

```python
import datasets

data = datasets.load_dataset("allenai/pixmo-ask-model-anything", split="train")
```

## Data Format

Each row contains an image URL and a Q/A pair. Note that image URLs can be repeated, since many images have multiple Q/A pairs.

## Image Checking

Image hashes are included to support double-checking that the downloaded image matches the annotated image. A row can be checked like this:

```python
from hashlib import sha256
import requests

# Download the image for the first row and compare its SHA-256 digest
# with the hash stored in the dataset.
example = data[0]
image_bytes = requests.get(example["image_url"]).content
byte_hash = sha256(image_bytes).hexdigest()
assert byte_hash == example["image_sha256"]
```

## License

This dataset is licensed under ODC-BY-1.0. It is intended for research and educational use in accordance with Ai2's [Responsible Use Guidelines](https://allenai.org/responsible-use). This dataset includes data generated from Claude, which is subject to Anthropic's [terms of service](https://www.anthropic.com/legal/commercial-terms) and [usage policy](https://www.anthropic.com/legal/aup).
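
Because image URLs repeat across rows, it may be convenient to group rows by URL so that each image is downloaded and hash-checked only once. The following is a minimal sketch of that idea, not part of the official dataset tooling: the helper names `group_rows_by_image` and `image_matches_hash` are hypothetical, the sample rows are made-up stand-ins rather than real data, and only the `image_url` and `image_sha256` fields mentioned above are assumed to exist.

```python
from collections import defaultdict
from hashlib import sha256


def group_rows_by_image(rows):
    """Map each image URL to the indices of the rows that reference it."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[row["image_url"]].append(index)
    return dict(groups)


def image_matches_hash(image_bytes, expected_sha256):
    """Return True when the bytes hash to the annotated SHA-256 digest."""
    return sha256(image_bytes).hexdigest() == expected_sha256


# Tiny in-memory stand-in for dataset rows (hypothetical values, not real data).
fake_bytes = b"not a real image"
rows = [
    {"image_url": "https://example.com/a.jpg", "image_sha256": sha256(fake_bytes).hexdigest()},
    {"image_url": "https://example.com/a.jpg", "image_sha256": sha256(fake_bytes).hexdigest()},
    {"image_url": "https://example.com/b.jpg", "image_sha256": sha256(b"other").hexdigest()},
]

groups = group_rows_by_image(rows)
print(len(groups), "unique image URLs for", len(rows), "rows")
print(image_matches_hash(fake_bytes, rows[0]["image_sha256"]))
```

With the real dataset, the same helpers could be applied to the loaded rows, fetching each unique URL once (for example with `requests.get(url).content`) and passing the bytes to `image_matches_hash`. Returning a boolean instead of asserting makes it possible to skip images whose content has changed or is no longer available, rather than stopping at the first mismatch.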

Provider:

maas

Created:

2025-05-27

Collection summary

Dataset introduction