blip3o-caption-mini-arrow

ModelScope community. Updated: 2026-01-07. Indexed: 2025-07-05.
Download link:
https://modelscope.cn/datasets/prithivMLmods/blip3o-caption-mini-arrow
Resource description:
# **blip3o-caption-mini-arrow**

**blip3o-caption-mini-arrow** is a high-quality, curated image-caption dataset derived and optimized from the original [BLIP3o/BLIP3o-Pretrain-Long-Caption](https://huggingface.co/datasets/BLIP3o/BLIP3o-Pretrain-Long-Caption). This dataset is specifically filtered and processed for tasks involving long-form image captioning and vision-language understanding.

## Overview

* **Total Samples**: 91,600
* **Modality**: Image ↔ Text
* **Format**: Arrow (auto-converted to Parquet)
* **License**: Apache 2.0
* **Language**: English
* **Size**: ~4.5 GB

## Dataset Structure

| Field   | Type   | Description                                     |
| ------- | ------ | ----------------------------------------------- |
| image   | image  | Input image (stored in binary format)           |
| caption | string | Descriptive caption for the image (long format) |

## Quick start with Datasets🤗

```
pip install datasets
```

```py
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("prithivMLmods/blip3o-caption-mini-arrow", split="train")

# View a sample
print(dataset[0])
```

## Example Entries

1. **Image**: A religious statue
   **Caption**: *The image depicts a religious figure adorned in elaborate, ornate attire, likely a statue or icon of a saint or Virgin Mary...*

2. **Image**: A historic building with a clock tower
   **Caption**: *The image captures a grand, historic building under a clear blue sky. The structure features ornate architectural details...*

3. **Image**: A vibrant South Asian temple
   **Caption**: *The image depicts the entrance of a vibrant and ornate temple, likely of South Asian origin...*

## Use Cases

This dataset is ideal for:

* Training image captioning models
* Evaluating visual grounding and long-text generation
* Multi-modal representation learning
* Fine-tuning vision-language models like BLIP, Flamingo, or IDEFICS

## Citation

If you use this dataset, please consider citing the original BLIP3o dataset and linking to this derivative version.
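Since the `caption` field holds long-form text, it may be worth checking caption lengths before training or fine-tuning. The following is a minimal sketch, not part of the dataset card: the helper name `caption_word_stats` is hypothetical, and it assumes that whitespace-separated word counts are an adequate proxy for caption length.

```py
from statistics import mean
from typing import Iterable


def caption_word_stats(captions: Iterable[str]) -> dict:
    """Summarize whitespace-separated word counts over an iterable of captions.

    Hypothetical helper for illustration; it is not provided by the dataset
    or by the `datasets` library.
    """
    lengths = [len(caption.split()) for caption in captions]
    if not lengths:
        # No captions supplied: return an all-zero summary instead of raising.
        return {"count": 0, "min": 0, "max": 0, "mean": 0.0}
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
    }


# Possible usage with the dataset loaded as in the quick start above
# (requires network access and the `datasets` package):
#
#   from datasets import load_dataset
#   dataset = load_dataset("prithivMLmods/blip3o-caption-mini-arrow", split="train")
#   print(caption_word_stats(dataset["caption"]))
```

Reading the `caption` column directly, rather than iterating over full rows, avoids decoding every image just to measure text length.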
Provider:
maas
Created:
2025-06-28