coco 2017 text-image model

Name: coco 2017 text-image model
Creator: maas
Published: 2026-05-29 17:19:34
License: no description provided

ModelScope community. Updated: 2026-05-29. Indexed: 2024-05-15.

Download link:

https://modelscope.cn/datasets/zacbi2023/coco2017_caption


Resource description:

# coco2017

Image-text pairs from [MS COCO2017](https://cocodataset.org/#download).

## Data origin

* Data originates from [cocodataset.org](http://images.cocodataset.org/annotations/annotations_trainval2017.zip)
* While `coco-karpathy` uses a dense format (with several sentences and sendids per row), `coco-karpathy-long` uses a long format with one `sentence` (aka caption) and `sendid` per row. `coco-karpathy-long` uses the first five sentences and therefore is five times as long as `coco-karpathy`.
* `phiyodr/coco2017`: One row corresponds to one image with several sentences.
* `phiyodr/coco2017-long`: One row corresponds to one sentence (aka caption). There are 5 rows (sometimes more) with the same image details.

## Format

```python
DatasetDict({
    train: Dataset({
        features: ['license', 'file_name', 'coco_url', 'height', 'width', 'date_captured', 'flickr_url', 'image_id', 'ids', 'captions'],
        num_rows: 118287
    })
    validation: Dataset({
        features: ['license', 'file_name', 'coco_url', 'height', 'width', 'date_captured', 'flickr_url', 'image_id', 'ids', 'captions'],
        num_rows: 5000
    })
})
```

## Usage

* Download image data and unzip

```bash
cd PATH_TO_IMAGE_FOLDER

wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
#wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip # zip not needed: everything you need is in load_dataset("phiyodr/coco2017")

unzip train2017.zip
unzip val2017.zip
```

* Load dataset in Python

```python
import os
from datasets import load_dataset

# Folder that holds the unzipped COCO2017 images
PATH_TO_IMAGE_FOLDER = "COCO2017"

def create_full_path(example):
    """Create full path to image using `PATH_TO_IMAGE_FOLDER`."""
    example["image_path"] = os.path.join(PATH_TO_IMAGE_FOLDER, example["file_name"])
    return example

dataset = load_dataset("phiyodr/coco2017")
# Add an `image_path` column to every row in every split
dataset = dataset.map(create_full_path)
```
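The difference between the wide layout (`phiyodr/coco2017`, one row per image) and the long layout (`phiyodr/coco2017-long`, one row per caption) can be sketched in plain Python. This is only an illustration, not the procedure used to build either dataset. The `wide_to_long` helper, the output keys `id` and `caption`, and the sample row are hypothetical, and the sketch assumes `ids` and `captions` are parallel lists, as the feature names suggest.

```python
from typing import Any, Dict, Iterator, List


def wide_to_long(row: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one record per caption, repeating the image-level fields."""
    # Everything except the two list-valued columns describes the image itself.
    image_fields = {k: v for k, v in row.items() if k not in ("ids", "captions")}
    # Assumption: ids[i] is the identifier of captions[i].
    for caption_id, caption in zip(row["ids"], row["captions"]):
        yield {**image_fields, "id": caption_id, "caption": caption}


# Hypothetical sample row; the values are made up for illustration.
sample_row = {
    "file_name": "000000000001.jpg",
    "image_id": 1,
    "height": 480,
    "width": 640,
    "ids": [101, 102],
    "captions": ["A cat on a sofa.", "A sleeping cat."],
}

long_rows: List[Dict[str, Any]] = list(wide_to_long(sample_row))
for r in long_rows:
    print(r["image_id"], r["id"], r["caption"])
```

Each output record repeats the image details, which matches the description of the long variant as having several rows with the same image details.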


Provider:

maas

Created:

2024-04-04

Collected summary

Dataset introduction

Background and challenges

Background overview

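This dataset is based on MS COCO2017 and provides image-text pairs for image captioning. It contains a training split and a validation split with 118,287 and 5,000 samples respectively. Each sample includes the image file name, URL, dimensions, and caption annotations. To use it, download the image files and load the dataset.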
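Because the rows carry only metadata and captions while the images are downloaded separately, it can help to confirm that each row's image is actually on disk before training. The following is a minimal sketch, assuming rows are plain dicts with a `file_name` field. The `split_by_availability` helper and the example file names are hypothetical and are not part of the dataset or of the `datasets` library.

```python
import os
import tempfile
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def split_by_availability(rows: Iterable[Row], image_folder: str) -> Tuple[List[Row], List[Row]]:
    """Split rows into those whose image file exists locally and those without."""
    present: List[Row] = []
    missing: List[Row] = []
    for row in rows:
        path = os.path.join(image_folder, row["file_name"])
        target = present if os.path.isfile(path) else missing
        # Copy the row and attach the resolved path, leaving the input untouched.
        target.append({**row, "image_path": path})
    return present, missing


# Demonstration with a temporary folder standing in for the image directory.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "a.jpg"), "wb").close()  # empty placeholder file
    rows = [{"file_name": "a.jpg"}, {"file_name": "b.jpg"}]
    present, missing = split_by_availability(rows, folder)
    print(len(present), "present,", len(missing), "missing")
```

With real data, the same check could be run over each split after loading, with the folder of unzipped images passed as `image_folder`.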

The content above was collected and summarized by 遇见数据集.