OPA_composite

Name: OPA_composite
Creator: maas
Published: 2025-12-05 16:42:51
License: No description available

ModelScope community · Updated 2025-12-05 · Indexed 2025-07-26

Download link:

https://modelscope.cn/datasets/gokaygokay/OPA_composite


Description:

## Dataset Card for new_OPA_dataset

### Dataset Summary

The **Object-Placement-Assessment (OPA)** dataset is a synthesized dataset designed for evaluating the rationality of object placement in composite images. It is based on the COCO dataset, featuring foreground objects pasted onto compatible background images with varying sizes and locations. Each composite image is annotated with a binary label (0 or 1) indicating whether the object placement is reasonable, considering factors like location, size, occlusion, semantics, and perspective.

This dataset contains 62,074 training images and 11,396 test images, with no overlap in foregrounds or backgrounds between the splits. It is suited to tasks such as object placement prediction, image composition, and visual common sense reasoning.

### Supported Tasks and Leaderboards

- **Task**: Object Placement Assessment
- **Description**: Predict whether a composite image’s object placement is rational (label 1) or irrational (label 0) based on visual and semantic cues.
- **Leaderboard**: No public leaderboard is currently available.

### Languages

- The dataset includes image data with English metadata (e.g., category names, file names).

## Dataset Structure

### Data Instances

Each instance represents a composite image with its associated mask and metadata. An example from the train split looks like:

```python
{
    'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=640x480>,
    'mask': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=640x480>,
    'fg_id': '4535',
    'bg_id': '367260',
    'position': '[482, 281, 151, 126]',
    'scale': 0.26269999146461487,
    'label': 0,
    'image_filename': '4535_367260_482_281_151_126_0.2627_0.jpg',
    'mask_filename': 'mask_4535_367260_482_281_151_126_0.2627_0.jpg'
}
```

### Data Fields

- `image`: The composite image (JPEG, typically 640x480 pixels).
- `mask`: The mask of the foreground object in the composite image (JPEG, same size as the image).
- `fg_id`: Identifier for the foreground object (string).
- `bg_id`: Identifier for the background image (string).
- `position`: Bounding box of the foreground object in the format `[x, y, w, h]`, where `x`, `y` is the upper-left corner, and `w`, `h` are width and height (string).
- `scale`: The maximum ratio of foreground width/height to background width/height (float32).
- `label`: Binary label indicating placement rationality (0 for irrational, 1 for rational, int32).
- `image_filename`: File name of the composite image (string).
- `mask_filename`: File name of the mask image (string).

### Data Splits

- Train: 62,074 images (21,376 positive, 40,698 negative samples).
- Test: 11,396 images (3,588 positive, 7,808 negative samples).

There is no overlap in foregrounds (2,701 unique in train, 1,436 in test) or backgrounds (1,236 unique in train, 153 in test) between splits.

## Dataset Creation

### Curation Rationale

The OPA dataset was created to support research in object placement assessment, addressing challenges in image composition by evaluating the plausibility of foreground object placements. It was synthesized from COCO by selecting unoccluded objects, pasting them onto compatible backgrounds, and annotating rationality via human labeling.

### Source Data

- **Initial Data Collection**: Derived from the COCO dataset, with foreground objects and backgrounds curated for compatibility.
- **Annotation Process**: Composite images were generated with random sizes and locations, then labeled by human annotators for rationality.

### Annotations

- **Annotation Types**: Binary rationality labels (0 or 1), bounding box positions, and scale values.
- **Process**: Human annotators assessed each composite image for placement plausibility based on location, size, occlusion, semantics, and perspective.

## Considerations for Using the Data

### Social Impact of Dataset

This dataset advances research in image composition and visual reasoning, with applications in augmented reality, content creation, and automated design. However, biases in the COCO dataset (e.g., underrepresentation of certain scenes or objects) may carry over.

### Discussion of Biases

As a derivative of COCO, the dataset may inherit biases related to object categories, scene diversity, or cultural contexts. Users should evaluate its suitability for specific applications.

### Other Known Limitations

- The dataset focuses on binary rationality labels, which may not capture nuanced placement quality.
- Negative samples include specific issues (e.g., inappropriate size, occlusion), but the dataset does not categorize these issues explicitly.

## Additional Information

### Dataset Curators

The OPA dataset was created by Liu Liu, Zhenchen Liu, Bo Zhang, Jiangtong Li, Li Niu, Qingyang Liu, and Liqing Zhang from the Brain Cognition & Machine Intelligence Lab (BCMI).

### Licensing Information

The dataset is released for research purposes. As a derivative of COCO, it inherits COCO’s licensing terms (CC BY 4.0). Users should verify compliance with COCO’s license.

### Citation Information

If you use this dataset, please cite the original OPA paper:

```
@article{liu2021OPA,
  title={OPA: Object Placement Assessment Dataset},
  author={Liu, Liu and Liu, Zhenchen and Zhang, Bo and Li, Jiangtong and Niu, Li and Liu, Qingyang and Zhang, Liqing},
  journal={arXiv preprint arXiv:2107.01889},
  year={2021}
}
```
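Because `position` is stored as a string and the file names appear to repeat the same metadata, a small decoding step is usually needed before the records can be used. The following is a minimal sketch, not part of the official OPA tooling. The helper names (`parse_position`, `parse_image_filename`, `positive_fraction`) are hypothetical, and the file-name layout `<fg_id>_<bg_id>_<x>_<y>_<w>_<h>_<scale>_<label>.jpg` is an assumption inferred from the single example instance in this card.

```python
import json


def parse_position(position: str) -> tuple[int, int, int, int]:
    """Decode the string-typed `position` field into (x, y, w, h) integers."""
    x, y, w, h = json.loads(position)
    return int(x), int(y), int(w), int(h)


def parse_image_filename(name: str) -> dict:
    """Split a composite image file name into its metadata fields.

    ASSUMPTION: the layout is
    <fg_id>_<bg_id>_<x>_<y>_<w>_<h>_<scale>_<label>.jpg,
    inferred from the one example shown in the card.
    """
    stem = name.rsplit(".", 1)[0]  # drop the extension, keep the dot in the scale
    fg_id, bg_id, x, y, w, h, scale, label = stem.split("_")
    return {
        "fg_id": fg_id,
        "bg_id": bg_id,
        "position": (int(x), int(y), int(w), int(h)),
        "scale": float(scale),
        "label": int(label),
    }


def positive_fraction(positives: int, negatives: int) -> float:
    """Share of rational (label 1) samples in a split."""
    return positives / (positives + negatives)


# Metadata of the example instance from the card (image and mask omitted).
record = {
    "fg_id": "4535",
    "bg_id": "367260",
    "position": "[482, 281, 151, 126]",
    "scale": 0.26269999146461487,
    "label": 0,
    "image_filename": "4535_367260_482_281_151_126_0.2627_0.jpg",
}

box = parse_position(record["position"])
parsed = parse_image_filename(record["image_filename"])

# Split sizes taken from the "Data Splits" section.
train_pos = positive_fraction(21_376, 40_698)
test_pos = positive_fraction(3_588, 7_808)
print(box, parsed["label"], round(train_pos, 3), round(test_pos, 3))
```

Both splits are roughly one-third positive, so a classifier trained on this data may benefit from class weighting or a metric other than plain accuracy. The file-name parser should be validated against the full dataset before it is relied upon, since the layout is inferred from one example.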


Provider: maas

Created: 2025-07-22
