PieAPP dataset

Name: PieAPP dataset
Creator: OpenDataLab
Published: 2026-05-17 07:30:10
License: No description available

OpenDataLab. Updated 2026-05-17. Indexed 2024-05-09.

Download link:

https://opendatalab.org.cn/OpenDataLab/PieAPP_dataset

Resource description:

The PieAPP dataset is a large-scale dataset for training and testing perceptually consistent image-error prediction algorithms. It can be downloaded either from a server hosting a single zip file with all the data (2.2 GB) or from Google Drive (suited to quick browsing).

The dataset contains undistorted, high-quality reference images and several distorted versions of each. Pairs of distorted images derived from the same reference image are labeled with a probability of preference: the proportion of the population that perceives one image in the pair as visually closer to the reference than the other. To obtain reliable pairwise probabilities, we queried 40 human subjects for each image pair via Amazon Mechanical Turk. The percentage of subjects who selected image A over image B is taken as the ground-truth label for the pair, that is, the probability that A is preferred over B (the supplementary document explains why 40 subjects were chosen to capture accurate probabilities).

This approach is more robust because identifying the visually closer image is easier than assigning a quality score. It also avoids the set-dependence and scalability issues of schemes such as Swiss tournaments, since we never use per-image quality scores to label images (see the paper and the supplementary document for the issues with such existing labeling schemes). The pairwise learning framework discussed in the paper can be used to train image-error predictors on the PieAPP dataset.

Dataset statistics

The dataset is made available for non-commercial and educational purposes only. It contains a total of 200 undistorted reference images, divided into training, validation, and test splits. These reference images come from the Waterloo Exploration Dataset. We release the subset of 200 reference images used in PieAPP with permission from the Waterloo Exploration Dataset authors for non-commercial, educational use. Users of the PieAPP dataset are asked to cite the Waterloo Exploration Dataset for the reference images, along with the PieAPP dataset, as described herein.

The training and validation sets together contain 160 reference images, and the test set contains 40. A total of 19,680 distorted images were generated for the training and validation sets, with pairwise preference probabilities provided for 77,280 image pairs (obtained by querying 40 human subjects per pairwise comparison, plus maximum likelihood estimation for some missing pairs). For the test set, 15 distorted images were created per reference (600 distorted images in total), and all possible pairwise comparisons (4,200 in total) were performed, so that every test pair is labeled with a preference probability based on the votes of 40 human subjects.

Overall, the PieAPP dataset provides 20,280 distorted images derived from 200 reference images, along with 81,480 pairwise preference probability labels. Further details on dataset collection can be found in Section 4 of the paper and in the supplementary document.
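The labeling rule and the counts quoted above can be checked with a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `preference_probability` and the example vote count of 31 are hypothetical, and the only figures taken from the description are the 40 subjects per pair and the dataset totals.

```python
from math import comb


def preference_probability(votes_for_a: int, num_subjects: int = 40) -> float:
    """Fraction of subjects who judged image A closer to the reference than image B."""
    if not 0 <= votes_for_a <= num_subjects:
        raise ValueError("votes_for_a must lie between 0 and num_subjects")
    return votes_for_a / num_subjects


# Hypothetical example: 31 of the 40 subjects picked image A.
p_a_over_b = preference_probability(31)
p_b_over_a = 1.0 - p_a_over_b  # the two preferences are complementary

# Consistency check of the stated statistics.
test_refs = 40
distorted_per_test_ref = 15
test_distorted = test_refs * distorted_per_test_ref        # distorted test images
test_pairs = test_refs * comb(distorted_per_test_ref, 2)   # all pairs within each reference

total_distorted = 19_680 + test_distorted  # train/val + test distorted images
total_pairs = 77_280 + test_pairs          # train/val + test labeled pairs

print(p_a_over_b, test_distorted, test_pairs, total_distorted, total_pairs)
```

The test-set figure follows from comparing every pair among the 15 distorted images of a single reference, which gives 105 pairs per reference and 4,200 pairs over 40 references.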

Provided by:

OpenDataLab

Created:

2022-05-23
