UCF-QNRF Large-Scale Crowd Counting Dataset, for Training and Evaluating Counting Models for Large, Dense Crowds

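Name: UCF-QNRF Large-Scale Crowd Counting Dataset, for training and evaluating counting models for large, dense crowds
Creator: 帕依提提 (Payititi)
License: No description available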

Indexed by 帕依提提 (Payititi) on 2024-03-04

Download link:

https://www.payititi.com/opendatasets/show-26486.html

Resource overview:

UCF-QNRF was released by the University of Central Florida in 2018. It contains 1535 crowd images, split into 1201 training images and 334 test images. In terms of the number of annotations, UCF-QNRF was the largest crowd counting dataset at the time of its release, and it can be used to train and evaluate counting models for large, dense crowds. Compared with similar datasets, it contains a large number of annotated people across diverse scenes, viewpoints, lighting conditions, and crowd densities, which makes it well suited to training deep convolutional neural networks. The images are high resolution, with an average resolution of 2013×2902 pixels. They also include the buildings, vegetation, sky, and roads of real outdoor scenes from around the world, which is valuable for studying crowd density in different regions.

Overview:

From a socio-political and security perspective, automatic counting and localization in dense crowd scenes is of great importance. Crowds gather in a wide variety of settings around the world, and estimating the number of participants is often a key concern for organizers and law enforcement agencies.

Figure 1: Six images from the dataset

We introduce the largest dataset to date (in terms of the number of annotations) for training and evaluating crowd counting and localization methods. It contains 1535 images, split into training and test sets of 1201 and 334 images, respectively. Our dataset is well suited to training very deep convolutional neural networks (CNNs), since it contains an order of magnitude more annotated humans in dense crowd scenes than any other available crowd counting dataset. The dataset statistics and a comparison with other datasets are summarized in Table 1, while Figure 1 shows six images selected at random from our dataset.

The UCF-QNRF dataset has the largest number of high-count crowd images and annotations, and it covers a wider range of scenes with the most diverse set of viewpoints, densities, and lighting variations. Its image resolution is higher than that of WorldExpo'10 and ShanghaiTech. Its average density, i.e., the number of people per pixel over all images, is also the lowest, which reflects large, high-quality images. The lower per-pixel density is partly due to the inclusion of background regions, so that images contain zero-density regions alongside many high-density regions. Part A of the ShanghaiTech dataset also has high-count crowd images, but they are severely cropped so that they contain only the crowd. In contrast, the new UCF-QNRF dataset includes buildings, vegetation, sky, and roads as they appear in realistic scenes captured in the wild, which makes the dataset both more realistic and more difficult.

Furthermore, because we collected the dataset from the web rather than from surveillance camera footage or simulated crowd scenes, it is highly diverse in terms of viewpoint, image resolution, crowd density, and the scenarios in which crowds occur. We also took special care to ensure that the images come from all parts of the world. Figure 2 shows the geotags of the dataset images marked on a world map.

Figure 2: Locations of images in the dataset

Similarly, Figure 3(a) shows the diversity of crowd counts in the dataset. The distribution is similar to that of UCF_CC_50, but the new dataset has about 30 times as many images and 20 times as many annotations as UCF_CC_50. In addition, as Figure 3(b) shows, the resolution is higher than that of WorldExpo'10 and ShanghaiTech. We hope the new dataset will significantly increase research activity in visual crowd analysis and pave the way for building deployable, practical crowd counting and localization systems.

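The per-pixel density mentioned above can be illustrated with a small Python sketch. It assumes one plausible reading of "average density", namely the mean over images of annotation count divided by pixel area; the description does not state the exact formula. The function names and the example numbers are hypothetical and are not taken from the dataset.

```python
def per_pixel_density(count: int, height: int, width: int) -> float:
    """People per pixel for one image: annotation count divided by pixel area."""
    if height <= 0 or width <= 0:
        raise ValueError("image dimensions must be positive")
    return count / (height * width)


def mean_density(images: list[tuple[int, int, int]]) -> float:
    """Mean of per-image densities over (count, height, width) triples.

    Assumption: the dataset-level figure is an average of per-image values.
    """
    if not images:
        raise ValueError("need at least one image")
    return sum(per_pixel_density(c, h, w) for c, h, w in images) / len(images)


# Hypothetical values: the same 1000-person crowd, once tightly cropped and
# once in a larger frame that keeps background (sky, buildings, roads).
cropped = per_pixel_density(count=1000, height=500, width=800)
uncropped = per_pixel_density(count=1000, height=2000, width=3000)

print(f"cropped:   {cropped:.6f} people/pixel")
print(f"uncropped: {uncropped:.6f} people/pixel")
```

With the count held fixed, the larger frame gives a lower density, which is the effect the text attributes to keeping background regions in the images.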
If you happen to use this dataset, please cite the following paper: H. Idrees, M. Tayyab, K. Athrey, D. Zhang, S. Al-Maddeed, N. Rajpoot, M. Shah, Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds, in Proceedings of IEEE European Conference on Computer Vision (ECCV 2018), Munich, Germany, September 8-14, 2018.

Provided by:

帕依提提 (Payititi)
