
WILDTRACK: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection

Indexed by Payititi (帕依提提) on 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-26514.html
Resource description:
The challenging and realistic setup of the WILDTRACK dataset brings multi-camera detection and tracking methods into the wild. It meets the need of deep learning methods for a large-scale multi-camera dataset of walking pedestrians in which the cameras' fields of view overlap in large part. Acquired with current high-end hardware, it provides HD-resolution data. Further, its high-precision joint calibration and synchronization should allow for the development of new algorithms that go beyond what is possible with currently available datasets.

The data acquisition took place in front of the main building of ETH Zurich, Switzerland, during good weather conditions. The sequences have a resolution of 1920×1080 pixels and were shot at 60 frames per second.

The release contains:
- Synchronized frames extracted at a frame rate of 10 fps, at 1920×1080 resolution, post-processed to remove the distortion;
- Calibration files that use the pinhole camera model, compatible with the projection functions provided in the OpenCV library. Both the extrinsic and the intrinsic calibrations are available;
- The ground-truth annotations in JSON file format (see the description further below);
- For ease of use by methods focusing on classification, a file we refer to as the "positions" file, also in JSON file format (see the description further below).

Please check this site for updates, which shall extend the download list with:
- Corresponding-point annotations, which may be used for camera calibration algorithms;
- A second part of this dataset which, although not annotated, can be used for unsupervised methods.
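
As a rough illustration of what the pinhole calibration files describe, the sketch below projects a 3D world point into pixel coordinates using an intrinsic matrix `K` and extrinsics `R`, `t`. It is written in plain Python rather than OpenCV so that it is self-contained; OpenCV's `cv2.projectPoints` computes the same mapping when given zero distortion coefficients, which matches the already-undistorted frames. The function names and all numeric values here are illustrative assumptions, not values from the WILDTRACK calibration files.

```python
# Minimal pinhole projection sketch: x ~ K (R X + t), with no distortion terms.
# All numbers below are made-up placeholders, NOT WILDTRACK calibration data.

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def project_point(point_world, K, R, t):
    """Project a 3D world point to pixel coordinates (u, v).

    Returns None when the point lies on or behind the camera plane.
    """
    rotated = mat_vec(R, point_world)
    cam = [rotated[i] + t[i] for i in range(3)]  # world -> camera frame
    if cam[2] <= 0:
        return None
    homog = mat_vec(K, cam)                      # camera frame -> image plane
    return homog[0] / homog[2], homog[1] / homog[2]


# Hypothetical intrinsics: 1000 px focal length, principal point at the
# centre of a 1920x1080 image.
K = [[1000.0, 0.0, 960.0],
     [0.0, 1000.0, 540.0],
     [0.0, 0.0, 1.0]]
# Hypothetical extrinsics: camera at the world origin, axes aligned.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]

uv = project_point([1.0, 0.5, 10.0], K, R, t)
print(uv)
```

With real data, `K`, `R` and `t` would instead be read from the intrinsic and extrinsic calibration files of each of the seven views.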
The "positions" file makes it possible to skip working with the calibration files and to focus, for instance, on classification, while making use of the fact that the cameras are static. It records where exactly a given set of volumes of space projects to in each of the views. The height of each volume corresponds to the height of an average person. We discretize the ground surface as a regular grid. The 3D space occupied when a person stands at a particular position is modelled by a cylinder centred on the grid point. Each cylinder projects into each of the separate 2D views as a rectangle whose position in the view is given in pixel coordinates. Using a 480×1440 grid, totalling 691,200 positions, and the provided camera calibration files, we generated this file, which is available for download. Each position is assigned an ID using 0-based enumeration ([0, 691199]). The view numbers in this file follow the same enumeration, i.e. they range from 0 to 6 inclusive. Positions that are not visible in a given view are assigned coordinates of -1.

Full ground-truth annotations are provided for 400 frames at a frame rate of 2 fps. On average, there are 20 persons in each frame. Thus, our dataset provides approximately 400×20×7 = 56,000 single-view bounding boxes. The number of annotations can be further increased by interpolation. These annotations were generated by workers hired on Amazon Mechanical Turk. Note that the annotations roughly correspond to the coordinates in the positions file described above and thus include the ID of the annotated position that is estimated to be occupied by the specific target. These position IDs are consistent with the provided positions file.

This work was supported by the Swiss National Science Foundation under grant CRSII2-147693 "WILDTRACK".

Reference: T. Chavdarova; P. Baqué; A. Maksai; S. Bouquet; C. Jose et al. WILDTRACK: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection. Computer Vision and Pattern Recognition, 2018, 10.1109/CVPR.2018.00528.
URL: https://www.epfl.ch/labs/cvlab/data/data-wildtrack/
License: No license specified; the work may be protected by copyright.
Bibtex: (not provided)
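
The position-ID scheme and the bounding-box estimate from the description can be sketched as follows. The grid size, the ID range, the seven views and the -1 convention come from the text; the ordering of IDs within the grid (running along the 480-cell dimension first) and the `(x_min, y_min, x_max, y_max)` rectangle layout are assumptions made only for illustration, so check them against the actual positions file before relying on them.

```python
# Sketch of the position-ID bookkeeping described in the text.
GRID_WIDTH = 480     # first grid dimension (from the description)
GRID_HEIGHT = 1440   # second grid dimension (from the description)
NUM_POSITIONS = GRID_WIDTH * GRID_HEIGHT  # IDs are 0 .. NUM_POSITIONS - 1
NUM_VIEWS = 7        # views are numbered 0 .. 6


def position_id_to_cell(position_id):
    """Map a 0-based position ID to grid indices (ix, iy).

    ASSUMPTION: IDs advance along the 480-cell dimension first.
    """
    if not 0 <= position_id < NUM_POSITIONS:
        raise ValueError(f"position ID out of range: {position_id}")
    return position_id % GRID_WIDTH, position_id // GRID_WIDTH


def cell_to_position_id(ix, iy):
    """Inverse of position_id_to_cell under the same assumed ordering."""
    if not (0 <= ix < GRID_WIDTH and 0 <= iy < GRID_HEIGHT):
        raise ValueError(f"cell out of range: ({ix}, {iy})")
    return iy * GRID_WIDTH + ix


def is_visible(rect):
    """Return True if a projected rectangle is visible in its view.

    `rect` is a hypothetical (x_min, y_min, x_max, y_max) tuple; the
    description states that non-visible positions get coordinates of -1.
    """
    return all(coord != -1 for coord in rect)


# Approximate number of single-view boxes: frames x persons per frame x views.
approx_boxes = 400 * 20 * NUM_VIEWS

print(NUM_POSITIONS, position_id_to_cell(NUM_POSITIONS - 1), approx_boxes)
```

Keeping the grid dimensions as named constants makes it easy to swap the assumed ordering if the real file turns out to enumerate the grid the other way round.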

Provider:
Payititi (帕依提提)
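Background and challenges
Background overview
WILDTRACK is a multi-camera HD dataset designed for dense unscripted pedestrian detection, providing high-resolution video sequences and detailed ground-truth annotations. The data was collected in front of the main building of ETH Zurich, Switzerland, and is suited to developing and testing multi-camera detection and tracking algorithms.
The above summary was collected and generated by 遇见数据集.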