DroneVehicle/FILR/KAIST/CVC-14
Source: DataCite Commons. Updated: 2026-01-27. Indexed: 2026-05-03.
Download link:
https://ieee-dataport.org/documents/dronevehiclefilrkaistcvc-14
Description:
Four datasets: DroneVehicle, FLIR, CVC-14, and KAIST.
\begin{enumerate}
\item \textit{DroneVehicle}: This dataset is a large-scale object detection benchmark of paired infrared and visible imagery captured by drones. It contains 28,439 RGB-IR image pairs at a resolution of $840 \times 712$. Each modality has its own set of oriented bounding box annotations covering five vehicle types: car, bus, truck, van, and freight car. To standardize the experimental protocol, we process the annotations following DPDETR~\cite{DPDETR}. After preprocessing, the dataset comprises 17,990 image pairs for training and 1,469 pairs for testing. In each image pair, the visible and infrared labels maintain a strict one-to-one correspondence.
\item \textit{KAIST}: The original KAIST Multispectral Pedestrian Dataset consists of 95,328 infrared-visible pairs at a resolution of $640 \times 512$, captured in campus and street environments. Because the original dataset has annotation issues, we use the processed version, which contains 8,593 training pairs with corrected infrared-visible annotations \cite{zhang2019weakly} and 2,252 testing pairs with improved infrared annotations \cite{liu2016multispectral}, following established protocols \cite{C2former,DPDETR,DeformCAT}. Throughout the test set, each annotation includes illumination, scale, and occlusion information.
% The test set consists of three main subsets: Reasonable (All, Day, and Night), Scale (Near, Medium, and Far), and Occlusion (None, Partial, and Heavy).
\item \textit{CVC-14}: This pedestrian dataset is built from multispectral video sequences captured at 10 frames per second by a vehicle driving through street environments, at an image resolution of $640 \times 470$. The dataset is officially divided into a training set of 7,085 frames and a test set of 1,433 frames. It comprises ``Day'' and ``Night'' subsets, each recorded in both the visible and infrared modalities. Unlike KAIST, the annotations in CVC-14 do not include occlusion information. Following DeformCAT \cite{DeformCAT}, we exclude images lacking valid annotations across both modalities. Consequently, the final curated dataset comprises 1,497 image pairs for training and 1,304 for testing. To obtain paired infrared-visible labels, we keep the visible labels unchanged and derive the corresponding infrared labels using the same processing as for the DroneVehicle dataset.
% A characteristic of CVC-14 is the severe spatial misalignment between the two modalities.
\item \textit{FLIR}: The original FLIR ADAS dataset provides 14,452 RGBT image pairs captured from an on-road driving perspective. It includes annotations for three main object classes: person, car, and bicycle. We use the aligned version \cite{zhang2020multispectral} at a resolution of $640 \times 512$, which consists of 4,129 pairs for training and 1,013 pairs for testing. In this version, most image pairs are well aligned, and a small number remain misaligned. Notably, only the infrared images in this dataset are annotated.
% Furthermore, a detailed analysis of the misalignment in the datasets is provided in the Appendix.
\end{enumerate}
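The CVC-14 curation step described above (excluding images that lack valid annotations across both modalities) can be illustrated with a minimal sketch. It is not the DeformCAT implementation. The record layout (`id`, `visible_boxes`, `infrared_boxes`), the `(x, y, w, h)` box format, and the definition of "valid" as a box with positive width and height are all assumptions made for illustration, as is the reading that a pair is kept only when both modalities have at least one valid box.

```python
from typing import Dict, List, Sequence

# Hypothetical box layout: (x, y, width, height). The real annotation
# format of the curated dataset is not specified in the description.
Box = Sequence[float]


def has_valid_box(boxes: List[Box]) -> bool:
    """Return True if at least one box has positive width and height.

    Assumption: "valid" means a well-formed, non-degenerate box.
    """
    return any(len(b) == 4 and b[2] > 0 and b[3] > 0 for b in boxes)


def filter_valid_pairs(pairs: List[Dict]) -> List[Dict]:
    """Keep only pairs with a valid annotation in BOTH modalities.

    Assumption: each record is a dict with hypothetical keys
    'visible_boxes' and 'infrared_boxes'.
    """
    return [
        p for p in pairs
        if has_valid_box(p.get("visible_boxes", []))
        and has_valid_box(p.get("infrared_boxes", []))
    ]


# Toy records, invented for illustration only.
sample_pairs = [
    {"id": "a", "visible_boxes": [(10, 20, 30, 60)], "infrared_boxes": [(12, 21, 30, 60)]},
    {"id": "b", "visible_boxes": [(10, 20, 30, 60)], "infrared_boxes": []},
    {"id": "c", "visible_boxes": [(5, 5, 0, 40)], "infrared_boxes": [(5, 5, 20, 40)]},
]

kept = filter_valid_pairs(sample_pairs)
print([p["id"] for p in kept])
```

In this toy input, pair `b` is dropped because it has no infrared box, and pair `c` is dropped because its only visible box has zero width.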
Provider:
IEEE DataPort
Created:
2026-01-27



