Non-Spatially Aligned LLVIP, M3FD, and FLIR Datasets

Name: Non-Spatially Aligned LLVIP, M3FD, and FLIR Datasets
Creator: Hohai University
Published: 2025-08-30 00:00:00
License: Not specified

Source: Science Data Bank. Updated 2025-08-30. Indexed 2026-04-23.

Download link:

https://www.scidb.cn/detail?dataSetId=22baf66dc9eb4bd8ace41ea403781c24

Description:

This dataset simulates the inherent differences in resolution and field of view between visible-light and infrared sensors in real scenes. It constructs bimodal image pairs from the original registered multimodal image datasets LLVIP, M3FD, and FLIR. The release contains the non-spatially-registered datasets together with code for a multimodal object detection decision-fusion strategy for non-spatially-registered data. The source images were captured with synchronized visible-light and infrared imaging equipment and cover a range of indoor and outdoor scenes under both daytime and nighttime lighting; the spatial resolution depends on the original sensor configuration. The visible-light image keeps its original resolution, while the infrared image is adaptively cropped to create a resolution difference: the cropping ratio is kept within 15%-30% of the original image, so the final scale is about 70%-85% of the original. Cropping is applied symmetrically around the infrared image so that the distribution of centrally located targets is not disrupted. To keep the annotations valid in the cropped image, all original bounding-box coordinates are shifted by the cropping offset, and annotations that fall beyond the image boundary are removed.

The dataset contains image pairs and corresponding annotation files for multiple scenarios. Each image file is in a standard RGB or grayscale image format (JPEG/PNG). Each annotation file is in XML format and records the category, cropped coordinates, and internal position of each target in the image. Each XML file corresponds to one image and contains several nodes; each node records information such as the target number, category, and top-left and bottom-right coordinates, measured in pixels.
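The symmetric cropping and coordinate adjustment described above can be illustrated with a minimal Python sketch. It is not the dataset authors' code. The function names are hypothetical, and two points are assumptions: that the sampled ratio is the total fraction removed from each dimension, split evenly between opposite sides, and that a box is dropped as soon as any part of it lies outside the cropped image.

```python
import random


def symmetric_crop_window(width, height, ratio):
    """Return (left, top, new_w, new_h) for a centered crop.

    Assumption: `ratio` is the total fraction removed from each
    dimension, split evenly between the two opposite sides.
    """
    dx = int(round(width * ratio / 2))   # pixels trimmed from left and from right
    dy = int(round(height * ratio / 2))  # pixels trimmed from top and from bottom
    return dx, dy, width - 2 * dx, height - 2 * dy


def shift_boxes(boxes, left, top, new_w, new_h):
    """Shift (xmin, ymin, xmax, ymax) boxes by the crop offset.

    Assumption: a box is removed if any part of it falls outside
    the cropped image; the source does not state the exact rule.
    """
    kept = []
    for xmin, ymin, xmax, ymax in boxes:
        x0, y0 = xmin - left, ymin - top
        x1, y1 = xmax - left, ymax - top
        if x0 >= 0 and y0 >= 0 and x1 <= new_w and y1 <= new_h:
            kept.append((x0, y0, x1, y1))
    return kept


# The crop ratio is drawn from the 15%-30% range given in the description.
ratio = random.uniform(0.15, 0.30)
left, top, new_w, new_h = symmetric_crop_window(640, 512, ratio)
print(ratio, (left, top, new_w, new_h))
```

With a fixed ratio of 0.20 on a 640x512 infrared image, this sketch trims 64 pixels from each side horizontally and 51 pixels vertically, which is where the rounding error mentioned in the description would enter.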
During processing, some targets are removed because they fall outside the cropped boundary. The resulting missing targets are an inherent characteristic of the dataset design. Because cropping and coordinate adjustment both operate at the pixel level, errors come mainly from the random selection of the crop amount and from rounding. The file structure is straightforward: each data pair consists of a visible-light image, an infrared image, and the corresponding XML annotation files. File names are consistent across each pair, so the files can be matched and processed automatically. The XML annotation files can be viewed and parsed with the LabelImg image annotation tool.
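As a rough illustration of reading one annotation file, the sketch below parses an XML string with Python's standard library. The tag names (`object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`) are an assumption based on the Pascal VOC layout that LabelImg writes; the description does not list the actual tags, so check a real file before relying on this. The sample document and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the Pascal VOC layout (an assumption, not taken from the dataset).
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <object>
    <name>person</name>
    <bndbox>
      <xmin>36</xmin><ymin>49</ymin><xmax>136</xmax><ymax>149</ymax>
    </bndbox>
  </object>
</annotation>
"""


def parse_voc_objects(xml_text):
    """Return one dict per target with its category and pixel coordinates."""
    root = ET.fromstring(xml_text)
    targets = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        if box is None:
            continue  # skip nodes that carry no bounding box
        targets.append({
            "category": obj.findtext("name", default=""),
            "xmin": int(float(box.findtext("xmin"))),
            "ymin": int(float(box.findtext("ymin"))),
            "xmax": int(float(box.findtext("xmax"))),
            "ymax": int(float(box.findtext("ymax"))),
        })
    return targets


for target in parse_voc_objects(SAMPLE_XML):
    print(target)
```

To read a file from disk, pass its contents to the same function, for example `parse_voc_objects(open(path, encoding="utf-8").read())`, where `path` is whichever annotation file you are inspecting.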

Provided by:

Hohai University

Created:

2025-08-30
