BRAGAN: a GAN-augmented dataset of Brazilian roadkill animals for object detection

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://data.mendeley.com/datasets/ck88dwffgd
Description:
BRAGAN is a new dataset of Brazilian wildlife developed for object detection tasks, combining real images with synthetic samples generated by Generative Adversarial Networks (GANs). It focuses on five medium- and large-sized mammal species frequently involved in roadkill incidents on Brazilian highways: lowland tapir (Tapirus terrestris), jaguarundi (Herpailurus yagouaroundi), maned wolf (Chrysocyon brachyurus), puma (Puma concolor), and giant anteater (Myrmecophaga tridactyla). Its primary goal is to provide a standardized and expanded resource for biodiversity conservation research, wildlife monitoring technologies, and computer vision applications, with an emphasis on automated wildlife detection.

The dataset builds upon the original BRA-Dataset by Ferrante et al. (2022), which was constructed from structured internet searches and manually curated with bounding box annotations. The BRA-Dataset was limited in size and variability; BRAGAN adds a stage of dataset expansion through GAN-based synthetic image generation, substantially improving both the quantity and diversity of samples.

In its final version, BRAGAN comprises approximately 9,238 images, divided into three main groups:

Real images: original photographs from the BRA-Dataset. Total: 1,823.
Classically augmented images: transformations applied to real samples, including rotations (RT), horizontal flips (HF), vertical flips (VF), horizontal shifts (HS), and vertical shifts (VS). Total: 7,300.
GAN-generated images: synthetic samples created using WGAN-GP models trained separately for each species on preprocessed subsets of the original data. All generated images underwent visual inspection to ensure morphological fidelity and proper framing before inclusion. Total: 115.

The dataset follows an organized directory structure with images/ and labels/ folders, each divided into train/ and val/ subsets, following an 80–20 split.
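
As a rough illustration of the layout just described, the sketch below counts the images in each subset, reports the training fraction, and lists images without a label file. It assumes that the .jpg files sit directly inside images/train and images/val and that each label file shares its image's base name; the listing confirms neither detail, and the function names are hypothetical. The group totals are the three figures quoted in the description.

```python
from pathlib import Path

# Group totals quoted in the description; they sum to 9,238.
GROUP_TOTALS = {"real": 1823, "classical": 7300, "gan": 115}


def count_split(root):
    """Count .jpg files in images/train and images/val under `root`.

    Assumption: images sit directly in each subset folder (no subfolders).
    """
    root = Path(root)
    return {
        subset: sum(1 for _ in (root / "images" / subset).glob("*.jpg"))
        for subset in ("train", "val")
    }


def train_fraction(counts):
    """Return the share of images in the train subset (0.0 if empty)."""
    total = counts["train"] + counts["val"]
    return counts["train"] / total if total else 0.0


def missing_labels(root, subset):
    """List image base names that have no labels/<subset>/<name>.txt.

    Assumption: a label file shares the base name of its image.
    """
    root = Path(root)
    label_dir = root / "labels" / subset
    return sorted(
        image.stem
        for image in (root / "images" / subset).glob("*.jpg")
        if not (label_dir / f"{image.stem}.txt").exists()
    )
```

For a local copy unpacked into a folder named, say, BRAGAN (a placeholder path), train_fraction(count_split("BRAGAN")) should come out close to 0.8 if the stated 80–20 split holds.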
Images are provided in .jpg format, while annotations follow the YOLO standard in .txt files (class_id x_center y_center width height, with normalized coordinates). The file naming convention explicitly encodes the species and the augmentation type for reproducibility.

Designed to be compatible with multiple object detection architectures, BRAGAN has been evaluated on YOLOv5, YOLOv8, and YOLOv11 (variants n, s, and m), so that the effect of dataset expansion can be assessed across different computational settings and performance requirements. By combining real data, classical augmentations, and high-quality synthetic samples, BRAGAN provides a valuable resource for wildlife detection, environmental monitoring, and conservation research, especially in contexts where image availability for rare or threatened species is limited.
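
To make the annotation format concrete, here is a minimal sketch that parses one YOLO label line (class_id x_center y_center width height, all coordinates normalized to the range 0 to 1) and converts it to pixel corner coordinates. The helper names are hypothetical, and the mapping from class_id to species is not given in this listing, so none is assumed. The two flip helpers show how a box would move under the HF and VF augmentations if labels are mirrored along with the image; that is an assumption about how the augmented labels were produced, not something the description states.

```python
from typing import NamedTuple


class YoloBox(NamedTuple):
    class_id: int
    x_center: float  # normalized to [0, 1] by image width
    y_center: float  # normalized to [0, 1] by image height
    width: float     # normalized by image width
    height: float    # normalized by image height


def parse_yolo_line(line):
    """Parse 'class_id x_center y_center width height' into a YoloBox."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    x, y, w, h = (float(value) for value in parts[1:])
    return YoloBox(int(parts[0]), x, y, w, h)


def to_pixel_corners(box, img_w, img_h):
    """Return (x_min, y_min, x_max, y_max) in pixels for a given image size."""
    x_min = (box.x_center - box.width / 2) * img_w
    y_min = (box.y_center - box.height / 2) * img_h
    x_max = (box.x_center + box.width / 2) * img_w
    y_max = (box.y_center + box.height / 2) * img_h
    return (x_min, y_min, x_max, y_max)


def hflip(box):
    """Mirror a box left-to-right: only x_center changes."""
    return box._replace(x_center=1.0 - box.x_center)


def vflip(box):
    """Mirror a box top-to-bottom: only y_center changes."""
    return box._replace(y_center=1.0 - box.y_center)
```

Because the coordinates are normalized, the same label applies to a resized copy of the image; only to_pixel_corners needs the actual image width and height.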

Created:
2025-08-20