Transfer learning with generative models for object detection on limited datasets
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13121949
Description:
The provided datasets are used for the analysis in the work "Transfer learning with generative models for object detection on limited datasets" (https://doi.org/10.1088/2632-2153/ad65b5). Data availability is limited in some fields, especially for object detection tasks, which require correctly labeled bounding boxes around each object. A notable example of such data scarcity is marine biology, where methods that automatically detect underwater species are useful for environmental monitoring. To address this limitation, state-of-the-art machine learning strategies follow two main approaches. The first pretrains models on existing datasets before generalizing to the specific domain of interest. The second creates synthetic datasets tailored to the target domain, using methods such as copy-paste techniques or ad hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help improve the performance of an object detector when only a few real images are available. This is achieved through a diffusion-based generative model pretrained on large generic datasets. In contrast to the state of the art, we find that it is not necessary to fine-tune the generative model on the specific domain of interest. We believe this is an important advance because it mitigates the labor-intensive task of manually labeling images for object detection. We validate our approach on fish in an underwater environment and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundred input images.
Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, ranging for instance from geophysics to biology and medicine.

The provided datasets are built with the help of Gligen and the existing NuImages, Ozfish and Deepfish datasets. The file "CarGenerated.zip" contains images generated with Gligen, with bounding boxes provided around cars in an urban environment. The file "fishes_on_bkg.zip" provides images of fish from Deepfish inpainted with Gligen onto generated backgrounds. The file "fish_text.zip" contains images generated entirely with Gligen, showing fish with annotated bounding boxes. Finally, the file "oz_masked_512.zip" contains a simpler dataset of copy-paste images of Deepfish fish on Ozfish backgrounds. Each archive contains the images, saved in separate folders for training and validation, plus an index file called gt_fish.csv for the bounding boxes.
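
As a rough illustration of how the gt_fish.csv index might be consumed, the following Python sketch groups bounding boxes by image. The description does not state the CSV layout, so the column names used here (filename, xmin, ymin, xmax, ymax) and the helper name load_boxes are assumptions made for the example; inspect the real header and pass the actual column names instead.

```python
import csv
import io
from collections import defaultdict


def load_boxes(csv_file, image_col="filename",
               box_cols=("xmin", "ymin", "xmax", "ymax")):
    """Group bounding boxes by image from an index CSV.

    The default column names are hypothetical; the real header of
    gt_fish.csv may differ, so check it and override these arguments.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in (image_col, *box_cols) if c not in header]
    if missing:
        # Fail loudly so a wrong column guess is noticed immediately.
        raise KeyError(f"columns {missing} not in CSV header {header}")

    boxes = defaultdict(list)
    for row in reader:
        # One row per box; an image with several objects spans several rows.
        boxes[row[image_col]].append(tuple(float(row[c]) for c in box_cols))
    return dict(boxes)


# Small in-memory stand-in for an index file (made-up values, not real data).
sample = io.StringIO(
    "filename,xmin,ymin,xmax,ymax\n"
    "a.png,10,20,110,220\n"
    "a.png,5,5,50,50\n"
    "b.png,0,0,512,512\n"
)
boxes = load_boxes(sample)
```

With a real archive, one would extract it first, open the gt_fish.csv found inside with `open(path, newline="")`, and pass that file object to load_boxes together with the column names read from its header.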
Created:
2024-07-29



