Synthetic dataset from - Bar et al., Sifting through the haystack - efficiently finding rare behaviors in large-scale datasets, WACV 2025
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14266406
Description:
This is a synthetic dataset emulating pose estimation data, introduced in the associated paper.
Briefly, each sample in the dataset is a sequence of 5 keypoints over 9 timesteps. The movement of each keypoint over time follows a sine wave with some amplitude A and some frequency f, loosely inspired by the swimming movement of larval zebrafish. There are two types of behavior: a common behavior (i.e., one that makes up the majority of the samples in the dataset), in which the frequency of the sine wave is larger than the amplitude, and a rare behavior, in which the amplitude is larger than the frequency. We vary the similarity between the rare and common behaviors by relaxing the standard deviation of the Gaussian from which we draw these movement parameters (behavior similarity, sd = [0.5, 1.5, 2.5, 5]). We also test different levels of data imbalance by varying how often the rare behavior occurs (rarity = [1.5%, 5%, 12%, 24%]).
This yields 16 datasets, one for each combination of the 4 behavior similarity levels and the 4 rarity levels.
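The description above can be illustrated with a minimal Python sketch. This is not the authors' generation code, which lives in the repository linked below. The parameter means (LOW_MEAN, HIGH_MEAN), the per-keypoint phase lag, the time axis, and the function name make_dataset are all hypothetical choices made only so the example runs; only the 5 x 9 sample shape, the sine-wave form, and the sd and rarity values come from the description.

```python
import itertools
import numpy as np

# Values taken from the dataset description.
SIMILARITY_SD = [0.5, 1.5, 2.5, 5]
RARITY = [0.015, 0.05, 0.12, 0.24]
N_KEYPOINTS, N_TIMESTEPS = 5, 9

# Hypothetical means for the Gaussians; the real values are not given here.
LOW_MEAN, HIGH_MEAN = 2.0, 6.0

# All 16 (sd, rarity) combinations.
GRID = list(itertools.product(SIMILARITY_SD, RARITY))


def make_dataset(n_samples, sd, rarity, seed=0):
    """Return (X, labels): X has shape (n_samples, 5, 9); label 1 marks rare."""
    rng = np.random.default_rng(seed)

    # Fix the number of rare samples, then shuffle their positions.
    labels = np.zeros(n_samples, dtype=int)
    labels[: round(n_samples * rarity)] = 1
    rng.shuffle(labels)

    # Draw one "low" and one "high" parameter per sample from Gaussians.
    low = rng.normal(LOW_MEAN, sd, size=n_samples)
    high = rng.normal(HIGH_MEAN, sd, size=n_samples)

    # Common: frequency drawn high, amplitude low. Rare: the reverse.
    amplitude = np.where(labels == 1, high, low)
    frequency = np.where(labels == 1, low, high)

    t = np.linspace(0.0, 1.0, N_TIMESTEPS)         # assumed time axis
    phase = np.linspace(0.0, np.pi, N_KEYPOINTS)   # assumed phase lag per keypoint

    # Broadcast to (n_samples, n_keypoints, n_timesteps).
    X = amplitude[:, None, None] * np.sin(
        2 * np.pi * frequency[:, None, None] * t[None, None, :]
        + phase[None, :, None]
    )
    return X, labels


if __name__ == "__main__":
    for sd, rarity in GRID:
        X, y = make_dataset(1000, sd, rarity)
        print(f"sd={sd}, rarity={rarity}: X{X.shape}, rare={y.sum()}")
```

In this sketch, a larger sd makes the two Gaussians overlap more, so the rare and common samples become harder to tell apart, which is the intended effect of the behavior similarity setting.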
The data generation code will become available in our code repository: https://github.com/shir3bar/SiftingTheHaystack
The data was used to create a controlled experimental sandbox in which we could test our pipeline for detecting rare behaviors. Sounds interesting? Read our paper and check out the code :)
Created:
2024-12-05



