ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision

Name: ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision
Creator: city.figshare.com
Published: 2023-05-31 00:00:00
License: No description provided

city.figshare.com | Updated: 2023-05-31 | Indexed: 2025-03-23

Download link:

https://city.figshare.com/articles/dataset/ORBIT_A_real-world_few-shot_dataset_for_teachable_object_recognition_collected_from_people_who_are_blind_or_low_vision/14294597/3

Description:

Object recognition still predominantly relies on many high-quality training examples per object category. In contrast, learning new objects from only a few examples could enable many impactful applications, from robotics to user personalization. Most few-shot learning research, however, has been driven by benchmark datasets that lack the high variation these applications will face when deployed in the real world. To close this gap, we present the ORBIT dataset, grounded in a real-world application of teachable object recognizers for people who are blind or low vision. We provide a full, unfiltered dataset of 4,733 videos of 588 objects recorded by 97 people who are blind or low vision on their mobile phones, and a benchmark dataset of 3,822 videos of 486 objects collected by 77 collectors. The code for loading the dataset, computing all benchmark metrics, and running the baseline models is available at https://github.com/microsoft/ORBIT-Dataset

This version comprises several zip files:

- train, validation, test: benchmark dataset, organised by collector, with raw videos split into static individual frames in jpg format at 30 FPS
- other: data not in the benchmark set, organised by collector, with raw videos split into static individual frames in jpg format at 30 FPS (please note that the train, validation, test, and other files together make up the unfiltered dataset)
- *_224: as for the benchmark, but static individual frames are scaled down to 224 pixels
- *_unfiltered_videos: full unfiltered dataset, organised by collector, in mp4 format
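
As a rough illustration of working with the frame archives once extracted, the sketch below counts jpg frames per collector and converts a frame count to seconds of footage at the stated 30 FPS. It is not the official loader (that lives in the GitHub repository above). It assumes only that each top-level folder inside an extracted split corresponds to one collector; the folder layout below that level, the function names, and the example path are hypothetical.

```python
from collections import Counter
from pathlib import Path

FPS = 30  # frame rate stated in the dataset description


def frames_per_collector(split_dir):
    """Count .jpg frames under each top-level folder of an extracted split.

    Assumption: each top-level folder corresponds to one collector.
    Nothing is assumed about the nesting below that level.
    """
    root = Path(split_dir)
    counts = Counter()
    for frame in root.rglob("*.jpg"):
        parts = frame.relative_to(root).parts
        if len(parts) < 2:
            continue  # skip stray files sitting directly in the split root
        counts[parts[0]] += 1  # first path component = collector folder
    return dict(counts)


def frames_to_seconds(n_frames, fps=FPS):
    """Convert a frame count to footage duration in seconds."""
    return n_frames / fps


if __name__ == "__main__":
    # Hypothetical path to an extracted split; adjust to the local layout.
    for collector, n in sorted(frames_per_collector("ORBIT/train").items()):
        print(f"{collector}: {n} frames, about {frames_to_seconds(n):.1f} s")
```

Counting by the first path component keeps the sketch independent of how objects and videos are nested inside each collector folder.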

Provider:

city.figshare.com