harvest_v1_yolo

Name: harvest_v1_yolo
Creator: tomato-store
Published: 2026-05-14 18:57:19
License: Apache-2.0

Source: Hugging Face. Updated 2026-05-14; indexed 2026-05-14.

Download link:

https://huggingface.co/datasets/tomato-store/harvest_v1_yolo

Summary:

This dataset was created with LeRobot and is a YOLO-annotated variant of the tomato-store/harvest_v1 dataset. Every video frame is overlaid with red bounding boxes produced by the tomato-store/tomato-yolo11m-2 model (confidence ≥ 0.5, drawn without labels or scores). All metadata, parquet data, episode timings, and the feature layout are identical to the original dataset; only the pixel content of the MP4 video chunks differs. The dataset is intended for policies that treat the YOLO overlay as part of the visual input, so that the images look the same during recording, training, and inference. It contains 171 episodes and 73,631 frames across 4 tasks, recorded at 30 FPS. Features comprise actions (12 joint positions), observation states (12 joint positions), images from three camera views (left front, right front, right top), and metadata such as timestamps, frame indices, and episode indices. The license is Apache 2.0.

Provider:

tomato-store

Created:

2026-05-14

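Original information summary

Dataset overview: harvest_v1_yolo

This dataset is the YOLO-annotated variant of tomato-store/harvest_v1, intended for robot learning tasks.

Basic information

License: Apache-2.0
Task category: Robotics
Created with: LeRobot
Dataset size: data files about 100 MB, video files about 200 MB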

Dataset scale

Metric	Value
Total episodes	171
Total frames	73,631
Total tasks	4
Frame rate	30 FPS
Chunk size	1,000 frames per chunk
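
As a quick sanity check on the figures above, the total recording time and the average episode length follow directly from the frame count, episode count, and frame rate. The sketch below only does that arithmetic; the variable names are illustrative and not part of the dataset.

```python
# Figures quoted in the scale table.
TOTAL_EPISODES = 171
TOTAL_FRAMES = 73_631
FPS = 30

# Total recorded time across all episodes, in seconds.
total_seconds = TOTAL_FRAMES / FPS

# Average episode length, in frames and in seconds.
mean_frames_per_episode = TOTAL_FRAMES / TOTAL_EPISODES
mean_seconds_per_episode = mean_frames_per_episode / FPS

print(f"total duration: {total_seconds / 60:.1f} min")
print(
    f"mean episode length: {mean_frames_per_episode:.1f} frames "
    f"({mean_seconds_per_episode:.1f} s)"
)
```

This works out to roughly 41 minutes of footage per camera view, with episodes averaging a little over 14 seconds.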

Dataset structure

Data path: data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet
Video path: videos/{video_key}/chunk-{chunk_index:03d}/file-{file_index:03d}.mp4
Train/test split: all 171 episodes are used for training
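
The two path templates above use Python format-string syntax, so they can be expanded with `str.format`. The following is a minimal sketch of that expansion. The helper names are hypothetical, and the `video_key` value `"left_front"` is a placeholder: this page lists the view names but not the exact key strings used on disk.

```python
# Path templates copied from the dataset structure listing.
DATA_PATH = "data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet"
VIDEO_PATH = "videos/{video_key}/chunk-{chunk_index:03d}/file-{file_index:03d}.mp4"


def data_file(chunk_index: int, file_index: int) -> str:
    """Expand the parquet template; indices are zero-padded to three digits."""
    return DATA_PATH.format(chunk_index=chunk_index, file_index=file_index)


def video_file(video_key: str, chunk_index: int, file_index: int) -> str:
    """Expand the video template for one camera key."""
    return VIDEO_PATH.format(
        video_key=video_key, chunk_index=chunk_index, file_index=file_index
    )


print(data_file(0, 0))
# "left_front" is a placeholder key, not confirmed by the listing.
print(video_file("left_front", 0, 3))
```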

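Feature description

Action and state features

Robot type: bi_so_follower (bimanual follower robot)
Feature dimension: 12 (6 joints per arm, left and right)
Joints: shoulder rotation, shoulder lift, elbow flex, wrist flex, wrist roll, gripper open/close
Data type: float32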
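
To make the 12-dimensional layout concrete, the sketch below attaches a name to each component of an action or state vector. It rests on two assumptions that this page does not confirm: that the left arm's six values come before the right arm's, and that joints appear in the order listed above. The identifier spellings are illustrative, not the dataset's actual feature names.

```python
# Joint order as listed above; the identifier spellings are illustrative.
JOINTS = [
    "shoulder_rotation",
    "shoulder_lift",
    "elbow_flex",
    "wrist_flex",
    "wrist_roll",
    "gripper",
]

# ASSUMPTION: left-arm values precede right-arm values in the vector.
ARMS = ["left", "right"]


def label_vector(values):
    """Map a 12-element action/state vector to {name: value} pairs."""
    expected = len(ARMS) * len(JOINTS)
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    names = [f"{arm}_{joint}" for arm in ARMS for joint in JOINTS]
    return dict(zip(names, values))


labeled = label_vector([float(i) for i in range(12)])
for name, value in labeled.items():
    print(f"{name}: {value}")
```

Before relying on any such mapping, check the ordering against the feature names stored in the dataset's own metadata.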

Visual features

The dataset contains three camera views, all overlaid with red bounding boxes (from tomato-store/tomato-yolo11m-2, confidence ≥ 0.5, without labels or scores):

View	Resolution	Format
left_front (left front)	480×640 px	AV1, yuv420p, 30 FPS
right_front (right front)	480×640 px	AV1, yuv420p, 30 FPS
right_top (right top)	360×640 px	AV1, yuv420p, 30 FPS
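
The overlay rule described above (keep detections with confidence ≥ 0.5, draw a red box, no label or score) can be illustrated with a small pure-Python sketch. This is not the pipeline used to build the dataset: the detections are made up, the frame is a toy grid of RGB tuples, and the line thickness and exact red value are assumptions.

```python
RED = (255, 0, 0)        # ASSUMPTION: exact overlay colour is not given
BLACK = (0, 0, 0)
CONF_THRESHOLD = 0.5     # threshold quoted in the listing


def make_frame(height, width):
    """Create a black frame as a height x width grid of RGB tuples."""
    return [[BLACK for _ in range(width)] for _ in range(height)]


def draw_box(frame, x1, y1, x2, y2, color=RED):
    """Draw a 1-pixel outline; corners are inclusive and clamped to the frame."""
    height, width = len(frame), len(frame[0])
    x1, x2 = max(0, x1), min(width - 1, x2)
    y1, y2 = max(0, y1), min(height - 1, y2)
    if x1 > x2 or y1 > y2:
        return  # box lies entirely outside the frame
    for x in range(x1, x2 + 1):
        frame[y1][x] = color
        frame[y2][x] = color
    for y in range(y1, y2 + 1):
        frame[y][x1] = color
        frame[y][x2] = color


def overlay(frame, detections, threshold=CONF_THRESHOLD):
    """Draw boxes for detections at or above the threshold; return how many."""
    kept = 0
    for x1, y1, x2, y2, conf in detections:
        if conf >= threshold:
            draw_box(frame, x1, y1, x2, y2)
            kept += 1
    return kept


# Made-up detections: (x1, y1, x2, y2, confidence).
frame = make_frame(48, 64)
detections = [(5, 5, 20, 15, 0.9), (30, 10, 50, 40, 0.3)]
kept = overlay(frame, detections)
print(f"drew {kept} of {len(detections)} boxes")
```

Only the box outline is written into the frame, which matches the description of boxes drawn without any label or score text.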

Other metadata features

timestamp: timestamp (float32)
frame_index: frame index (int64)
episode_index: episode index (int64)
index: global index (int64)
task_index: task index (int64)

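Intended use

This dataset is designed for policies that take the YOLO overlay as part of their visual input, keeping the images consistent across recording, training, and inference. All metadata, parquet data, episode timings, and the feature layout match the original harvest_v1 dataset; only the video pixel content differs because of the overlaid YOLO bounding boxes.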
