
Egocentric-10K

ModelScope community · Updated 2026-05-24 · Indexed 2025-11-22
Download link:
https://modelscope.cn/datasets/builddotai/Egocentric-10K
Resource description:
![image](https://cdn-uploads.huggingface.co/production/uploads/690d75303df78b892c337cd4/sFme7NIwDDD4lrNrg37yI.png)

Egocentric-10K is the largest egocentric dataset. It is the first dataset collected exclusively in real factories.

<video width="100%" autoplay loop muted playsinline style="border-radius: 8px;">
  <source src="https://cdn-uploads.huggingface.co/production/uploads/690d75303df78b892c337cd4/lvA-v9UG-Xs77rd4JImJl.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>

Egocentric-10K is state-of-the-art in hand visibility and active manipulation density compared to previous in-the-wild egocentric datasets. The complete 30,000 frame evaluation set is available at [Egocentric-10K-Evaluation](https://huggingface.co/datasets/builddotai/Egocentric-10K-Evaluation).

![image](https://cdn-uploads.huggingface.co/production/uploads/690d75303df78b892c337cd4/6T7TGpHO9BGK4qiEBx5f3.png)

## Dataset Statistics

| Attribute | Value |
|-----------|-------|
| **Total Hours** | 10,000 |
| **Total Frames** | 1.08 billion |
| **Video Clips** | 192,900 |
| **Median Clip Length** | 180.0 seconds |
| **Mean Hours per Worker** | 4.68 |
| **Storage Size** | 16.4 TB |
| **Format** | H.265/MP4 |
| **Resolution** | 1080p (1920x1080) |
| **Frame Rate** | 30 fps |
| **Field of View** | 128° horizontal, 67° vertical |
| **Camera Type** | Monocular head-mounted |
| **Audio** | No |
| **Device** | Build AI Gen 1 |

## Camera Intrinsics

Each worker folder contains an `intrinsics.json` file with calibrated camera parameters. The intrinsics use the **OpenCV fisheye model** (Kannala-Brandt equidistant projection) with 4 distortion coefficients (k1-k4). All values are calibrated for the 1920x1080 resolution.
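To make the projection model concrete, here is a minimal sketch of how a 3D point in the camera frame maps to a pixel under the equidistant fisheye model with four coefficients. The function name `project_fisheye` is hypothetical, zero skew is assumed (the file carries no skew term), and the constants are copied from the example `intrinsics.json` in this card rather than from any particular worker.

```python
import math

# Values copied from the example intrinsics.json in this card; a real
# pipeline would read them from the worker's own file instead.
INTRINSICS = {
    "fx": 1030.59, "fy": 1032.82, "cx": 966.69, "cy": 539.69,
    "k1": -0.1166, "k2": -0.0236, "k3": 0.0694, "k4": -0.0463,
}


def project_fisheye(point, intr=INTRINSICS):
    """Project a 3D point (camera frame, z forward) to pixel coordinates.

    Hypothetical helper: equidistant fisheye projection, zero skew assumed.
    """
    x, y, z = point
    r = math.hypot(x, y)  # distance from the optical axis
    if r == 0.0:
        # A point on the optical axis lands on the principal point.
        return intr["cx"], intr["cy"]
    theta = math.atan2(r, z)  # angle between the ray and the optical axis
    t2 = theta * theta
    # Distorted angle: theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8)
    theta_d = theta * (
        1.0
        + intr["k1"] * t2
        + intr["k2"] * t2**2
        + intr["k3"] * t2**3
        + intr["k4"] * t2**4
    )
    scale = theta_d / r
    u = intr["fx"] * scale * x + intr["cx"]
    v = intr["fy"] * scale * y + intr["cy"]
    return u, v
```

Because the distortion is applied to the ray angle rather than to normalized image coordinates, standard pinhole undistortion routines are not a drop-in match for these parameters; a fisheye-aware routine is the safer choice.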
Example `intrinsics.json`:

```json
{
  "model": "fisheye",
  "image_width": 1920,
  "image_height": 1080,
  "fx": 1030.59,
  "fy": 1032.82,
  "cx": 966.69,
  "cy": 539.69,
  "k1": -0.1166,
  "k2": -0.0236,
  "k3": 0.0694,
  "k4": -0.0463
}
```

## Dataset Structure

Egocentric-10K is structured in **[WebDataset format](https://huggingface.co/docs/hub/en/datasets-webdataset)**.

```
builddotai/Egocentric-10K/
├── factory_001/
│   └── workers/
│       ├── worker_001/
│       │   ├── intrinsics.json                      # Camera intrinsics for this worker
│       │   ├── factory001_worker001_part00.tar      # Shard 0 (≤1GB)
│       │   └── factory001_worker001_part01.tar      # Shard 1 (if needed)
│       ├── worker_002/
│       │   ├── intrinsics.json
│       │   └── factory001_worker002_part00.tar
│       └── worker_011/
│           ├── intrinsics.json
│           └── factory001_worker011_part00.tar
│
├── factory_002/
│   └── workers/
│       ├── worker_001/
│       │   ├── intrinsics.json
│       │   └── factory002_worker001_part00.tar
│       └── ...
│
├── factory_003/
│   └── workers/
│       └── ...
│
└── ... (factories 001-085)
```

Each TAR file contains pairs of video and metadata files:

```
factory001_worker001_part00.tar
├── factory001_worker001_00001.mp4     # Video 1
├── factory001_worker001_00001.json    # Metadata for video 1
├── factory001_worker001_00002.mp4     # Video 2
├── factory001_worker001_00002.json    # Metadata for video 2
├── factory001_worker001_00003.mp4     # Video 3
├── factory001_worker001_00003.json    # Metadata for video 3
└── ...                                # Additional video/metadata pairs
```

Each JSON metadata file has the following fields:

```json
{
  "factory_id": "factory_002",     // Unique identifier for the factory location
  "worker_id": "worker_002",       // Unique identifier for the worker within factory
  "video_index": 0,                // Sequential index for videos from this worker
  "duration_sec": 1200.0,          // Video duration in seconds
  "width": 1920,                   // Video width in pixels
  "height": 1080,                  // Video height in pixels
  "fps": 30.0,                     // Frames per second
  "size_bytes": 599697350,         // File size in bytes
  "codec": "h265"                  // Video codec
}
```

### Loading the Dataset

```python
from datasets import load_dataset, Features, Value

# Define features
features = Features({
    'mp4': Value('binary'),
    'json': {
        'factory_id': Value('string'),
        'worker_id': Value('string'),
        'video_index': Value('int64'),
        'duration_sec': Value('float64'),
        'width': Value('int64'),
        'height': Value('int64'),
        'fps': Value('float64'),
        'size_bytes': Value('int64'),
        'codec': Value('string')
    },
    '__key__': Value('string'),
    '__url__': Value('string')
})

# Load entire dataset
dataset = load_dataset(
    "builddotai/Egocentric-10K",
    streaming=True,
    features=features
)

# Load specific factories
dataset = load_dataset(
    "builddotai/Egocentric-10K",
    data_files=["factory_001/**/*.tar", "factory_002/**/*.tar"],
    streaming=True,
    features=features
)

# Load specific workers
dataset = load_dataset(
    "builddotai/Egocentric-10K",
    data_files=[
        "factory_001/workers/worker_001/*.tar",
        "factory_001/workers/worker_002/*.tar"
    ],
    streaming=True,
    features=features
)
```

### Loading Intrinsics

```python
from huggingface_hub import hf_hub_download
import json

# Download intrinsics for a specific worker
intrinsics_path = hf_hub_download(
    repo_id="builddotai/Egocentric-10K",
    filename="factory_001/workers/worker_001/intrinsics.json",
    repo_type="dataset"
)

with open(intrinsics_path) as f:
    intrinsics = json.load(f)
```

## License

Licensed under the Apache 2.0 License.
## Citation

```
@dataset{buildaiegocentric10k2025,
  author = {Build AI},
  title = {Egocentric-10k},
  year = {2025},
  publisher = {Hugging Face Datasets},
  url = {https://huggingface.co/datasets/builddotai/Egocentric-10K}
}
```
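The shard layout described under Dataset Structure can also be read without the `datasets` library. The following is a minimal sketch using only Python's standard `tarfile` module. The function name `iter_video_pairs` is hypothetical, and the sketch assumes that the `.json` members inside a shard are plain JSON, with the `//` comments in the metadata listing being annotations for the reader rather than part of the files.

```python
import io
import json
import tarfile
from pathlib import PurePosixPath


def iter_video_pairs(fileobj):
    """Yield (key, mp4_bytes, metadata) for each video/metadata pair in a shard.

    Hypothetical helper. Assumes every sample is stored as <key>.mp4 plus
    <key>.json, and that the .json members contain plain JSON.
    """
    pending = {}  # key -> partially collected sample, filled as members arrive
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            path = PurePosixPath(member.name)
            ext = path.suffix.lstrip(".")
            if ext not in ("mp4", "json"):
                continue  # ignore anything that is not part of a pair
            key = path.stem
            data = tar.extractfile(member).read()
            sample = pending.setdefault(key, {})
            sample[ext] = json.loads(data) if ext == "json" else data
            if "mp4" in sample and "json" in sample:
                # Both halves seen: emit the pair and free the memory.
                del pending[key]
                yield key, sample["mp4"], sample["json"]
```

A downloaded shard can be passed in as a binary file object, for example `iter_video_pairs(open(path, "rb"))`. Each video is read fully into memory, so for large clips it may be preferable to write the bytes to disk before decoding.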

The Chinese-language version of this dataset card repeats the content above; its statistics table additionally lists **Number of Workers**: 2,138.
Provider: maas
Created: 2025-11-11
Collected summary
Background overview
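Egocentric-10K is currently the largest egocentric dataset and was collected exclusively in real factory environments. It contains 10,000 hours of video, or 1.08 billion frames. The dataset scores highly on hand visibility and active manipulation density, which makes it suitable for research in computer vision and robotics.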
The content above was collected and summarized by 遇见数据集.
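The Total Frames figure in the statistics table follows directly from the total duration and the frame rate. A quick check, with `total_frames` as a hypothetical helper name:

```python
def total_frames(hours: int, fps: int) -> int:
    """Frames in `hours` of video recorded at a constant `fps`."""
    return hours * 3600 * fps


# 10,000 hours at 30 fps, as listed in the statistics table.
print(f"{total_frames(10_000, 30):,}")  # 1,080,000,000
```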