Egocentric-10K

Name: Egocentric-10K
Creator: maas
Published: 2026-05-24 01:40:18
License: Not specified in the listing metadata (the dataset card states Apache 2.0)

Source: ModelScope community. Updated: 2026-05-24. Indexed: 2025-11-22.

Download link:

https://modelscope.cn/datasets/builddotai/Egocentric-10K

Description:

![image](https://cdn-uploads.huggingface.co/production/uploads/690d75303df78b892c337cd4/sFme7NIwDDD4lrNrg37yI.png)

Egocentric-10K is the largest egocentric dataset. It is the first dataset collected exclusively in real factories.

<video width="100%" autoplay loop muted playsinline style="border-radius: 8px;">
  <source src="https://cdn-uploads.huggingface.co/production/uploads/690d75303df78b892c337cd4/lvA-v9UG-Xs77rd4JImJl.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>

Egocentric-10K is state-of-the-art in hand visibility and active manipulation density compared to previous in-the-wild egocentric datasets. The complete 30,000 frame evaluation set is available at [Egocentric-10K-Evaluation](https://huggingface.co/datasets/builddotai/Egocentric-10K-Evaluation).

![image](https://cdn-uploads.huggingface.co/production/uploads/690d75303df78b892c337cd4/6T7TGpHO9BGK4qiEBx5f3.png)

## Dataset Statistics

| Attribute | Value |
|-----------|-------|
| **Total Hours** | 10,000 |
| **Total Frames** | 1.08 billion |
| **Video Clips** | 192,900 |
| **Median Clip Length** | 180.0 seconds |
| **Mean Hours per Worker** | 4.68 |
| **Storage Size** | 16.4 TB |
| **Format** | H.265/MP4 |
| **Resolution** | 1080p (1920x1080) |
| **Frame Rate** | 30 fps |
| **Field of View** | 128° horizontal, 67° vertical |
| **Camera Type** | Monocular head-mounted |
| **Audio** | No |
| **Device** | Build AI Gen 1 |

## Camera Intrinsics

Each worker folder contains an `intrinsics.json` file with calibrated camera parameters. The intrinsics use the **OpenCV fisheye model** (Kannala-Brandt equidistant projection) with 4 distortion coefficients (k1-k4). All values are calibrated for the 1920x1080 resolution.
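To make the projection model concrete, the following is a minimal sketch of the OpenCV fisheye (Kannala-Brandt equidistant) forward projection in plain Python. The function name `project_fisheye` and the `INTRINSICS` dictionary are illustrative and not part of the dataset tooling. The numeric values are copied from the example `intrinsics.json` in this card, and skew is assumed to be zero because the file lists no skew term.

```python
import math

# Values copied from the example intrinsics.json in this card.
INTRINSICS = {
    "fx": 1030.59, "fy": 1032.82, "cx": 966.69, "cy": 539.69,
    "k1": -0.1166, "k2": -0.0236, "k3": 0.0694, "k4": -0.0463,
}


def project_fisheye(point_cam, intr):
    """Project one camera-frame 3D point (Z > 0) to pixel coordinates (u, v).

    Implements the OpenCV fisheye forward model with zero skew (an assumption).
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")

    # Normalized image-plane coordinates and their radius.
    a, b = x / z, y / z
    r = math.hypot(a, b)

    # Angle from the optical axis, then the distorted angle theta_d.
    theta = math.atan(r)
    t2 = theta * theta
    theta_d = theta * (
        1.0
        + intr["k1"] * t2
        + intr["k2"] * t2 ** 2
        + intr["k3"] * t2 ** 3
        + intr["k4"] * t2 ** 4
    )

    # theta_d / r tends to 1 as r tends to 0, so guard the division.
    scale = theta_d / r if r > 1e-12 else 1.0

    u = intr["fx"] * scale * a + intr["cx"]
    v = intr["fy"] * scale * b + intr["cy"]
    return u, v


on_axis = project_fisheye((0.0, 0.0, 1.0), INTRINSICS)
off_axis = project_fisheye((0.5, 0.0, 1.0), INTRINSICS)
```

A point on the optical axis projects to the principal point `(cx, cy)`, and a point to the right of the axis lands at a larger `u`. For real workloads, OpenCV's `cv2.fisheye` functions implement the same model and are likely the better choice.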
Example `intrinsics.json`:

```json
{
  "model": "fisheye",
  "image_width": 1920,
  "image_height": 1080,
  "fx": 1030.59,
  "fy": 1032.82,
  "cx": 966.69,
  "cy": 539.69,
  "k1": -0.1166,
  "k2": -0.0236,
  "k3": 0.0694,
  "k4": -0.0463
}
```

## Dataset Structure

Egocentric-10K is structured in **[WebDataset format](https://huggingface.co/docs/hub/en/datasets-webdataset)**.

```
builddotai/Egocentric-10K/
├── factory_001/
│   └── workers/
│       ├── worker_001/
│       │   ├── intrinsics.json                      # Camera intrinsics for this worker
│       │   ├── factory001_worker001_part00.tar      # Shard 0 (≤1GB)
│       │   └── factory001_worker001_part01.tar      # Shard 1 (if needed)
│       ├── worker_002/
│       │   ├── intrinsics.json
│       │   └── factory001_worker002_part00.tar
│       └── worker_011/
│           ├── intrinsics.json
│           └── factory001_worker011_part00.tar
│
├── factory_002/
│   └── workers/
│       ├── worker_001/
│       │   ├── intrinsics.json
│       │   └── factory002_worker001_part00.tar
│       └── ...
│
├── factory_003/
│   └── workers/
│       └── ...
│
└── ... (factories 001-085)
```

Each TAR file contains pairs of video and metadata files:

```
factory001_worker001_part00.tar
├── factory001_worker001_00001.mp4     # Video 1
├── factory001_worker001_00001.json    # Metadata for video 1
├── factory001_worker001_00002.mp4     # Video 2
├── factory001_worker001_00002.json    # Metadata for video 2
├── factory001_worker001_00003.mp4     # Video 3
├── factory001_worker001_00003.json    # Metadata for video 3
└── ...                                # Additional video/metadata pairs
```

Each JSON metadata file has the following fields:

```json
{
  "factory_id": "factory_002",    // Unique identifier for the factory location
  "worker_id": "worker_002",      // Unique identifier for the worker within factory
  "video_index": 0,               // Sequential index for videos from this worker
  "duration_sec": 1200.0,         // Video duration in seconds
  "width": 1920,                  // Video width in pixels
  "height": 1080,                 // Video height in pixels
  "fps": 30.0,                    // Frames per second
  "size_bytes": 599697350,        // File size in bytes
  "codec": "h265"                 // Video codec
}
```

### Loading the Dataset

```python
from datasets import load_dataset, Features, Value

# Define features
features = Features({
    'mp4': Value('binary'),
    'json': {
        'factory_id': Value('string'),
        'worker_id': Value('string'),
        'video_index': Value('int64'),
        'duration_sec': Value('float64'),
        'width': Value('int64'),
        'height': Value('int64'),
        'fps': Value('float64'),
        'size_bytes': Value('int64'),
        'codec': Value('string')
    },
    '__key__': Value('string'),
    '__url__': Value('string')
})

# Load entire dataset
dataset = load_dataset(
    "builddotai/Egocentric-10K",
    streaming=True,
    features=features
)

# Load specific factories
dataset = load_dataset(
    "builddotai/Egocentric-10K",
    data_files=["factory_001/**/*.tar", "factory_002/**/*.tar"],
    streaming=True,
    features=features
)

# Load specific workers
dataset = load_dataset(
    "builddotai/Egocentric-10K",
    data_files=[
        "factory_001/workers/worker_001/*.tar",
        "factory_001/workers/worker_002/*.tar"
    ],
    streaming=True,
    features=features
)
```

### Loading Intrinsics

```python
from huggingface_hub import hf_hub_download
import json

# Download intrinsics for a specific worker
intrinsics_path = hf_hub_download(
    repo_id="builddotai/Egocentric-10K",
    filename="factory_001/workers/worker_001/intrinsics.json",
    repo_type="dataset"
)

with open(intrinsics_path) as f:
    intrinsics = json.load(f)
```

## License

Licensed under the Apache 2.0 License.
## Citation

```
@dataset{buildaiegocentric10k2025,
  author = {Build AI},
  title = {Egocentric-10k},
  year = {2025},
  publisher = {Hugging Face Datasets},
  url = {https://huggingface.co/datasets/builddotai/Egocentric-10K}
}
```
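Returning to the shard layout under Dataset Structure: once a shard has been downloaded locally, its video and metadata pairs can also be read with the Python standard library alone. The sketch below is an illustration under stated assumptions, not the authors' loader. `iter_clip_pairs` and `make_demo_shard` are hypothetical helpers, the demo shard is built in memory with placeholder bytes instead of real video, its metadata values are made up, and the metadata files are assumed to be plain JSON (the `//` comments in the field listing being annotations only).

```python
import io
import json
import tarfile
from pathlib import PurePosixPath


def iter_clip_pairs(fileobj):
    """Yield (key, metadata, video_bytes) for each .mp4/.json pair in a shard.

    Files are paired by shared base name, e.g. factory001_worker001_00001.
    """
    pending = {}  # key -> {"mp4": bytes, "json": dict}, filled as members arrive
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            path = PurePosixPath(member.name)
            ext = path.suffix.lstrip(".")
            if ext not in ("mp4", "json"):
                continue
            data = tar.extractfile(member).read()
            entry = pending.setdefault(path.stem, {})
            entry[ext] = json.loads(data) if ext == "json" else data
            if len(entry) == 2:  # both halves of the pair have been seen
                yield path.stem, entry["json"], entry["mp4"]
                del pending[path.stem]


def make_demo_shard():
    """Build a tiny in-memory TAR that mimics the documented naming scheme."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for i in (1, 2):
            key = f"factory001_worker001_{i:05d}"
            meta = json.dumps(
                {"factory_id": "factory_001", "worker_id": "worker_001"}
            ).encode()
            # Placeholder bytes stand in for the real H.265 video payload.
            for name, payload in ((f"{key}.mp4", b"placeholder"),
                                  (f"{key}.json", meta)):
                info = tarfile.TarInfo(name)
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


pairs = list(iter_clip_pairs(make_demo_shard()))
for key, metadata, video in pairs:
    print(key, metadata["worker_id"], len(video))
```

With a real shard, passing `open("factory001_worker001_part00.tar", "rb")` in place of the demo buffer should behave the same way, provided the archive follows the layout documented in this card. Pairing by base name mirrors how the WebDataset format groups files into samples.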

The Chinese-language version of this card repeats the content above. It adds one statistic that the English table omits: **Number of Workers**: 2,138.

Provider: maas

Created: 2025-11-11