
1K+ Hours of Selfie Video Data | AI Training Data | Annotated Video for AI | Bounding Boxes, ...

Source: Databricks (indexed 2026-01-27)
Download link:
https://marketplace.databricks.com/details/3784465c-399c-4fe8-a4ff-dedbf2360b19/Data-Seeds_1K+-Hours-of-Selfie-Video-Data-AI-Training-Data-Annotated-Video-for-AI-Bounding-Boxes,-
Resource description:
This dataset contains over 1,000 hours of facial expression selfie video recordings captured worldwide. Designed for AI and machine-learning applications, it offers richly annotated, context-dense video data ideal for training vision-language models, action-recognition systems, and multimodal reasoning.

Key Features

1. Comprehensive Video Annotation Layers

Each video includes synchronized metadata across visual and audio channels, such as:
- Object annotations (bounding boxes, segmentation masks)
- Action labels and activity timelines
- Temporal event boundaries
- Transcripts for scenes containing speech
- Visual scene descriptions covering environment, objects, actions, and context
- Camera metadata (motion type, angle, field of view, lighting conditions)

This enables training for activity detection, video captioning, tracking, VLM grounding, and multimodal understanding.

2. Unique Sourcing Capabilities

Videos are collected through controlled contribution pipelines designed to generate authentic, unscripted real-world footage. This provides:
- Natural human movement and behavior
- Diverse environments and camera devices
- Continuous flow of fresh recordings
- Ability to generate custom datasets (e.g., specific actions, locations, lighting, demographics, or motion patterns)

3. Global Visual & Cultural Diversity

Contributors from 100+ countries supply:
- Indoor and outdoor recordings
- Urban, rural, and specialized environments
- Varied cultural behaviors, activities, and settings
- Multiple languages and speaking styles where speech is present

This ensures robust generalization for global deployment.

4. High-Quality, Realistic Video Capture

Data includes a wide range of visual conditions:
- 4K, HD, and consumer-grade recordings
- Static, handheld, and moving cameras
- Low-light, daylight, and variable lighting
- Clean vs. noisy audio channels
- Natural occlusions, motion blur, and complex backgrounds

This diversity supports training models for real-world reliability and robustness.

5. AI-Ready Dataset Architecture

Optimized for modern ML workflows, enabling:
- Video classification and action recognition
- Video captioning and summarization
- Vision-language model (VLM) alignment
- Multimodal reasoning and grounding
- Safety, moderation, and risk detection
- Tracking, segmentation, and object detection

Compatible with leading ML frameworks and training pipelines.

6. Licensing & Compliance

- Fully compliant with global privacy standards
- Explicit contributor consent for video usage
- Documented rights and usage permissions
- Vetted for commercial and research use

Use Cases

- Training video classification and action-recognition models
- Vision-language model pretraining
- Multimodal AI for enterprise and consumer applications
- Safety, moderation, and anomaly detection
- Video captioning, retrieval, and summarization
- Research in activity analysis, human behavior, and multimodal grounding
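
The annotation layers listed under Key Features (bounding boxes, action labels with temporal boundaries, transcripts, scene descriptions, camera metadata) could be modeled roughly as in the sketch below. The listing does not publish a schema, so every class name, field name, and sample value here is hypothetical and chosen only for illustration; the actual delivery format may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BoundingBox:
    """One object annotation on a single frame (hypothetical layout, pixel units)."""
    frame_index: int
    label: str
    x: float
    y: float
    width: float
    height: float


@dataclass
class ActionSegment:
    """An action label with its temporal boundaries, in seconds (hypothetical)."""
    label: str
    start_s: float
    end_s: float


@dataclass
class CameraMetadata:
    """A subset of the camera metadata the listing mentions (hypothetical fields)."""
    motion_type: str  # e.g. "static", "handheld", "moving"
    lighting: str     # e.g. "daylight", "low-light"


@dataclass
class VideoAnnotation:
    """All annotation layers for one video, grouped in a single record (hypothetical)."""
    video_id: str
    camera: CameraMetadata
    boxes: List[BoundingBox] = field(default_factory=list)
    actions: List[ActionSegment] = field(default_factory=list)
    transcript: Optional[str] = None  # present only when the scene contains speech
    scene_description: str = ""


def actions_at(annotation: VideoAnnotation, t_s: float) -> List[str]:
    """Return the labels of all action segments active at time t_s.

    A segment is treated as active when start_s <= t_s < end_s.
    """
    return [a.label for a in annotation.actions if a.start_s <= t_s < a.end_s]


# Made-up sample record, not taken from the dataset.
example = VideoAnnotation(
    video_id="sample-0001",
    camera=CameraMetadata(motion_type="handheld", lighting="daylight"),
    boxes=[BoundingBox(frame_index=0, label="face", x=120.0, y=80.0, width=200.0, height=240.0)],
    actions=[
        ActionSegment(label="smile", start_s=0.5, end_s=2.0),
        ActionSegment(label="nod", start_s=1.5, end_s=3.0),
    ],
    scene_description="Person indoors facing a handheld phone camera.",
)
```

With a record like this, `actions_at(example, 1.75)` returns both overlapping labels, which is the kind of lookup an activity-timeline layer is meant to support. Keeping the layers in one record per video is just one design option; per-layer files keyed by `video_id` would work equally well.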
Provider:
Data Seeds