3MASSIV

Source: arXiv. Updated 2022-03-28. Indexed 2024-06-21.

Download link:

https://sharechat.com/research/3massiv

Overview:

The 3MASSIV dataset was created jointly by ShareChat (India) and the University of Maryland (United States). It contains 50,000 expert-annotated short videos and 100,000 unlabeled videos, all drawn from Moj, a popular social short-video platform. The videos span 11 languages and cover varied content such as pranks, fails, romance, and comedy, in formats that include self-shot videos, reaction videos, lip-syncs, and self-sung songs. Labeled videos are annotated with their concept, affective state, media type, and audio language, which makes the dataset a resource for multimodal and multilingual semantic understanding. Intended applications include multilingual modeling, creator modeling, and temporal analysis, with the goal of advancing semantic understanding and cross-lingual analysis of social short-video content.
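As a rough illustration of the four annotation types described above, the following Python sketch models one video as a record and counts labeled videos per audio language. It is not the dataset's actual file format or loader: the class name `VideoRecord`, every field name, the helper `count_by_language`, and all sample values (such as `state-x` and `lang-a`) are hypothetical placeholders chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoRecord:
    """One short video. All field names are hypothetical, not the official schema."""
    video_id: str
    concept: Optional[str] = None          # e.g. "prank", "comedy"
    affective_state: Optional[str] = None  # placeholder values below
    media_type: Optional[str] = None       # e.g. "reaction video", "lip-sync"
    audio_language: Optional[str] = None   # placeholder values below

    @property
    def is_labeled(self) -> bool:
        # Treat a record as annotated only if all four annotation types are present.
        fields = (self.concept, self.affective_state,
                  self.media_type, self.audio_language)
        return None not in fields


def count_by_language(records):
    """Count annotated records per audio language, skipping unlabeled ones."""
    return Counter(r.audio_language for r in records if r.is_labeled)


# Toy records with made-up values; they are not taken from the dataset.
records = [
    VideoRecord("v1", "prank", "state-x", "reaction video", "lang-a"),
    VideoRecord("v2", "comedy", "state-y", "lip-sync", "lang-a"),
    VideoRecord("v3", "comedy", "state-x", "lip-sync", "lang-b"),
    VideoRecord("v4"),  # stands in for an unlabeled video
]

counts = count_by_language(records)
print(dict(counts))
```

Keeping the annotations optional lets the same record type represent both the annotated and the unlabeled portions of the collection, which is one possible design choice rather than a requirement.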

Provided by:

ShareChat (India) and the University of Maryland (United States)

Created:

2022-03-28
