THUIAR/MMLA-Datasets
Source: Hugging Face | Updated: 2025-08-13 | Indexed: 2025-11-01
Download link:
https://hf-mirror.com/datasets/THUIAR/MMLA-Datasets
Resource description:
MMLA is a comprehensive benchmark for evaluating foundation models on multimodal language analysis. It contains more than 61K multimodal samples drawn from films, TV series, and online platforms such as YouTube and Bilibili, covering three modalities: text, video, and audio. The benchmark targets six core dimensions of multimodal language analysis: intent, emotion, sentiment, dialogue act, speaking style, and communication behavior. The README provides statistics for the included datasets, their collection timeline, and licensing information. It also presents a leaderboard of model performance and lists SHA-256 checksums for verifying data integrity.
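Since the description mentions SHA-256 checksums for integrity checks, the following is a minimal sketch of how a downloaded file might be verified against a published checksum. The helper names `sha256_of_file` and `verify_checksum` are hypothetical, and the file path and expected digest are placeholders; the actual file names and checksum values should be taken from the dataset README.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A call such as `verify_checksum("some_archive.zip", "<checksum from README>")` would return `True` only if the local file matches the published digest; both arguments here are illustrative placeholders, not actual values from the dataset.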
Provider:
THUIAR



