M3AV

Source: arXiv. Updated: 2024-03-21. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2403.14168v1
Description:
The M3AV dataset was jointly created by Shanghai Jiao Tong University, Tsinghua University, the University of Cambridge Department of Engineering, and Shanghai AI Laboratory. It covers multiple academic fields, including computer science, biomedical science, and mathematics. The dataset contains 1,113 videos with a total duration of approximately 367 hours. Each video comes with high-quality speech transcripts and OCR annotations, with particular emphasis on annotating high-value entities. M3AV supports both multimodal content recognition and in-depth understanding of academic knowledge. It is suited to tasks such as automatic speech recognition, speech synthesis, and slide and script generation, and it aims to help researchers accelerate academic research and knowledge dissemination.
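Provided by:
Shanghai Jiao Tong University, Tsinghua University, University of Cambridge Department of Engineering, Shanghai AI Laboratory
Created:
2024-03-21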