
MuAViC

Source: arXiv. Updated 2023-03-08; indexed 2024-06-21.
Download link:
https://github.com/facebookresearch/muavic
Description:
MuAViC is a multilingual audio-visual corpus created by Meta AI for robust speech recognition and robust speech-to-text translation. It contains 1,200 hours of audio-visual speech in 9 languages: English, Arabic, German, Greek, Spanish, French, Italian, Portuguese, and Russian. The data comes from TED and TEDx talks and covers more than 8,000 speakers. In addition to text translations, MuAViC establishes benchmarks for 12 translation directions: 6 from English into other languages and 6 from other languages into English. It is the first publicly available corpus for audio-visual speech-to-text translation and currently the largest public benchmark for multilingual audio-visual speech recognition. The corpus was built to improve the noise robustness of speech processing systems through audio-visual information and to support work on multilingual pre-training.
Provider:
Meta AI
Created:
2023-03-02