WueNLP/belebele-fleurs
Source: Hugging Face | Updated: 2024-12-12 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/WueNLP/belebele-fleurs
Description:
This multilingual dataset supports several tasks: audio classification, automatic speech recognition, audio-text-to-text, text-to-speech, question answering, and document question answering. It is published in multiple configurations, each with a documented set of features: link, question number, FLORES passage, question, multiple-choice answers, correct answer number, dialect, timestamp, and sentence data. Each sentence data entry carries the following attributes: URL, audio, domain, filename, fleurs_id, full_paragraph, gender, has_hyperlink, has_image, id, num_samples, raw_transcription, seamlessm4t_asr, seamlessm4t_asr_cer, seamlessm4t_asr_translation, seamlessm4t_asr_wer, sentence, sentence_idx, speaker_id, split, topic, transcription, whisper_asr, whisper_asr_cer, whisper_asr_translation, and whisper_asr_wer.
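The attribute names ending in _wer and _cer suggest word and character error rates for the SeamlessM4T and Whisper transcripts. The listing does not say how these values were computed, so the following is only a minimal sketch of the standard definitions (edit distance divided by reference length), assuming whitespace tokenization and no text normalization. The function names and the sample record are hypothetical, and only the keys `transcription` and `whisper_asr` are taken from the attribute list above.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete an item from the reference
                curr[j - 1] + 1,     # insert an item from the hypothesis
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate, assuming whitespace tokenization (an assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate over the raw strings, spaces included."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical sentence record; the text values are invented for illustration.
sentence = {
    "transcription": "the cat sat on the mat",
    "whisper_asr": "the cat sat on a mat",
}

sample_wer = wer(sentence["transcription"], sentence["whisper_asr"])
sample_cer = cer(sentence["transcription"], sentence["whisper_asr"])
print(f"WER={sample_wer:.3f} CER={sample_cer:.3f}")
```

Values computed this way may differ from the stored fields if the dataset authors applied casing, punctuation, or script-specific normalization before scoring.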
Provider:
WueNLP



