RobotsMali/jeli-data-manifest

Source: Hugging Face. Updated 2024-12-07; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/RobotsMali/jeli-data-manifest
Description:
The Jeli Audio Dataset is a multilingual audio dataset containing audio samples in Bambara and French. Each audio file is paired with either its Bambara transcription or its French translation, both provided in manifest files. The dataset is designed for tasks such as automatic speech recognition (ASR) and translation. The data was recorded with griots in an organized setup in Mali, then semi-professionally transcribed and translated into French. The dataset comprises 11,533 audio files totaling approximately 30 hours, divided into training and test sets. Its structure includes audio files, transcription manifests, French transcription manifests, and scripts for processing the data and creating the manifests.
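
As a rough illustration of how such manifests might be used, the sketch below totals the duration of the clips listed in a manifest. The manifest format is not specified here, so the code assumes a JSON-lines layout (one JSON object per line) with the fields `audio_filepath`, `duration` (in seconds), and `text`, a convention common in ASR tooling; the field names, the file names, and the helper `summarize_manifest` are all hypothetical. For scale, 11,533 files over about 30 hours works out to roughly 9 seconds per clip on average.

```python
import json


def summarize_manifest(lines):
    """Count entries and total duration in a JSON-lines manifest.

    Assumption: each non-empty line is a JSON object with a numeric
    "duration" field in seconds. This layout is hypothetical and is
    not confirmed by the dataset description.
    """
    count = 0
    total_seconds = 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        count += 1
        total_seconds += float(entry["duration"])
    mean_seconds = total_seconds / count if count else 0.0
    return {
        "files": count,
        "hours": total_seconds / 3600.0,
        "mean_seconds": mean_seconds,
    }


# Hypothetical sample entries; real manifests would be read from a file,
# e.g. with open(path, encoding="utf-8") as f: summarize_manifest(f)
sample = [
    '{"audio_filepath": "clip-0001.wav", "duration": 9.5, "text": "<transcription>"}',
    '{"audio_filepath": "clip-0002.wav", "duration": 8.5, "text": "<transcription>"}',
]

summary = summarize_manifest(sample)
print(summary["files"], summary["mean_seconds"])
```

Running the same function over the real training and test manifests would be one way to check the stated file count and total duration, provided the actual field names are substituted.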
Provided by:
RobotsMali