ESpeech/ESpeech-igm
Source: Hugging Face. Updated: 2025-08-25. Indexed: 2025-09-13.
Download link:
https://hf-mirror.com/datasets/ESpeech/ESpeech-igm
Description:
The IGM YouTube Audio Dataset contains 220 hours of processed audio segments extracted from the IGM YouTube channel, together with their metadata. Each audio file is a segment of one of IGM's educational videos or lectures, stored as MP3 at a 44.1 kHz sample rate. The dataset includes Russian text and is suitable for text-to-speech (TTS), automatic speech recognition (ASR), and speech quality assessment tasks. Each record includes the audio data, file name, segment index, original video name, the transcribed text of the segment with timestamps, speaker information, quality metrics, and segment structure information.
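The record layout described above can be inspected without downloading all 220 hours of audio. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the repository exposes a split named "train" (the split name is an assumption; the listing does not state it). The helper names `stream_records` and `summarize_fields` are hypothetical and chosen only for illustration. No field names are hard-coded, since the listing does not give the exact column names.

```python
from typing import Any, Dict, Iterator

DATASET_ID = "ESpeech/ESpeech-igm"  # repository name taken from the listing


def stream_records(split: str = "train") -> Iterator[Dict[str, Any]]:
    """Yield records lazily instead of downloading the whole dataset.

    Assumption: the split is called "train"; check the dataset card.
    """
    # Imported here so summarize_fields() works without the library installed.
    from datasets import load_dataset

    yield from load_dataset(DATASET_ID, split=split, streaming=True)


def summarize_fields(record: Dict[str, Any]) -> Dict[str, str]:
    """Map each field name in a record to the name of its Python type."""
    return {name: type(value).__name__ for name, value in record.items()}


# Example usage (requires network access and an audio decoding backend):
#   first = next(stream_records())
#   print(summarize_fields(first))
```

Printing the field summary of one record is a quick way to confirm the actual column names before writing any TTS or ASR preprocessing code against them.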
Provider:
ESpeech



