ESpeech/ESpeech-upvote

Name: ESpeech/ESpeech-upvote
Creator: ESpeech
Published: 2025-08-25 12:32:59
License: No description available

Hugging Face | Updated 2025-08-25 | Indexed 2025-09-13

Download link:

https://hf-mirror.com/datasets/ESpeech/ESpeech-upvote

Description:

The Upvote YouTube Audio Dataset contains 296 hours of processed audio segments extracted from the Upvote YouTube channel, together with their metadata. Each audio file is one segment of a video from the channel. The audio is in Russian and stored as MP3 at a 44.1 kHz sample rate. The dataset is intended for text-to-speech (TTS), automatic speech recognition (ASR), and speech quality assessment. Each record includes the audio data, file name, segment index, original video name, transcript of the segment, start and end times, word-level timestamps and confidence scores, speaker information, quality metrics, segment structure, and voice activity detection (VAD) information. All available video segments are placed in the training split.
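As a rough illustration of how the word-level confidence scores and start/end times described above might be used, the sketch below filters segments by mean word confidence and sums their duration. The column names (`start`, `end`, `words`, `confidence`) are assumptions made for this example; the listing describes the fields but does not give their actual names, so check the real schema before relying on them. The repository id is taken from the download link, and the `train` split follows the description; the streaming loader is only a suggestion and needs the Hugging Face `datasets` package and network access.

```python
from typing import Iterable, Iterator

# Hypothetical column names: the listing does not state the real schema,
# so adjust these constants to match the actual dataset columns.
START_KEY = "start"
END_KEY = "end"
WORDS_KEY = "words"          # assumed: a list of dicts, one per word
CONFIDENCE_KEY = "confidence"


def stream_segments():
    """Lazily stream the train split from the Hub (requires network access)."""
    # Imported here so the offline helpers below work without the package.
    from datasets import load_dataset
    return load_dataset("ESpeech/ESpeech-upvote", split="train", streaming=True)


def segment_duration(record: dict) -> float:
    """Length of one segment in seconds, from its start and end times."""
    return float(record[END_KEY]) - float(record[START_KEY])


def mean_word_confidence(record: dict) -> float:
    """Average word-level confidence; 0.0 when a segment has no words."""
    words = record.get(WORDS_KEY) or []
    if not words:
        return 0.0
    return sum(float(w[CONFIDENCE_KEY]) for w in words) / len(words)


def filter_confident(records: Iterable[dict], min_confidence: float = 0.9) -> Iterator[dict]:
    """Yield only segments whose mean word confidence meets the threshold."""
    for record in records:
        if mean_word_confidence(record) >= min_confidence:
            yield record


def total_hours(records: Iterable[dict]) -> float:
    """Total duration of the given segments in hours."""
    return sum(segment_duration(r) for r in records) / 3600.0
```

The helpers take any iterable of dict-like records, so they can be tried on a handful of hand-written records first and then applied to `stream_segments()` once the real column names are confirmed. The 0.9 threshold is an arbitrary placeholder, not a value recommended by the dataset authors.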

Provided by:

ESpeech
