ZeroAgency/shkolkovo-bobr.video-webinars-audio
Source: Hugging Face. Updated 2025-06-02. Indexed 2025-07-05.
Download link:
https://hf-mirror.com/datasets/ZeroAgency/shkolkovo-bobr.video-webinars-audio
Description:
The shkolkovo-bobr.video-webinars-audio dataset contains audio and text transcriptions of approximately 2573 webinars from bobr.video. The webinars are part of the free online school exam preparation courses provided by Shkolkovo. Most of the material is in Russian, with some webinars in English. Each webinar is stored as an mp3 file named ID.mp3 and a matching text file named ID.txt that holds the transcription with timestamps. Note that the timestamps may be slightly out of sync with the audio because of voice activity detection (VAD). Copyright for the course materials belongs to Shkolkovo.online. The dataset is intended for training language models and speech recognition/synthesis models.
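The ID.mp3 / ID.txt naming described above suggests a simple way to pair each recording with its transcript once the files are downloaded. The following is a minimal sketch, not part of the dataset's own tooling: the function name `pair_webinar_files` and the local directory passed to it are hypothetical, and the code deliberately does not parse the transcript contents, because the exact timestamp format is not stated here.

```python
from pathlib import Path


def pair_webinar_files(root):
    """Map each webinar ID to its (mp3_path, txt_path) pair.

    Assumes the files were downloaded to a local directory `root`
    (hypothetical) and follow the ID.mp3 / ID.txt naming scheme.
    IDs that are missing either the audio or the transcript are skipped.
    """
    root = Path(root)
    # Index both file types by their stem, i.e. the webinar ID.
    audio = {p.stem: p for p in root.rglob("*.mp3")}
    text = {p.stem: p for p in root.rglob("*.txt")}
    # Keep only IDs that have both an audio file and a transcript.
    shared_ids = sorted(audio.keys() & text.keys())
    return {webinar_id: (audio[webinar_id], text[webinar_id])
            for webinar_id in shared_ids}
```

Calling `pair_webinar_files("path/to/download")` returns a dictionary keyed by ID, which can then be fed to whatever audio loader and transcript parser the training pipeline uses.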
Provider:
ZeroAgency



