reesjon9/Latin-Audio
Source: Hugging Face. Updated: 2025-12-17. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/reesjon9/Latin-Audio
Resource description:
Vox Classica is a Latin speech corpus of roughly 73 hours of audio, segmented by sentence into short clips. It is a large-scale, ML-ready dataset of human-read Classical Latin, designed to address the absence of a publicly available human-read Latin corpus large enough for model training. The dataset is intended for training and evaluating speech processing models for Classical Latin, primarily Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models. Each example contains a transcription, automatically macronized with the CLTK macronizer to mark vowel length, and the corresponding audio in mp3 format. To build the dataset, long-form recordings and their texts were segmented into short clips: a whisper-large-v3 model, described as "untrained" (presumably meaning not fine-tuned for Latin), produced rough transcripts; fuzzy string matching located the start and end timestamps of each sentence; and the curator manually verified and adjusted the alignments.
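The alignment step described above can be illustrated with a minimal sketch. This is not the curator's code: it assumes the rough transcript is available as a list of word-level `(token, start_sec, end_sec)` tuples, uses Python's standard `difflib.SequenceMatcher` as a stand-in for whichever fuzzy matching method was actually used, and the names `normalize`, `align_sentence`, and `slack` are hypothetical.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(text):
    """Lowercase, strip macrons/diacritics and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    kept = [c for c in decomposed if c.isalpha() or c.isspace()]
    return " ".join("".join(kept).split())


def align_sentence(sentence, words, slack=2):
    """Find the window of transcript words that best matches `sentence`.

    `words` is a list of (token, start_sec, end_sec) tuples (an assumed
    format for the rough transcript). Returns (score, start_sec, end_sec).
    `slack` lets the window be a few words shorter or longer than the
    reference, since the rough transcript may split or merge words.
    """
    target = normalize(sentence)
    n = len(target.split())
    best = (0.0, None, None)
    for i in range(len(words)):
        for length in range(max(1, n - slack), n + slack + 1):
            j = i + length
            if j > len(words):
                break
            candidate = normalize(" ".join(tok for tok, _, _ in words[i:j]))
            score = SequenceMatcher(None, target, candidate).ratio()
            if score > best[0]:
                best = (score, words[i][1], words[j - 1][2])
    return best


# Hypothetical rough transcript with word timestamps (illustrative values).
rough = [
    ("arma", 0.0, 0.4), ("virum", 0.4, 0.8), ("que", 0.8, 1.0),
    ("cano", 1.0, 1.5), ("troiae", 1.6, 2.0), ("qui", 2.0, 2.2),
    ("primus", 2.2, 2.7), ("ab", 2.7, 2.8), ("oris", 2.8, 3.3),
]
score, start, end = align_sentence("Arma virumque canō,", rough)
print(f"clip from {start}s to {end}s (similarity {score:.2f})")
```

Normalizing away macrons and punctuation before comparison keeps the macronized reference text comparable to a rough transcript that lacks them. A low best score would be a natural signal for the manual verification pass mentioned above.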
Provider:
reesjon9



