juanjucm/FLEURS-SpeechT-GL-EN
Source: Hugging Face | Updated: 2024-12-18 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/juanjucm/FLEURS-SpeechT-GL-EN
Description:
FLEURS-SpeechT-GL-EN is a Galician-to-English dataset for the speech translation task. It contains about 10 h 11 min of Galician audio along with text transcriptions and the corresponding English translations. The dataset is built from Google's FLEURS speech dataset by aligning its English and Galician data. A quality estimation model was additionally applied to assess the English translations, giving an average score of 0.76. The dataset is organized into train, validation, and test splits, each containing features such as id, audio, Galician text, and English text.
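The split layout described above could be inspected with a minimal sketch like the one below. It assumes the Hugging Face `datasets` library is installed and that the repository is reachable on the Hub under the same id as in the download link. The helper names `load_splits` and `describe` are hypothetical, introduced only for illustration. The exact column names are not stated in this listing, so the sketch prints whatever the dataset reports rather than assuming them.

```python
from typing import Any, Dict, List

# Repository id as given in the listing; split names as described there.
DATASET_ID = "juanjucm/FLEURS-SpeechT-GL-EN"
SPLITS = ("train", "validation", "test")


def load_splits() -> Dict[str, Any]:
    """Download each split from the Hub (needs network access).

    Assumption: the third-party `datasets` package is installed.
    """
    from datasets import load_dataset  # imported lazily so the helpers work without it

    return {split: load_dataset(DATASET_ID, split=split) for split in SPLITS}


def describe(splits: Dict[str, Any]) -> List[str]:
    """Return one summary line per split: name, row count, column names."""
    return [
        f"{name}: {ds.num_rows} rows, columns={list(ds.column_names)}"
        for name, ds in splits.items()
    ]
```

Calling `print("\n".join(describe(load_splits())))` would list the row count and feature names of each split, which is a quick way to confirm the id, audio, and text columns before training a model.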
Provider:
juanjucm



