typhoon-ai/TVSpeech
Source: Hugging Face | Updated: 2026-01-21 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/typhoon-ai/TVSpeech
Description:
TVSpeech is a Thai speech recognition benchmark dataset designed as a Robustness Track for evaluating automatic speech recognition (ASR) models on real-world, in-the-wild Thai audio. It consists of 570 utterances (3.75 hours in total) curated from public media channels on YouTube that publish under the Creative Commons Attribution (CC-BY) license, covering content categories such as finance, technology, and a variety of vlogs. The samples are chosen for the acoustic and semantic complexity of natural speech, with an emphasis on low-frequency vocabulary: domain-specific terminology, proper names, and technical jargon. All utterances are manually transcribed and sampled at 16 kHz. The dataset contains only a test split and is intended for measuring how ASR models perform under noisy conditions and complex semantics.
Provided by:
typhoon-ai
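
Because the dataset ships only a test split with manual transcripts, a typical use is to score model output against those references. The listing does not say which metric the benchmark reports, so the sketch below assumes character error rate (CER), a common choice for Thai because written Thai does not separate words with spaces. The function names and the `(reference, hypothesis)` pair format are hypothetical and are not taken from the dataset card.

```python
# Minimal CER sketch. Assumptions (not from the dataset card): the metric is
# character error rate, and transcripts arrive as (reference, hypothesis) pairs.

def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # deleting the first i reference characters
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def corpus_cer(pairs) -> float:
    """Corpus-level CER: total character edits divided by total reference characters.

    Whitespace is removed from both sides first, since spacing in Thai text
    is largely stylistic. Thai vowel and tone marks are separate code points,
    so each one counts as a character here.
    """
    total_edits = 0
    total_chars = 0
    for ref, hyp in pairs:
        ref = "".join(ref.split())
        hyp = "".join(hyp.split())
        total_edits += edit_distance(ref, hyp)
        total_chars += len(ref)
    if total_chars == 0:
        raise ValueError("no reference characters to score against")
    return total_edits / total_chars
```

To apply this to TVSpeech, each pair would hold one utterance's manual transcript and the model's output for the same audio. Summing edits over the corpus before dividing keeps long utterances from being underweighted relative to short ones.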



