five

Word Error Rate (WER) under noisy conditions.

收藏
Figshare2026-01-12 更新2026-04-28 收录
下载链接:
https://figshare.com/articles/dataset/_p_Word_Error_Rate_WER_under_noisy_conditions_p_/31051604
下载链接
链接失效反馈
官方服务:
资源简介:
Spoken Question Answering (SQA) extends machine reading comprehension to spoken content and requires models to handle both automatic speech recognition (ASR) errors and downstream language understanding. Although large-scale SQA benchmarks exist for high-resource languages, Vietnamese remains underexplored due to the lack of standardized datasets. This paper introduces ViSQA, the first benchmark for Vietnamese Spoken Question Answering. ViSQA extends the UIT-ViQuAD corpus using a reproducible text-to-speech and ASR pipeline, resulting in over 13,000 question–answer pairs aligned with spoken inputs. The dataset includes clean and noise-degraded audio variants to enable systematic evaluation under varying transcription quality. Experiments with five transformer-based models show that ASR errors substantially degrade performance (e.g., ViT5 EM: 62.04% 36.30%), while training on spoken transcriptions improves robustness (ViT5 EM: 36.30% 50.70%). ViSQA provides a rigorous benchmark for evaluating Vietnamese SQA systems and enables systematic analysis of the impact of ASR errors on downstream reasoning.
创建时间:
2026-01-12
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作