Word Error Rate (WER) under noisy conditions.

Source: Figshare. Updated 2026-01-12. Indexed 2026-04-28.

Download link:

https://figshare.com/articles/dataset/_p_Word_Error_Rate_WER_under_noisy_conditions_p_/31051604


Description:

Spoken Question Answering (SQA) extends machine reading comprehension to spoken content and requires models to handle both automatic speech recognition (ASR) errors and downstream language understanding. Although large-scale SQA benchmarks exist for high-resource languages, Vietnamese remains underexplored due to the lack of standardized datasets. This paper introduces ViSQA, the first benchmark for Vietnamese Spoken Question Answering. ViSQA extends the UIT-ViQuAD corpus using a reproducible text-to-speech and ASR pipeline, resulting in over 13,000 question–answer pairs aligned with spoken inputs. The dataset includes clean and noise-degraded audio variants to enable systematic evaluation under varying transcription quality. Experiments with five transformer-based models show that ASR errors substantially degrade performance (e.g., ViT5 exact match (EM) drops from 62.04% to 36.30%), while training on spoken transcriptions improves robustness (ViT5 EM rises from 36.30% to 50.70%). ViSQA provides a rigorous benchmark for evaluating Vietnamese SQA systems and enables systematic analysis of the impact of ASR errors on downstream reasoning.
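For readers unfamiliar with the metric named in the title, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR output, divided by the number of reference words. The listing does not say which tool or text normalization the authors used, so the following is only a minimal sketch under assumptions: the function name `word_error_rate` is hypothetical, tokens are split on whitespace, and no case or punctuation normalization is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Assumption: whitespace tokenization, no normalization. The dataset
    authors may have used different preprocessing.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentence pair (not taken from the dataset):
# one substituted word out of five reference words.
print(word_error_rate("hà nội là thủ đô", "hà nội là thủ đo"))
```

Because insertions count as errors, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.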

Created:

2026-01-12
