nrl-ai/vn-diacritic-eval
Source: Hugging Face. Updated: 2026-04-30. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/nrl-ai/vn-diacritic-eval
Description:
A reproducible evaluation set for Vietnamese diacritic restoration, covering four registers of Vietnamese text: modern business/contracts/news, formal/legal prose, conversational, and classical literary. The dataset is stored as (input, target) pairs, where input is the diacritic-stripped form and target is the correctly diacriticized original. It is used to compare the performance of diacritic-restoration models on a register-balanced grid.
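
The (input, target) pairing can be illustrated with a minimal sketch. The listing does not say how the stripped inputs were produced or which metric the evaluation uses, so both functions below are assumptions: `strip_diacritics` shows one common way to derive an input from a target using Unicode decomposition, and `word_accuracy` is a hypothetical word-level exact-match score, not the dataset's official metric.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Derive a diacritic-free 'input' from a diacriticized 'target' (assumed method)."""
    # đ/Đ have no canonical decomposition, so they are mapped by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark (tone marks, circumflex, breve, horn).
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def word_accuracy(prediction: str, target: str) -> float:
    """Hypothetical metric: share of target words restored exactly, by position."""
    # Normalize to NFC so composed and decomposed spellings compare equal.
    pred_words = unicodedata.normalize("NFC", prediction).split()
    target_words = unicodedata.normalize("NFC", target).split()
    if not target_words:
        return 0.0
    # zip stops at the shorter sequence; missing words count as errors.
    correct = sum(p == t for p, t in zip(pred_words, target_words))
    return correct / len(target_words)


target = "Tiếng Việt"
model_input = strip_diacritics(target)      # what a restoration model would receive
score = word_accuracy("Tieng Việt", target)  # one of two words restored correctly
```

Restoration models typically preserve word count, which is why a positional comparison is used here; a model that inserts or drops words would need an alignment step instead.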
Provider:
nrl-ai



