ZurichNLP/SimpEvalDE
Source: Hugging Face. Updated 2026-01-10; indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/ZurichNLP/SimpEvalDE
Description:
SimpEvalDE is a composite German text simplification evaluation dataset assembled to train and evaluate the DETECT metric. It builds on the proprietary APA-LHA and DePLAIN-APA corpora, access to which must be requested from their respective authors. The dataset is created in four steps: (1) merging the proprietary sources, (2) joining them with manual categorization and alignment, (3) adding six automatic text simplification (ATS) generations per row, and (4) adding LLM-based metric supervision and human grades. It has a train split of 600 rows and a test split of 360 rows. Each row contains the original sentence, the reference simplifications, one ATS simplification, the ID of the generator model, LLM supervision scores, and, in the test split only, human grades.
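The row layout described above can be sketched as a small Python schema with a consistency check. This is an illustration only: the class name, every field name, and the helper `check_row` are hypothetical and are not taken from the dataset itself, so the real column names should be read from the dataset card. The check also assumes that every test row carries human grades, which the description implies but does not state outright.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Split sizes as stated in the description.
EXPECTED_ROWS = {"train": 600, "test": 360}


@dataclass
class SimpEvalRow:
    """One row as described in the text. All field names are hypothetical."""
    original: str                       # original (complex) German sentence
    references: List[str]               # reference simplifications
    simplification: str                 # one ATS simplification
    generator_id: str                   # ID of the model that produced it
    llm_scores: Dict[str, float] = field(default_factory=dict)  # LLM supervision
    human_grades: Optional[Dict[str, float]] = None             # test split only


def check_row(row: SimpEvalRow, split: str) -> None:
    """Raise ValueError if a row contradicts the description of its split."""
    if split not in EXPECTED_ROWS:
        raise ValueError(f"unknown split: {split!r}")
    if not row.references:
        raise ValueError("a row needs at least one reference simplification")
    if split == "train" and row.human_grades is not None:
        raise ValueError("human grades are described as test-only")
    # Assumption: every test row has human grades.
    if split == "test" and row.human_grades is None:
        raise ValueError("test rows are expected to carry human grades")
```

A made-up row (placeholder strings, not real data) passes the check for the train split and fails it for the test split, because it has no human grades:

```python
row = SimpEvalRow(
    original="<original sentence>",
    references=["<reference simplification>"],
    simplification="<ATS output>",
    generator_id="<model id>",
)
check_row(row, "train")  # no exception
```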
Provider:
ZurichNLP



