TextComplexityDE1
Source: arXiv | Updated: 2019-04-16 | Indexed: 2024-06-21
Download link:
http://tiny.cc/mq643y
Description:
The TextComplexityDE1 dataset was created by the Quality and User Experience Lab at Technische Universität Berlin. It contains 1,000 German sentences taken from 23 Wikipedia articles in three domains: history, society, and science. The dataset is intended for developing text complexity prediction models and automatic text simplification systems for German. It includes subjective ratings of several aspects of text complexity given by learners of German, as well as simplified versions of 250 of the sentences, written by native speakers, together with subjective ratings of those simplifications. The data was collected through laboratory studies and crowdsourcing. The dataset mainly addresses problems of text complexity and readability, in particular helping non-native speakers and readers with reading disabilities understand complex texts.
Provider:
Quality and User Experience Lab, Technische Universität Berlin
Created:
2019-04-16



