
Development of an Adaptive Text Summarization System Based on the T5 Transformer Model for Students with Dyslexia

Mendeley Data, indexed 2026-07-04
Download link:
https://data.mendeley.com/datasets/zv9d4x5vsp
Description:
This dataset contains 110 Indonesian text samples developed to support research in automatic sign language translation, text simplification, and natural language processing. Each sample pairs an original Indonesian text with simplified target texts at three levels: Easy, Medium, and Hard. The Easy version uses simple vocabulary and short sentences to aid understanding. The Medium version simplifies moderately and retains more contextual information. The Hard version preserves the original meaning and sentence structure with minimal modifications. The dataset was compiled and annotated manually to simulate the linguistic transformations commonly required in sign language translation systems, where complex sentence structures are simplified while semantic accuracy is maintained. Each record includes an identifier, the original text, the simplified target texts, source page information, and relevant keywords. Intended applications include text simplification, machine translation, sign language translation, accessibility technologies for deaf and hard-of-hearing communities, and the development of transformer-based language models such as T5, mT5, BART, and IndoBART. The dataset is provided in Microsoft Excel (.xlsx) format and is aimed at researchers working in Indonesian language processing, educational technology, and assistive communication systems.
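The record layout described above (one original text with Easy, Medium, and Hard targets) maps naturally onto source/target pairs for a sequence-to-sequence model such as T5. The following is a minimal sketch of that conversion, not the dataset authors' pipeline. The column names (`original_text`, `target_easy`, `target_medium`, `target_hard`), the `simplify <level>:` task prefix, and the sample record are all assumptions made for illustration; the actual headers in the .xlsx file may differ.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Simplification levels named in the dataset description.
LEVELS = ("easy", "medium", "hard")


def to_seq2seq_pairs(
    records: Iterable[Dict[str, Optional[str]]]
) -> List[Tuple[str, str]]:
    """Turn each record into up to three (input, target) training pairs.

    Column names are hypothetical; adjust them to the real spreadsheet headers.
    """
    pairs: List[Tuple[str, str]] = []
    for rec in records:
        source = (rec.get("original_text") or "").strip()
        if not source:
            continue  # skip records without an original text
        for level in LEVELS:
            target = (rec.get(f"target_{level}") or "").strip()
            if not target:
                continue  # skip levels that are missing or empty
            # Assumed task prefix so one model can learn all three levels.
            pairs.append((f"simplify {level}: {source}", target))
    return pairs


# Placeholder record for illustration only; it is not taken from the dataset.
sample = [
    {
        "id": "1",
        "original_text": "Teks asli.",
        "target_easy": "Teks mudah.",
        "target_medium": "Teks sedang.",
        "target_hard": "",
    }
]

pairs = to_seq2seq_pairs(sample)
```

Rows loaded from the spreadsheet (for example as a list of dictionaries) could be passed to `to_seq2seq_pairs` in the same way once the key names match the actual columns.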
Created:
2026-06-11