SauLTC genres.

Source: Figshare. Updated 2024-10-23; indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/SauLTC_genres_/27287135
Description:
This article introduces the Saudi Learner Translation Corpus (SauLTC), an innovative multi-version English–Arabic parallel corpus featuring part-of-speech tagging. We describe the corpus parameters and compilation process and explain how textual processing and sentence alignment are conducted. The participants include 366 student translators, 48 instructors, and 23 alignment verifiers. The corpus provides two target versions of every source text (ST), so that changes made during translation and revision can be traced from the initial to the final draft. The translations were collected over three years, yielding 5,160,386 tokens. The metadata of the 23 sentence alignment verifiers were added to the analysis as a unique variable to investigate individual differences in the manual verification process. This unidirectional corpus can be used to identify student translators’ strategies and errors in translation and to analyze the efficacy of instructors’ feedback. Furthermore, it is accessible via an application and a website. It provides translation teachers and researchers with a database that can help develop corpus-based and corpus-driven teaching materials.
Created:
2024-10-23