Turkish Procedural Language Understanding (PLU) Corpus

Source: arXiv | Updated: 2024-03-07 | Indexed: 2024-06-21
Download link:
https://github.com/GGLAB-KU/turkish-plu
Resource overview:
The Turkish Procedural Language Understanding (PLU) Corpus was created jointly by Johns Hopkins University and Koç University to address the challenges of understanding procedural text in low-resource languages. It contains more than 52,000 tutorials across a wide range of domains, including computers and electronics, health, and hobbies and crafts. The corpus was built by scraping tutorials from the original Turkish wikiHow and by translating English tutorials with automated translation tools. It supports six downstream tasks: linking actions, goal inference, step inference, step ordering, next event prediction, and summarization. These tasks are intended to evaluate and improve model performance on procedural language understanding.
Provider:
Department of Computer Science, Johns Hopkins University
Created:
2023-09-13