
Persian Causality Corpus (PerCause)

Source: arXiv. Updated 2021-06-27. Indexed 2024-08-06.
Download link:
http://arxiv.org/abs/2106.14165v1
Resource overview:
The Persian Causality Corpus (PerCause) is a causality-annotated corpus for Persian, developed by the Natural Language Processing Laboratory at Sharif University of Technology. It contains 4,446 sentences and 5,128 causal relations, with the elements of each relation labeled as cause, effect, or causal marker. PerCause is intended as a comprehensive causality resource for Persian that can be used to train causality detection systems. The sentences are drawn from the Bijankhan corpus and from general books and cover a wide range of causal relation types, but complex cases such as implicit or nested causal relations are excluded. Target applications are natural language processing tasks such as textual entailment recognition, question answering, event prediction, and narrative extraction, with the aim of addressing the challenges of causal relation recognition in Persian.
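The three labels described above (cause, effect, causal marker) map naturally onto a span-based record that can be converted to token-level tags for training a sequence tagger. The following is a minimal sketch under stated assumptions, not the corpus's actual file format: the class names `CausalRelation` and `AnnotatedSentence`, the helper `to_bio_tags`, the BIO tag scheme, and the English example sentence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Half-open token index range [start, end). Assumed representation,
# not taken from the PerCause release.
Span = Tuple[int, int]


@dataclass
class CausalRelation:
    cause: Span
    effect: Span
    marker: Optional[Span] = None  # assumed optional in this sketch


@dataclass
class AnnotatedSentence:
    tokens: List[str]
    relations: List[CausalRelation]


def to_bio_tags(sentence: AnnotatedSentence, relation: CausalRelation) -> List[str]:
    """Convert one relation into BIO tags over the sentence tokens."""
    tags = ["O"] * len(sentence.tokens)
    spans = [("CAUSE", relation.cause), ("EFFECT", relation.effect)]
    if relation.marker is not None:
        spans.append(("MARKER", relation.marker))
    for label, (start, end) in spans:
        for i in range(start, end):
            prefix = "B-" if i == start else "I-"
            tags[i] = prefix + label
    return tags


# Illustrative English sentence; it is NOT drawn from the corpus.
example = AnnotatedSentence(
    tokens=["The", "road", "flooded", "because", "it", "rained"],
    relations=[CausalRelation(cause=(4, 6), effect=(0, 3), marker=(3, 4))],
)
tags = to_bio_tags(example, example.relations[0])
for token, tag in zip(example.tokens, tags):
    print(f"{token}\t{tag}")
```

Tagging one relation at a time keeps sentences with several causal relations unambiguous, since each relation yields its own tag sequence.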
Provider:
Natural Language Processing Laboratory, Sharif University of Technology
Created:
2021-06-27