
Kurdish Summarization Dataset (v2)

Source: DataCite Commons. Updated: 2025-05-01. Indexed: 2025-05-17.
Download link:
https://data.mendeley.com/datasets/gczpg2cnxy
Description:
The first Kurdish summarization dataset is a collection of summaries of more than 40,000 news articles and headlines written in the Sorani dialect of Kurdish. The articles cover politics, economics, sports, religion, science, society, art, and health. The dataset was created to support the development and improvement of machine learning algorithms and natural language processing systems for the summarization task in Kurdish. The summaries were written by human annotators. Each summary is a condensed version of the original article and headline that captures its key information and main points concisely. Researchers and developers can use the dataset to train and evaluate summarization models for Kurdish, which can lead to more accurate and effective summarization tools. The dataset contributes to natural language processing technology for Kurdish and can open new avenues for research in the field.
Provider:
Mendeley
Created:
2023-04-13