A dataset of Chinese-Mongolian-Tibetan-Uyghur for multi-document summarization

Name: A dataset of Chinese-Mongolian-Tibetan-Uyghur for multi-document summarization
Creator: Science Data Bank
Published: 2025-04-27 22:59:20
License: No description available

Source: DataCite Commons. Updated: 2025-04-27. Indexed: 2025-04-16.

Download link:

https://www.scidb.cn/detail?dataSetId=d68a068858cd4476a9df02ea0e3ff646

Description:

This dataset comprises multi-document summaries in Chinese, Mongolian, Tibetan, and Uyghur, with the data closely aligned across the four languages. Each language includes 1044 clusters of news articles, totaling 6234 news articles. Each article is stored in a TXT file, and the name of each news event and its corresponding cluster summary are saved in an XLSX spreadsheet. The multi-document summary data for each language are stored in a separate compressed file.
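
The layout described above can be read with a short loader. The following is a minimal sketch, not the dataset's official tooling: it assumes one language archive has already been extracted and that each cluster's TXT articles sit in their own subdirectory, which the description does not confirm. The function name `load_clusters` and the UTF-8 encoding are likewise assumptions for illustration.

```python
from collections import defaultdict
from pathlib import Path


def load_clusters(root, encoding="utf-8"):
    """Group article texts by parent directory name.

    ASSUMPTION: each cluster of news articles lives in its own
    subdirectory of `root`; the real archive layout may differ.
    """
    clusters = defaultdict(list)
    # Walk every TXT file under root in a stable, sorted order.
    for txt_path in sorted(Path(root).rglob("*.txt")):
        text = txt_path.read_text(encoding=encoding)
        clusters[txt_path.parent.name].append(text)
    return dict(clusters)
```

Calling `load_clusters` on an extracted language directory would return a dictionary mapping each assumed cluster directory name to the list of its article texts. The event names and cluster summaries in the XLSX spreadsheet would need a separate spreadsheet reader and are not handled here.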

Provider:

Science Data Bank

Created:

2024-05-21
