umanlp/xscitldr
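Dataset Overview
Name: X-SCITLDR
Full name: Cross-Lingual Extreme Summarization of Scholarly Documents
Purpose: To address the information overload caused by the rapid growth in scientific publications, this dataset targets cross-lingual extreme summarization of scholarly documents, supporting summary generation from English into four other languages.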
Supported Languages
- German
- Italian
- Chinese
- Japanese
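Dataset Features
- Provides cross-lingual summarization, supporting summary generation from English into German, Italian, Chinese, and Japanese.
- Models are trained and evaluated on top of recent multilingual pre-trained models.
- Explores model performance in zero-shot and few-shot learning scenarios.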
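The four summarization directions and the zero-shot versus few-shot distinction can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the language codes, the record fields (`document`, `summary`), the helper names, and the toy data are hypothetical and are not taken from the dataset or from the authors' experimental protocol.

```python
import random

SOURCE_LANG = "en"
# Assumed ISO 639-1 codes for German, Italian, Chinese, and Japanese.
TARGET_LANGS = ["de", "it", "zh", "ja"]


def language_pairs():
    """Return the four (source, target) directions named in the card."""
    return [(SOURCE_LANG, tgt) for tgt in TARGET_LANGS]


def sample_few_shot(pool, k, seed=0):
    """Pick k training examples; k=0 corresponds to the zero-shot setting."""
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return rng.sample(pool, min(k, len(pool)))


# Toy, made-up records standing in for real (document, summary) pairs.
toy_pool = [
    {"document": f"English abstract {i}", "summary": f"Zusammenfassung {i}"}
    for i in range(10)
]

for src, tgt in language_pairs():
    print(f"{src} -> {tgt}")

zero_shot = sample_few_shot(toy_pool, k=0)  # no target-language supervision
few_shot = sample_few_shot(toy_pool, k=3)   # a handful of labelled examples
print(len(zero_shot), len(few_shot))
```

In this sketch the zero-shot case is simply the few-shot case with an empty training subset, which keeps the two settings on a single code path.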
Citation
@inproceedings{takeshita-etal-2022-xsci,
  author = {Takeshita, Sotaro and Green, Tommaso and Friedrich, Niklas and Eckert, Kai and Ponzetto, Simone Paolo},
  title = {X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents},
  year = {2022},
  isbn = {9781450393454},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3529372.3530938},
  doi = {10.1145/3529372.3530938},
  abstract = {Describes the purpose and functionality of the dataset in detail.},
  booktitle = {Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries},
  articleno = {4},
  numpages = {12},
  keywords = {scholarly document processing, summarization, multilinguality},
  location = {Cologne, Germany},
  series = {JCDL 22}
}



