nglaura/koreascience-summarization
Source: Hugging Face | Updated: 2023-04-11 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/nglaura/koreascience-summarization
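Resource overview:
KoreaScience is a dataset for summarizing Korean-language research papers, and it includes layout information. Its data fields are the article ID, the article words, the word bounding boxes, the normalized bounding boxes, the abstract, and the PDF URL. The dataset is divided into training, validation, and test sets containing 35,248, 1,125, and 1,125 instances, respectively.
Provider: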
nglaura
Summary of original information
LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization
KoreaScience Dataset for Summarization
Overview
- Language: Korean
- Purpose: Summarization of research papers with layout information
- Collaborators: reciTAL, MLIA (ISIR, Sorbonne Université), Meta AI, Università di Trento
Data Fields
- article_id: Unique identifier for the article
- article_words: Sequence of words in the article
- article_bboxes: Sequence of corresponding word bounding boxes
- norm_article_bboxes: Sequence of normalized word bounding boxes
- abstract: Abstract of the article
- article_pdf_url: URL to the article's PDF
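The fields above imply that each word has one raw and one normalized bounding box. The following is a minimal sketch of how a record might be checked for that alignment. The field names come from the list above; everything else is an assumption made for illustration: the helper names (`check_record`, `normalize_bbox`, `load_split`), the `(x0, y0, x1, y1)` box layout, and the 0 to 1000 normalization scale, none of which is stated on this page. `load_split` uses the `load_dataset` function from the Hugging Face `datasets` library and is not called here because it needs network access; depending on the installed version, extra arguments may be required.

```python
from typing import Dict, List, Sequence

# Field names as documented in the Data Fields list.
FIELDS = (
    "article_id", "article_words", "article_bboxes",
    "norm_article_bboxes", "abstract", "article_pdf_url",
)

def check_record(record: Dict) -> int:
    """Check that every documented field is present and that the word and
    bounding-box sequences have the same length. Returns the word count."""
    missing = [name for name in FIELDS if name not in record]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    n_words = len(record["article_words"])
    for key in ("article_bboxes", "norm_article_bboxes"):
        if len(record[key]) != n_words:
            raise ValueError(
                f"{key} has {len(record[key])} entries, expected {n_words}"
            )
    return n_words

def normalize_bbox(bbox: Sequence[int], page_width: int, page_height: int,
                   scale: int = 1000) -> List[int]:
    """Hypothetical normalization: rescale an (x0, y0, x1, y1) box to the
    range 0..scale. The scheme actually used by the dataset is not stated
    on this page."""
    x0, y0, x1, y1 = bbox
    return [
        scale * x0 // page_width, scale * y0 // page_height,
        scale * x1 // page_width, scale * y1 // page_height,
    ]

def load_split(split: str = "train"):
    """Load one split from the Hugging Face Hub (needs `pip install datasets`
    and network access)."""
    from datasets import load_dataset
    return load_dataset("nglaura/koreascience-summarization", split=split)

# Invented sample record, used only to exercise the helpers.
raw_box = [60, 80, 120, 100]
sample = {
    "article_id": "example-0001",
    "article_words": ["요약"],
    "article_bboxes": [raw_box],
    "norm_article_bboxes": [normalize_bbox(raw_box, 600, 800)],
    "abstract": "An invented abstract.",
    "article_pdf_url": "https://example.org/paper.pdf",
}
word_count = check_record(sample)
```

Running the same check over records returned by `load_split` would be one way to confirm that the three sequences stay aligned across a split.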
Data Splits
- Train: 35,248 instances
- Validation: 1,125 instances
- Test: 1,125 instances
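To put the split sizes in proportion, the short sketch below computes each split's share of the whole dataset from the counts listed above. The names `SPLITS` and `split_shares` are illustrative only.

```python
from typing import Dict

# Instance counts as listed under Data Splits.
SPLITS = {"train": 35_248, "validation": 1_125, "test": 1_125}

def split_shares(splits: Dict[str, int]) -> Dict[str, float]:
    """Return each split's fraction of the total number of instances."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}

total = sum(SPLITS.values())
shares = split_shares(SPLITS)
for name, share in shares.items():
    print(f"{name}: {SPLITS[name]:,} of {total:,} instances ({share:.1%})")
```

With these counts the dataset holds 37,498 instances in total, of which roughly 94% are in the training split and about 3% each in validation and test.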
License
- License: Apache-2.0



