NoticIA
Source: arXiv. Updated: 2024-04-11. Indexed: 2024-06-21.
Download link:
https://hf.co/datasets/Iker/NoticIA
Overview:
The NoticIA dataset was created by the HiTZ Basque Center for Language Technology (Ixa), University of the Basque Country UPV/EHU. It contains 850 Spanish news articles with clickbait headlines, each paired with a high-quality, human-written, single-sentence summary. The task tests a model's text understanding and summarization ability: from a large amount of irrelevant text, the model must extract the key information that satisfies the curiosity raised by the clickbait headline. The dataset can be used to evaluate and train large language models (LLMs) on Spanish text understanding and summarization.
Provider:
HiTZ Basque Center for Language Technology (Ixa), University of the Basque Country UPV/EHU
Created:
2024-04-11
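
Usage sketch:
The task described in the overview (answering a clickbait headline with a one-sentence summary) can be illustrated with a small Python sketch. It assumes the Hugging Face `datasets` library is installed and that the dataset ID `Iker/NoticIA` from the download link is loadable with `load_dataset`. The prompt wording and the `build_prompt` and `load_noticia` helpers are hypothetical, not part of the dataset or its paper, and the real column names should be checked on the dataset card before indexing into records.

```python
DATASET_ID = "Iker/NoticIA"  # taken from the download link above


def build_prompt(headline: str, body: str) -> str:
    """Format one article as a summarization prompt.

    The prompt wording is illustrative only; it is not the prompt
    used by the dataset authors.
    """
    return (
        "The following news article has a clickbait headline. "
        "Answer the question raised by the headline in a single sentence.\n\n"
        f"Headline: {headline.strip()}\n"
        f"Article: {body.strip()}\n"
        "Summary:"
    )


def load_noticia():
    """Load the dataset from the Hugging Face Hub.

    Requires network access and `pip install datasets`. The import is
    done lazily so that build_prompt works without the library.
    """
    from datasets import load_dataset

    return load_dataset(DATASET_ID)


if __name__ == "__main__":
    # Made-up placeholder text, not a record from the dataset.
    print(build_prompt("¿Qué pasó?", "Texto del artículo."))
```

Calling `load_noticia()` returns the dataset splits; inspect one record first to see which columns hold the headline, the article body, and the reference summary before passing them to `build_prompt`.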



