aisingapore/nlg-abstractive_summarization

Name: aisingapore/nlg-abstractive_summarization
Creator: aisingapore
Published: 2024-12-20 02:01:14
License: No description available

Source: Hugging Face. Updated 2024-12-20. Indexed 2024-12-21.

Download link:

https://hf-mirror.com/datasets/aisingapore/nlg-abstractive_summarization

Overview:

The SEA Abstractive Summarization dataset evaluates a model's ability to read a document, identify its key points, and summarize them in coherent, fluent text while paraphrasing the source. It is sampled from XL-Sum and covers Indonesian, Tamil, Thai, and Vietnamese. The dataset is intended for evaluating chat or instruction-tuned large language models (LLMs) and is part of the SEA-HELM leaderboard. It is split by language and includes additional few-shot example splits. The statistics for each split include the number of examples and token counts for different models, such as GPT-4o, Gemma 2, and Llama 3. The dataset is licensed under CC BY-NC-SA 4.0 and its data is sourced from XL-Sum, with the aim of ensuring that the data used is permissible and excludes copyrighted or disputed material.
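To illustrate how few-shot example splits are typically used when evaluating a chat or instruction-tuned LLM on summarization, here is a minimal sketch that assembles a prompt from a handful of examples. It is not the SEA-HELM prompt format. The function name `build_fewshot_prompt`, the field names `document` and `summary`, the instruction wording, and the sample strings are all assumptions made for illustration; check the dataset card for the actual schema and prompt template.

```python
from typing import Dict, List


def build_fewshot_prompt(
    fewshot_examples: List[Dict[str, str]],
    document: str,
    language: str,
) -> str:
    """Assemble a few-shot summarization prompt as a single string.

    The "document" and "summary" keys are hypothetical field names,
    not the dataset's confirmed schema.
    """
    parts = [f"Summarize the following {language} document in {language}."]
    for example in fewshot_examples:
        # Each solved example shows the model the expected input/output shape.
        parts.append(
            f"Document: {example['document']}\nSummary: {example['summary']}"
        )
    # The final block leaves the summary empty for the model to complete.
    parts.append(f"Document: {document}\nSummary:")
    return "\n\n".join(parts)


# Made-up placeholder strings, not taken from the dataset.
fewshot = [{"document": "Teks contoh pertama.", "summary": "Ringkasan contoh."}]
prompt = build_fewshot_prompt(fewshot, "Teks yang akan diringkas.", "Indonesian")
print(prompt)
```

Keeping the few-shot examples in a separate split from the evaluation examples, as the dataset does, avoids showing the model a document it will later be scored on.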

Provider:

aisingapore