BAMBOO

Source: arXiv. Updated 2024-03-19. Indexed 2024-06-21.
Download link:
https://github.com/RUCAIBox/BAMBOO
Description:
BAMBOO is a multi-task benchmark for comprehensively evaluating the long-text modeling capabilities of large language models. Created by the Gaoling School of Artificial Intelligence at Renmin University of China, it comprises 10 datasets drawn from 5 long-text understanding tasks: question answering, hallucination detection, text sorting, language modeling, and code completion. Across these tasks and domains, BAMBOO is intended to assess language generation, knowledge utilization, reasoning, and tool manipulation. The benchmark is split into two subsets, BAMBOO-4k and BAMBOO-16k, which correspond to two input length levels. Its design follows four principles: comprehensive capability evaluation, avoidance of data contamination, accurate automatic evaluation, and coverage of different length levels.
Provided by:
Gaoling School of Artificial Intelligence
Created:
2023-09-23