
TRACE

Source: arXiv. Updated 2023-10-11; indexed 2024-06-21.
Download link:
https://github.com/BeyonderXX/TRACE
Overview:
The TRACE dataset was jointly created by the School of Computer Science and the Institute of Modern Languages and Linguistics at Fudan University. It aims to provide a comprehensive continual learning evaluation benchmark for large language models. The dataset comprises 8 independent sub-datasets covering several challenging areas: domain-specific tasks, multilingual capabilities, code generation, and mathematical reasoning. For each task, 5,000 training instances and 2,000 test instances are randomly sampled so that the tasks are balanced in size. TRACE can be used to evaluate how well a model performs on specific tasks while retaining its original capabilities, and to study catastrophic forgetting under incremental training.
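
The per-task sampling described above can be illustrated with a minimal sketch. This is not the TRACE authors' preprocessing script: the function name `sample_split`, the fixed seed, and the toy `pool` are hypothetical, and the sketch assumes the training and test subsets are drawn without overlap from a single pool of instances, which the description does not state.

```python
import random


def sample_split(instances, n_train=5000, n_test=2000, seed=0):
    """Draw disjoint, fixed-size train and test subsets from one task's instances.

    Hypothetical helper for illustration; the sizes match the figures quoted
    in the dataset description.
    """
    if len(instances) < n_train + n_test:
        raise ValueError("not enough instances for the requested split sizes")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    picked = rng.sample(instances, n_train + n_test)  # sampling without replacement
    return picked[:n_train], picked[n_train:]


# Toy stand-in for one task's raw instance pool (assumed data, not TRACE content).
pool = [{"id": i} for i in range(10000)]
train, test = sample_split(pool)
```

Applying the same sizes to every task gives each sub-dataset equal weight in the benchmark, regardless of how large its original source was.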
Provider:
School of Computer Science, Fudan University, Shanghai, China
Created:
2023-10-11