
RAFT

Source: arXiv. Updated 2022-01-19. Indexed 2024-06-21.
Download link:
https://raft.elicit.org/datasets
Resource overview:
RAFT (Real-world Annotated Few-shot Tasks) is a benchmark of 11 naturally occurring classification datasets, designed to mirror the kind of work typically handed to human research assistants. Each task comes with natural-language instructions and labels, along with 50 training examples. The datasets cover a broad range of economically valuable tasks, including hate speech detection, medical case report parsing, and literature review automation. Datasets were selected because their data occurs naturally and has inherent value, and datasets with severely imbalanced classes were not excluded. RAFT aims to expose the difficulties current techniques have with long texts and tasks with many classes, and to drive progress on models for real-world use.
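
As an illustration of the few-shot setup described above, the following is a minimal sketch of how a task's instructions, label set, and a handful of training examples might be combined into a single prompt for a language model. The names `Example` and `build_prompt`, the prompt layout, and the sample texts are all assumptions made for this sketch; they are not taken from RAFT or from any official tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One labeled training example (hypothetical structure)."""
    text: str
    label: str


def build_prompt(instructions: str, labels: List[str],
                 train_examples: List[Example], query_text: str) -> str:
    """Assemble a few-shot classification prompt.

    Layout (an assumption, not a RAFT convention): instructions,
    the allowed labels, each training example, then the unlabeled
    query with an empty label slot for the model to fill.
    """
    lines = [instructions.strip(), "",
             "Possible labels: " + ", ".join(labels), ""]
    for ex in train_examples:
        lines.append(f"Text: {ex.text}")
        lines.append(f"Label: {ex.label}")
        lines.append("")
    lines.append(f"Text: {query_text}")
    lines.append("Label:")
    return "\n".join(lines)


# Made-up data for illustration only; a RAFT task would supply
# its own instructions and 50 training examples.
train = [
    Example("I love this", "positive"),
    Example("Awful", "negative"),
]
prompt = build_prompt("Classify the sentiment.",
                      ["positive", "negative"], train, "Fine")
print(prompt)
```

With 50 training examples and long input texts, a prompt built this way can exceed a model's context window, so a real harness would likely need to truncate texts or select a subset of examples.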
Provider:
Ought
Created:
2021-09-29