tulu-3-do-anything-now-eval
Source: ModelScope community (魔搭社区) | Updated 2025-11-27 | Indexed 2025-05-31
Download link:
https://modelscope.cn/datasets/allenai/tulu-3-do-anything-now-eval
Description:
This data comes from the [Do Anything Now](https://arxiv.org/abs/2308.03825) benchmark.
It is one of the datasets included in both the [Ai2 Safety Evaluation Suite](https://github.com/allenai/safety-eval) and the [Tülu 3](https://arxiv.org/abs/2411.15124v1) evaluation suite.
The repository for Ai2's safety suite includes instructions for evaluating models on various safety-related evaluations, including this one.
Provider:
maas
Created:
2025-05-28



