SafeSora-Eval
Source: ModelScope community. Updated 2025-11-12; indexed 2025-02-08.
Download link:
https://modelscope.cn/datasets/PKU-Alignment/SafeSora-Eval
Resource description:
# Dataset Card for SafeSora
<span style="color: red;">Warning: this dataset contains data that may be offensive or harmful. The data are intended for research purposes, especially research that can make models less harmful. The views expressed in the data do not reflect the views of PKU-Alignment Team or any of its members. </span>
[[`🏠 Project Homepage`](https://sites.google.com/view/safe-sora)]
[[`🤗 SafeSora Datasets`](https://huggingface.co/datasets/PKU-Alignment/SafeSora)]
[[`🤗 SafeSora Label`](https://huggingface.co/datasets/PKU-Alignment/SafeSora-Label)]
[[`🤗 SafeSora Evaluation`](https://huggingface.co/datasets/PKU-Alignment/SafeSora-Eval)]
[[`⭐️ Github Repo`](https://github.com/PKU-Alignment/safe-sora)]
SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs). It currently contains three types of data:
- A classification dataset (SafeSora-Label) of 57k+ text-video pairs, with multi-label annotations over 12 harm labels for both the text prompts and the text-video pairs.
- A human preference dataset (SafeSora) of 51k+ instances from the text-to-video generation task, containing comparisons on helpfulness and harmlessness, as well as on four sub-dimensions of helpfulness.
- An evaluation dataset (SafeSora-Eval) of 600 human-written prompts: 300 safety-neutral prompts and 300 red-team prompts constructed according to 12 harm categories.
In the future, we will also open-source some baseline alignment algorithms that utilize these datasets.
# Evaluation Dataset
The evaluation dataset contains 600 human-written prompts: 300 safety-neutral prompts and 300 red-teaming prompts. The red-teaming prompts are constructed from 12 harm categories. None of these prompts appear in the training set; they are reserved for researchers to generate videos for model evaluation.
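As a rough illustration of the composition described above, the sketch below tallies prompts by split and by harm category. The record layout (`prompt`, `is_red_team`, `harm_category`), the helper `summarize_prompts`, and the placeholder records are all hypothetical and are not taken from the released files; check the actual field names in the dataset before adapting it. The even spread over categories is a property of the placeholder data only.

```python
from collections import Counter


def summarize_prompts(records):
    """Count prompts per split and red-team prompts per harm category.

    Assumes each record is a dict with the hypothetical keys
    "prompt" (str), "is_red_team" (bool) and "harm_category" (str or None).
    """
    split_counts = Counter(
        "red_team" if r["is_red_team"] else "neutral" for r in records
    )
    category_counts = Counter(
        r["harm_category"] for r in records if r["is_red_team"]
    )
    return split_counts, category_counts


# Placeholder records that mimic the stated sizes: 300 neutral prompts and
# 300 red-team prompts over 12 categories. These are NOT real dataset entries.
records = [
    {"prompt": f"neutral prompt {i}", "is_red_team": False, "harm_category": None}
    for i in range(300)
]
records += [
    {
        "prompt": f"red-team prompt {i}",
        "is_red_team": True,
        "harm_category": f"category_{i % 12}",
    }
    for i in range(300)
]

split_counts, category_counts = summarize_prompts(records)
print(dict(split_counts))
print(len(category_counts), "harm categories")
```

A check like this can be run once after download to confirm that the local copy matches the sizes stated on this card before generating videos for evaluation.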
Provider:
maas
Created:
2025-02-07



