text-2-video-human-preferences-veo3
Source: ModelScope community (updated 2025-12-05, indexed 2025-05-31)
Download link:
https://modelscope.cn/datasets/Rapidata/text-2-video-human-preferences-veo3
Resource description:
<style>
.vertical-container {
display: flex;
flex-direction: column;
gap: 60px;
}
.image-container img {
height: 150px; /* Set the desired height */
margin:0;
object-fit: contain; /* Ensures the aspect ratio is maintained */
width: auto; /* Adjust width automatically based on height */
}
.image-container {
display: flex; /* Aligns images side by side */
justify-content: space-around; /* Space them evenly */
align-items: center; /* Align them vertically */
}
.container {
width: 90%;
margin: 0 auto;
}
.text-center {
text-align: center;
}
.score-amount {
margin: 0;
margin-top: 10px;
}
.score-percentage {
font-size: 12px;
font-weight: 600; /* semi-bold; "semi-bold" is not a valid CSS keyword */
}
</style>
# Rapidata Video Generation Veo 3 Human Preference
<a href="https://www.rapidata.ai">
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f5624c42b853e73e0738eb/jfxR79bOztqaC6_yNNnGU.jpeg" width="300" alt="Dataset visualization">
</a>
In this dataset, ~46k human responses from ~20k human annotators were collected to evaluate the Veo 3 video generation model on our benchmark. The data was collected in roughly 35 minutes using the [Rapidata Python API](https://docs.rapidata.ai), which is accessible to anyone and well suited to large-scale data annotation.
Explore our latest model rankings on our [website](https://www.rapidata.ai/benchmark).
If you get value from this dataset and would like to see more in the future, please consider liking it ❤️
# Overview
In this dataset, ~46k human responses from ~20k human annotators were collected in roughly 35 minutes to evaluate the Veo 3 video generation model on our benchmark. The up-to-date benchmark can be viewed on our [website](https://www.rapidata.ai/leaderboard/video-models).
The benchmark data is directly accessible on [Hugging Face](https://huggingface.co/datasets/Rapidata/text-2-video-human-preferences).
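As a rough back-of-the-envelope reading of the collection figures above, the sketch below derives the average number of responses per annotator and the collection throughput. The inputs are the approximate values quoted in the text (~46k, ~20k, ~35 minutes), so the results are only indicative.

```python
# Approximate figures quoted in the overview; all three are rounded values.
responses = 46_000
annotators = 20_000
minutes = 35

# Average number of responses contributed by each annotator.
responses_per_annotator = responses / annotators

# Average number of responses collected per minute.
responses_per_minute = responses / minutes

print(f"{responses_per_annotator:.1f} responses per annotator")
print(f"{responses_per_minute:.0f} responses per minute")
```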
# Explanation of the columns
The dataset contains paired video comparisons. Each entry includes `video1` and `video2` fields, which contain links to downscaled GIFs for easy viewing. The full-resolution videos can be found [here](https://huggingface.co/datasets/Rapidata/text-2-video-human-preferences-veo3/tree/main/videos).
The `weighted_results` column contains scores ranging from 0 to 1, representing aggregated user responses. Individual user responses can be found in the `detailedResults` column.
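To illustrate how individual responses could be aggregated into a score between 0 and 1, here is a minimal sketch. It is not Rapidata's actual aggregation method: the per-response fields `choice` and `weight`, the function name `weighted_shares`, and the sample votes are all hypothetical, and the real layout of `detailedResults` may differ.

```python
from collections import defaultdict


def weighted_shares(responses):
    """Return each video's share of the total annotator weight.

    `responses` is a list of dicts with hypothetical keys:
    "choice" (which video was picked) and "weight" (annotator weight).
    """
    totals = defaultdict(float)
    for response in responses:
        totals[response["choice"]] += response["weight"]

    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no weighted responses to aggregate")

    # Normalise so that the shares of one comparison sum to 1.
    return {video: weight / grand_total for video, weight in totals.items()}


# Hypothetical votes for a single comparison (not taken from the dataset).
example_responses = [
    {"choice": "video1", "weight": 1.0},
    {"choice": "video1", "weight": 0.5},
    {"choice": "video2", "weight": 0.5},
]
shares = weighted_shares(example_responses)
print(shares)
```

Under these assumptions, the two shares of a comparison always sum to 1, which matches the way the example scores in this card are reported as complementary percentages.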
# Alignment
The alignment score quantifies how well a video matches its prompt. Users were asked: "Which video fits the description better?".
## Examples
<div class="vertical-container">
<div class="container">
<div class="text-center">
<q>A vibrant 2D animation of a young skateboarder in a colorful outfit performing tricks through a lively city park. Bold lines and bright hues create an energetic, playful atmosphere as the skateboarder maneuvers around obstacles.</q>
</div>
<div class="image-container">
<div>
<h3 class="score-amount">Veo 3 </h3>
<div class="score-percentage">(Score: 91.37%)</div>
<img style="border: 5px solid #18c54f;" src="https://assets.rapidata.ai/veo3_0005_0.gif" width=500>
</div>
<div>
<h3 class="score-amount">Ray 2 </h3>
<div class="score-percentage">(Score: 8.63%)</div>
<img src="https://assets.rapidata.ai/0005_ray2_1.gif" width=500>
</div>
</div>
</div>
<div class="container">
<div class="text-center">
<q>A cinematic close-up of a barista crafting latte art in a bustling coffee shop. The scene alternates between her focused, skilled hands and customers watching appreciatively, highlighting the artistry and dedication in everyday routines.</q>
</div>
<div class="image-container">
<div>
<h3 class="score-amount">Veo 3 </h3>
<div class="score-percentage">(Score: 42.64%)</div>
<img src="https://assets.rapidata.ai/0020_veo3_0.gif" width=500>
</div>
<div>
<h3 class="score-amount">Veo 2 </h3>
<div class="score-percentage">(Score: 57.36%)</div>
<img style="border: 5px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/Xi3pc4wXjSx_zZdbv36mI.gif" width=500>
</div>
</div>
</div>
</div>
# Coherence
The coherence score measures whether the generated video is logically consistent and free from artifacts or visual glitches. Without seeing the original prompt, users were asked: "Which video has more glitches and is more likely to be AI generated?" Because the question asks for the glitchier video, a lower glitch rating indicates a more coherent video.
## Examples
<div class="vertical-container">
<div class="container">
<div class="image-container">
<div>
<h3>Veo 3 </h3>
<div class="score-percentage">(Glitch Rating: 4.52%)</div>
<img style="border: 5px solid #18c54f;" src="https://assets.rapidata.ai/0064_veo3_0.gif" width="500" alt="Dataset visualization">
</div>
<div>
<h3>Pika 2.2 </h3>
<div class="score-percentage">(Glitch Rating: 95.48%)</div>
<img src="https://assets.rapidata.ai/0064_pika2.2_1.gif" width="500" alt="Dataset visualization">
</div>
</div>
</div>
<div class="container">
<div class="image-container">
<div>
<h3>Veo 3 </h3>
<div class="score-percentage">(Glitch Rating: 96.66%)</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/rX8EsklGFUExyVx_eyxU9.gif" width="500" alt="Dataset visualization">
</div>
<div>
<h3>Veo 2 </h3>
<div class="score-percentage">(Glitch Rating: 3.34%)</div>
<img style="border: 5px solid #18c54f;" src="https://assets.rapidata.ai/0043_veo2_0.gif" width="500" alt="Dataset visualization">
</div>
</div>
</div>
</div>
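Because the coherence question asks annotators to pick the video with more glitches, the more coherent video in a pair is the one with the lower glitch rating. The sketch below applies that rule to the two example pairs shown above; the helper name `least_glitchy` and the dictionary layout are illustrative assumptions, not part of the dataset schema.

```python
def least_glitchy(glitch_ratings):
    """Return the model whose video was chosen least often as the glitchier one.

    `glitch_ratings` maps a model name to its glitch rating in percent.
    """
    return min(glitch_ratings, key=glitch_ratings.get)


# Glitch ratings copied from the two coherence examples in this card.
first_pair = {"Veo 3": 4.52, "Pika 2.2": 95.48}
second_pair = {"Veo 3": 96.66, "Veo 2": 3.34}

winners = [least_glitchy(first_pair), least_glitchy(second_pair)]
print(winners)
```

For the alignment and preference scores the direction is reversed: there the higher score marks the favoured video, so `max` would take the place of `min`.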
# Preference
The preference score reflects how visually appealing participants found each video, independent of the prompt. Users were asked: "Which video do you prefer aesthetically?"
## Examples
<div class="vertical-container">
<div class="container">
<div class="image-container">
<div>
<h3>Veo 3 </h3>
<div class="score-percentage">(Score: 94.60%)</div>
<img style="border: 5px solid #18c54f;" src="https://assets.rapidata.ai/0095_veo3_0.gif" width="500" alt="Dataset visualization">
</div>
<div>
<h3>Pika 2.2 </h3>
<div class="score-percentage">(Score: 5.40%)</div>
<img src="https://assets.rapidata.ai/0095_pika2.2_1.gif" width="500" alt="Dataset visualization">
</div>
</div>
</div>
<div class="container">
<div class="image-container">
<div>
<h3>Veo 3 </h3>
<div class="score-percentage">(Score: 33.78%)</div>
<img src="https://assets.rapidata.ai/0028_veo3_0.gif" width="500" alt="Dataset visualization">
</div>
<div>
<h3>Alpha </h3>
<div class="score-percentage">(Score: 66.22%)</div>
<img style="border: 5px solid #18c54f;" src="https://assets.rapidata.ai/0028_alpha_1688981275.gif" width="500" alt="Dataset visualization">
</div>
</div>
</div>
</div>
<br>
# About Rapidata
Rapidata's technology makes collecting human feedback at scale faster and more accessible than ever before. Visit [rapidata.ai](https://www.rapidata.ai/) to learn more about how we're revolutionizing human feedback collection for AI development.
# Other Datasets
We run a benchmark of the major video generation models; the results can be found on our [website](https://www.rapidata.ai/leaderboard/video-models). We rank the models according to their coherence/plausibility, their alignment with the given prompt, and style preference. The underlying 2M+ annotations can be found here:
- Link to the [Rich Video Annotation dataset](https://huggingface.co/datasets/Rapidata/text-2-video-Rich-Human-Feedback)
- Link to the [Coherence dataset](https://huggingface.co/datasets/Rapidata/Flux_SD3_MJ_Dalle_Human_Coherence_Dataset)
- Link to the [Text-2-Image Alignment dataset](https://huggingface.co/datasets/Rapidata/Flux_SD3_MJ_Dalle_Human_Alignment_Dataset)
- Link to the [Preference dataset](https://huggingface.co/datasets/Rapidata/700k_Human_Preference_Dataset_FLUX_SD3_MJ_DALLE3)
Provider:
maas
Created:
2025-05-28
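Background overview
This dataset contains about 46k human responses from about 20k annotators, collected to evaluate the Veo 3 video generation model on alignment, coherence, and preference. The data was gathered in roughly 35 minutes through the Rapidata platform and includes video comparisons and score information.
The summary above was compiled and generated by 遇见数据集.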



