text-2-video-human-preferences-pika2.2
Source: ModelScope community. Updated 2025-12-05, indexed 2025-04-26.
Download link:
https://modelscope.cn/datasets/Rapidata/text-2-video-human-preferences-pika2.2
Description:
<style>
.vertical-container {
display: flex;
flex-direction: column;
gap: 60px;
}
.image-container img {
height: 150px; /* Set the desired height */
margin:0;
object-fit: contain; /* Ensures the aspect ratio is maintained */
width: auto; /* Adjust width automatically based on height */
}
.image-container {
display: flex; /* Aligns images side by side */
justify-content: space-around; /* Space them evenly */
align-items: center; /* Align them vertically */
}
.container {
width: 90%;
margin: 0 auto;
}
.text-center {
text-align: center;
}
.score-amount {
margin: 0;
margin-top: 10px;
}
.score-percentage {
font-size: 12px;
font-weight: 600;
}
</style>
# Rapidata Video Generation Pika 2.2 Human Preference
<a href="https://www.rapidata.ai">
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f5624c42b853e73e0738eb/jfxR79bOztqaC6_yNNnGU.jpeg" width="300" alt="Dataset visualization">
</a>
In this dataset, ~756k human responses from ~29k human annotators were collected to evaluate the Pika 2.2 video generation model on our benchmark. The dataset was collected in ~1 day in total using the [Rapidata Python API](https://docs.rapidata.ai), which is accessible to anyone and ideal for large-scale data annotation.
Explore our latest model rankings on our [website](https://www.rapidata.ai/benchmark).
If you get value from this dataset and would like to see more in the future, please consider liking it ❤️
# Overview
In this dataset, ~756k human responses from ~29k human annotators were collected to evaluate the Pika 2.2 video generation model on our benchmark. The up-to-date benchmark can be viewed on our [website](https://www.rapidata.ai/leaderboard/video-models).
The benchmark data is accessible directly on [Hugging Face](https://huggingface.co/datasets/Rapidata/text-2-video-human-preferences).
# Explanation of the columns
The dataset contains paired video comparisons. Each entry includes `video1` and `video2` fields, which contain links to downscaled GIFs for easy viewing. The full-resolution videos can be found [here](https://huggingface.co/datasets/Rapidata/text-2-video-human-preferences-pika2.2/tree/main/videos).
The `weighted_results` column contains scores ranging from 0 to 1, representing aggregated user responses. Individual user responses can be found in the `detailedResults` column.
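As a rough illustration of how individual responses could be aggregated into a score between 0 and 1, the sketch below sums a weight per vote and normalizes the totals. The exact weighting scheme and the layout of `detailedResults` are not described here, so the `(choice, weight)` tuples and the `aggregate_votes` helper are hypothetical.

```python
from collections import defaultdict


def aggregate_votes(votes):
    """Turn individual (choice, weight) votes into shares that sum to 1.

    Assumption: each vote carries a non-negative weight. The real
    weighting used for `weighted_results` is not documented here.
    """
    totals = defaultdict(float)
    for choice, weight in votes:
        totals[choice] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no votes with positive weight")
    return {choice: total / grand_total for choice, total in totals.items()}


# Hypothetical votes for one video pair: (chosen video, annotator weight).
votes = [("video1", 1.0), ("video1", 0.5), ("video2", 0.5)]
shares = aggregate_votes(votes)
print(shares)
```

With this normalization the two scores of a pair always sum to 1, which matches the way the example percentages in this card add up to 100%.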
# Alignment
The alignment score quantifies how well a video matches its prompt. Users were asked: "Which video fits the description better?".
## Examples
<div class="vertical-container">
<div class="container">
<div class="text-center">
<q>A lone kayaker paddles through calm, reflecting waters under a vibrant sunset, the sky painted with hues of orange and pink, creating a serene and mesmerizing evening scene.</q>
</div>
<div class="image-container">
<div>
<h3 class="score-amount">Pika 2.2 </h3>
<div class="score-percentage">(Score: 95.5%)</div>
<img style="border: 5px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/Mgt-9uw47AHEyBwBbZ9bI.webp" width=500>
</div>
<div>
<h3 class="score-amount">Hunyuan </h3>
<div class="score-percentage">(Score: 4.5%)</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/dAOZZwKkbTdqCsergcIC8.webp" width=500>
</div>
</div>
</div>
<div class="container">
<div class="text-center">
<q>A colorful 2D animation of a quirky raccoon band jamming under a starry sky. Each raccoon plays a different instrument, occasionally stumbling over cables and causing playful chaos, adding charm and fun to their nighttime performance.</q>
</div>
<div class="image-container">
<div>
<h3 class="score-amount">Pika 2.2 </h3>
<div class="score-percentage">(Score: 16.9%)</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/2ZuvKbI7eM2X4n1ubGrdv.webp" width=500>
</div>
<div>
<h3 class="score-amount">Sora </h3>
<div class="score-percentage">(Score: 83.1%)</div>
<img style="border: 5px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/aU-j7njRQyZ5myN09LdXi.webp" width=500>
</div>
</div>
</div>
</div>
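The per-pair scores shown in the examples above could be averaged per model to get a rough picture of how often each model is preferred. The sketch below does this for the two alignment examples only. The tuple layout and the `mean_score_per_model` helper are illustrative assumptions, not the method used for the official leaderboard, and two pairs are far too few to rank anything.

```python
from collections import defaultdict


def mean_score_per_model(comparisons):
    """Average the per-pair score of every model over all its comparisons."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for model_a, score_a, model_b, score_b in comparisons:
        sums[model_a] += score_a
        counts[model_a] += 1
        sums[model_b] += score_b
        counts[model_b] += 1
    return {model: sums[model] / counts[model] for model in sums}


# The two alignment examples from this card, as fractions of 1.
comparisons = [
    ("Pika 2.2", 0.955, "Hunyuan", 0.045),
    ("Pika 2.2", 0.169, "Sora", 0.831),
]
means = mean_score_per_model(comparisons)
for model, score in sorted(means.items(), key=lambda item: -item[1]):
    print(f"{model}: {score:.1%}")
```

The same helper could be applied to the coherence or preference scores, as long as each criterion is aggregated separately.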
# Coherence
The coherence score measures whether the generated video is logically consistent and free from artifacts or visual glitches. Without seeing the original prompt, users were asked: "Which video is logically more coherent? E.g. the video where physics are less violated and the composition makes more sense."
## Examples
<div class="vertical-container">
<div class="container">
<div class="image-container">
<div>
<h3>Pika 2.2 </h3>
<div class="score-percentage">(Score: 76.7%)</div>
<img style="border: 5px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/1xgSXTkVqlCMjgmwM6Yog.webp" width="500" alt="Dataset visualization">
</div>
<div>
<h3>Hunyuan </h3>
<div class="score-percentage">(Score: 23.3%)</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/ki7WWc9uS_iBEi2tEcVzU.webp" width="500" alt="Dataset visualization">
</div>
</div>
</div>
<div class="container">
<div class="image-container">
<div>
<h3>Pika 2.2 </h3>
<div class="score-percentage">(Score: 11.7%)</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/46G8_blKUuhcbDaMskm3P.webp" width="500" alt="Dataset visualization">
</div>
<div>
<h3>Veo 2 </h3>
<div class="score-percentage">(Score: 88.3%)</div>
<img style="border: 5px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/XY7LrBZjQVV3iabBgCDb4.webp" width="500" alt="Dataset visualization">
</div>
</div>
</div>
</div>
# Preference
The preference score reflects how visually appealing participants found each video, independent of the prompt. Users were asked: "Which video do you prefer aesthetically?"
## Examples
<div class="vertical-container">
<div class="container">
<div class="image-container">
<div>
<h3>Pika 2.2 </h3>
<div class="score-percentage">(Score: 91.8%)</div>
<img style="border: 5px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/hsvgSxsSpN1a-HVugrk7A.webp" width="500" alt="Dataset visualization">
</div>
<div>
<h3>Ray 2 </h3>
<div class="score-percentage">(Score: 8.2%)</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/_SVQk_xhydkGyXtcbqq5D.webp" width="500" alt="Dataset visualization">
</div>
</div>
</div>
<div class="container">
<div class="image-container">
<div>
<h3>Pika 2.2 </h3>
<div class="score-percentage">(Score: 22.1%)</div>
<img src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/lRaiWhSWaLOaGc-FbX3ah.webp" width="500" alt="Dataset visualization">
</div>
<div>
<h3>Alpha </h3>
<div class="score-percentage">(Score: 77.9%)</div>
<img style="border: 5px solid #18c54f;" src="https://cdn-uploads.huggingface.co/production/uploads/664dcc6296d813a7e15e170e/tKh_iDkcsumNeZoMba6at.webp" width="500" alt="Dataset visualization">
</div>
</div>
</div>
</div>
<br>
# About Rapidata
Rapidata's technology makes collecting human feedback at scale faster and more accessible than ever before. Visit [rapidata.ai](https://www.rapidata.ai/) to learn more about how we're revolutionizing human feedback collection for AI development.
# Other Datasets
We run a benchmark of the major video generation models; the results can be found on our [website](https://www.rapidata.ai/leaderboard/video-models). We rank the models according to their coherence/plausibility, their alignment with the given prompt, and style preference. The underlying 2M+ annotations can be found here:
- Link to the [Rich Video Annotation dataset](https://huggingface.co/datasets/Rapidata/text-2-video-Rich-Human-Feedback)
- Link to the [Coherence dataset](https://huggingface.co/datasets/Rapidata/Flux_SD3_MJ_Dalle_Human_Coherence_Dataset)
- Link to the [Text-2-Image Alignment dataset](https://huggingface.co/datasets/Rapidata/Flux_SD3_MJ_Dalle_Human_Alignment_Dataset)
- Link to the [Preference dataset](https://huggingface.co/datasets/Rapidata/700k_Human_Preference_Dataset_FLUX_SD3_MJ_DALLE3)
We have also collected a [rich human feedback dataset](https://huggingface.co/datasets/Rapidata/text-2-video-Rich-Human-Feedback), where we annotated an alignment score for each word in a prompt, scored coherence, overall alignment, and style preference, and finally annotated heatmaps of areas of interest for those videos with low scores.
Provider: maas
Created: 2025-04-22



