sora-video-generation-aligned-words

ModelScope community · Updated 2025-12-05 · Indexed 2025-02-08
Download link:
https://modelscope.cn/datasets/Rapidata/sora-video-generation-aligned-words
Description:
# Rapidata Video Generation Word for Word Alignment Dataset

<a href="https://www.rapidata.ai">
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f5624c42b853e73e0738eb/jfxR79bOztqaC6_yNNnGU.jpeg" width="300" alt="Dataset visualization">
</a>

<p>
If you get value from this dataset and would like to see more in the future, please consider liking it.
</p>

This dataset was collected in ~1 hour using the [Rapidata Python API](https://docs.rapidata.ai), which is accessible to anyone and well suited to large-scale data annotation.

# Overview

In this dataset, ~1500 human evaluators were asked to evaluate AI-generated videos by marking which parts of the prompt did not align with the video. The specific instruction was: "The video is based on the text below. Select mistakes, i.e., words that are not aligned with the video."

The dataset is based on the [Alignment Dataset](https://huggingface.co/datasets/Rapidata/sora-video-generation-alignment-likert-scoring). Videos that scored above 0.5 in "LikertScoreNormalized" (that is, the worse-aligned ones) were selected for detailed analysis.

# Videos

The videos in the dataset viewer are previewed as scaled-down GIFs. The original videos are stored under [Files and versions](https://huggingface.co/datasets/Rapidata/sora-video-generation-aligned-words/tree/main/Videos).

<h3>
The video is based on the text below. Select mistakes, i.e., words that are not aligned with the video.
</h3>

<div class="main-container">
  <div class="container">
    <div class="image-container">
      <div>
        <img src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/L5ncdW_-mKfT14Rn2-0X1.gif" width=500>
      </div>
    </div>
  </div>
  <div class="container">
    <div class="image-container">
      <div>
        <img src="https://cdn-uploads.huggingface.co/production/uploads/672b7d79fd1e92e3c3567435/WTkh6PSn84c9KOK9EnhbV.gif" width=500>
      </div>
    </div>
  </div>
</div>
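The selection rule described in the Overview (keeping videos whose "LikertScoreNormalized" is above 0.5) can be sketched in a few lines of Python. This is only an illustration: the `LikertScoreNormalized` field name and the 0.5 threshold come from the description above, while the `video` key, the file names, and the sample scores are hypothetical and do not reflect the actual dataset schema or values.

```python
# Threshold stated in the dataset description: scores above it mean worse alignment.
THRESHOLD = 0.5


def select_poorly_aligned(rows, threshold=THRESHOLD):
    """Return the rows whose normalized Likert score is strictly above the threshold."""
    return [row for row in rows if row["LikertScoreNormalized"] > threshold]


# Hypothetical rows for illustration only; not real dataset entries.
rows = [
    {"video": "a.mp4", "LikertScoreNormalized": 0.25},
    {"video": "b.mp4", "LikertScoreNormalized": 0.75},
    {"video": "c.mp4", "LikertScoreNormalized": 0.5},
]

selected = select_poorly_aligned(rows)
selected_names = [row["video"] for row in selected]
print(selected_names)
```

The comparison is strict because the description says "above 0.5", so a row scoring exactly 0.5 is not selected in this sketch.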

Provider:
maas
Created:
2025-02-05