cinepile

Name: cinepile
Creator: maas
Published: 2026-04-28 16:15:19
License: No description available

ModelScope community. Updated: 2026-04-28. Indexed: 2024-06-08.

Download link:

https://modelscope.cn/datasets/swift/cinepile

Resource description:

# CinePile: A Long Video Question Answering Dataset and Benchmark

CinePile is a question-answering-based, long-form video understanding dataset. It was created using advanced large language models (LLMs) in a human-in-the-loop pipeline that leverages existing human-generated raw data. It consists of approximately 300,000 training data points and 5,000 test data points.

If you have any comments or questions, reach out to [Ruchit Rawal](https://ruchitrawal.github.io/) or [Gowthami Somepalli](https://somepago.github.io/).

Other links: [Website](https://ruchitrawal.github.io/cinepile/), [Paper](https://arxiv.org/abs/2405.08813)

## Version support and revisions

- October 2024: We refined both the training and test splits using the adversarial refinement process described in detail [here](https://huggingface.co/blog/cinepile2). This refined version is loaded by default when running `load_dataset("tomg-group-umd/cinepile")`. To load the previous version, use `load_dataset("tomg-group-umd/cinepile", "v1")`.

## Dataset Structure

Each row in the dataset consists of a `question` (dtype: string), five `choices` (dtype: list), and an `answer_key` (dtype: string). Auxiliary columns store the movie's name, the movie's genre, video clip titles, etc.

The train split is intended for the instruction tuning of video-LLMs. The test split is designed for benchmarking video-LLMs and includes the `hard_split` column, which is "True" for particularly challenging questions and "False" otherwise. The `visual_reliance` column indicates whether a question likely requires integrating visual information to be answered correctly.

### Dataset Features

- **movie_name**: Name of the movie to which the video clip belongs.
- **year**: Release year of the movie.
- **genre**: Genre(s) of the movie.
- **yt_clip_title**: Title of the video clip as it appears on YouTube.
- **yt_clip_link**: URL of the video clip on YouTube.
- **movie_scene**: Description of the movie scene, containing subtitles and visual descriptions.
- **subtitles**: Subtitles extracted from the movie scene.
- **question**: Question derived from the movie scene.
- **choices**: Multiple-choice options associated with the question.
- **answer_key**: The correct answer from the choices provided.
- **answer_key_position**: The index position of the correct answer within the choices list.
- **question_category**: The category to which the question belongs.
- **hard_split**: Indicates whether the question is particularly challenging. "N/A" for the train set; applicable only in the test set.
- **visual_reliance**: Indicates whether the question requires visual information for an accurate answer. "N/A" for the train set.

## Dataset Use and Starter Snippets

### Loading the dataset

You can load the dataset using the Datasets library:

```
from datasets import load_dataset

dataset = load_dataset("tomg-group-umd/cinepile")
```

### Retrieving questions from a specific clip

```
from datasets import load_dataset

cinepile_test = load_dataset('tomg-group-umd/cinepile', token=True, split='test')
yt_clip_title = "Extraction (2015) - You're Crazy Scene (5/10) | Movieclips"
# Keep only the rows whose clip title matches exactly
clip_test_dataset = cinepile_test.filter(lambda x: x['yt_clip_title'] == yt_clip_title)
```

### Loading the hard split

```
from datasets import load_dataset

cinepile_test = load_dataset('tomg-group-umd/cinepile', token=True, split='test')
# hard_split is stored as a string, so compare against "True" rather than the boolean True
hard_split_test = cinepile_test.filter(lambda x: x['hard_split'] == "True")
```

Please refer to the accompanying [Colab notebook](https://colab.research.google.com/drive/1jDwvPoCsg9tck3dFhVCV-h3Ny6992wCr?usp=sharing) for more examples, e.g., evaluating VLMs and extracting responses.

### Cite us

```
@article{rawal2024cinepile,
  title={CinePile: A Long Video Question Answering Dataset and Benchmark},
  author={Rawal, Ruchit and Saifullah, Khalid and Basri, Ronen and Jacobs, David and Somepalli, Gowthami and Goldstein, Tom},
  journal={arXiv preprint arXiv:2405.08813},
  year={2024}
}
```
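As a rough illustration of how the `question`, `choices`, and `answer_key` columns described under Dataset Structure might be turned into a lettered multiple-choice prompt, here is a minimal sketch. The helper name `format_mcq` and the example row are hypothetical, and the code assumes that `answer_key` contains the text of the correct choice verbatim; it is not the authors' evaluation code, for which the Colab notebook is the reference.

```python
from string import ascii_uppercase


def format_mcq(row):
    """Build a lettered multiple-choice prompt and return (prompt, correct_letter)."""
    letters = ascii_uppercase[: len(row["choices"])]
    lines = [row["question"]]
    lines += [f"{letter}) {choice}" for letter, choice in zip(letters, row["choices"])]
    # Assumption: answer_key holds the text of the correct choice verbatim.
    correct_letter = letters[row["choices"].index(row["answer_key"])]
    return "\n".join(lines), correct_letter


# Hypothetical row, invented for illustration; not taken from the dataset.
example_row = {
    "question": "What does the character pick up?",
    "choices": ["A phone", "A letter", "A key", "A cup", "A coat"],
    "answer_key": "A key",
}
prompt, correct = format_mcq(example_row)
print(prompt)
print("Correct letter:", correct)
```

Looking up `answer_key` inside `choices` avoids relying on whether `answer_key_position` is zero-based or one-based, which the description above does not state.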

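Because the test split marks challenging questions with the string `"True"` in `hard_split`, benchmark results are naturally reported both overall and on the hard subset. The following is a minimal sketch of that bookkeeping on plain Python dictionaries. The function name `accuracy_report`, the rows, and the predictions are all hypothetical, and the code assumes each prediction is the text of the chosen answer; it is not the authors' scoring procedure.

```python
def accuracy_report(rows, predictions):
    """Return accuracy over all rows and over rows where hard_split == "True"."""
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")
    hits = [pred == row["answer_key"] for row, pred in zip(rows, predictions)]
    # hard_split is stored as a string, so compare against "True", not the boolean.
    hard_hits = [hit for row, hit in zip(rows, hits) if row["hard_split"] == "True"]

    def mean(values):
        # None signals an empty subset instead of dividing by zero.
        return sum(values) / len(values) if values else None

    return {"overall": mean(hits), "hard": mean(hard_hits)}


# Hypothetical rows and predictions, invented for illustration.
rows = [
    {"answer_key": "A key", "hard_split": "True"},
    {"answer_key": "At night", "hard_split": "False"},
    {"answer_key": "His brother", "hard_split": "True"},
    {"answer_key": "In a car", "hard_split": "False"},
]
predictions = ["A key", "At night", "His father", "In a car"]
report = accuracy_report(rows, predictions)
print(report)
```

The same function accepts rows from a loaded test split, since each row of a `datasets` split behaves like a dictionary keyed by column name.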

Provider:

maas

Created:

2024-06-05
