
ReWatch

ModelScope community. Updated 2025-12-26. Indexed 2025-10-04.
Download link:
https://modelscope.cn/datasets/zcccccz/ReWatch
Resource description:
# ReWatch Dataset

Our ReWatch-R1 project page: https://rewatch-r1.github.io/

## Video Source

Our videos come from other open-source datasets. You need to download the original videos from the following links:

VideoEspresso: https://huggingface.co/datasets/hshjerry0315/VideoEspresso_train_video \
Video-R1: https://huggingface.co/datasets/Video-R1/Video-R1-data \
MiraData: https://huggingface.co/datasets/TencentARC/MiraData \
Vript: https://huggingface.co/datasets/Mutonix/Vript \
VideoMarathon: https://huggingface.co/datasets/jylins/videomarathon

When using our data, you need to prepend the root directory of the video source to each video path:

```python
import os
import json

# Root directory of each video source on your machine
video_dirs = {
    "VideoEspresso": "/path/to/your.../VideoEspresso_train_video",
    "Video-R1": "/path/to/your.../Video-R1",
    "MiraData": "/path/to/your.../mira_data",
    "Vript": "/path/to/your.../vript_long_videos",
    "VideoMarathon": "/path/to/your.../VideoMarathon"
}

caption_file = "caption.json"
with open(caption_file, "r") as f:
    caption_data = json.load(f)

for item in caption_data:
    source_name = item["source"]  # video source
    video_dir = video_dirs[source_name]  # root directory corresponding to the video source
    rel_video_path = item["video_path"]  # relative path
    abs_video_path = os.path.join(video_dir, rel_video_path)  # absolute path
```

For videos from MiraData, we use `video_id` and `clip_id` to determine the path of a video. You can find the `video_id` and `clip_id` of each video in the official MiraData repository. For example, in our dataset, a video path from MiraData is:

```python
video_id = "ubgB1yelZMY"
clip_id = "49850.19"
video_path_in_our_dataset = f"{video_id}/{clip_id}.mp4"
```

For videos from VideoMarathon, we use their `id` in the VideoMarathon dataset to determine the path of a video. You need to use this `id` to find the download path of the original video in the VideoMarathon dataset.
For example, in our dataset, a video path from VideoMarathon is:

```python
id = "videomarathon_03293135"
video_path_in_our_dataset = f"{id}.mp4"
```

## Data Format

### Caption

```json
{
    "video_path": "Youcook2/merged/4/6nVIgasiUtw.mkv",
    "duration": 227,
    "caption": "[00:00-00:05] The video opens with a ...",
    "caption_summary": "[00:00-00:05] The video introduces ...",
    "source": "VideoEspresso"
}
```

### QA

We use `question_id` to uniquely identify a question. The format is `{video_path}_{question_type}_multiple_choice` or `{video_path}_{question_type}_open_end`.

#### Multiple Choice

```json
{
    "question_id": "4Z4E0IBaaRg.mp4_numerical_reasoning_multiple_choice",
    "multiple_choice": true,
    "question_type": "numerical_reasoning",
    "question": "By how many seconds does the time interval between the player's second and third deaths exceed the time interval between the player's first and second deaths?\nA: 5 seconds\nB: 2 seconds\nC: 7 seconds\nD: 1 second",
    "question_without_options": "By how many seconds does the time interval between the player's second and third deaths exceed the time interval between the player's first and second deaths?",
    "mc_options": {
        "A": "5 seconds",
        "B": "2 seconds",
        "C": "7 seconds",
        "D": "1 second"
    },
    "answer": "D",
    "video_path": "4Z4E0IBaaRg.mp4",
    "source": "Vript"
}
```

#### Open End

```json
{
    "question_id": "4Z4E0IBaaRg.mp4_numerical_reasoning_open_end",
    "multiple_choice": false,
    "question_type": "numerical_reasoning",
    "question": "By how many seconds does the time interval between the player's second and third deaths exceed the time interval between the player's first and second deaths?",
    "answer": "1 second",
    "video_path": "4Z4E0IBaaRg.mp4",
    "source": "Vript"
}
```

### CoT

`react_chain` is the interaction trajectory between the agent and the video, and `natural_chain` is the CoT data we use for training.
```json
{
    "question_id": "XD-Violence/2/Kingsman.The.Golden.Circle.2017__#02-12-27_02-14-03_label_A.mp4_object_recognition_open_end",
    "react_chain": [
        {
            "type": "thought",
            "number": 1,
            "content": "The question asks about named significant characters present among the guests at the grand church ceremony, aside from Eggsy, Princess Tilde, and Harry, and a distinct article of clothing worn by each. To answer this, I first need to locate the grand church ceremony in the video and identify the guests. I will start by retrieving the segment where the grand church ceremony occurs."
        },
        {
            "type": "action",
            "number": 1,
            "content": "segment_retrieval(focus_event=\"the grand church ceremony, specifically the wedding ceremony of Eggsy and Princess Tilde\")"
        },
        {
            "type": "observation",
            "number": 1,
            "content": "start_time: 00:53 \nend_time: 01:36"
        },
        {
            "type": "thought",
            "number": 2,
            "content": "I have identified the time segment for the grand church ceremony. Now, I need to identify the significant characters present among the guests, excluding Eggsy, Princess Tilde, and Harry. I will query the segment focusing on the guests to identify them and their clothing. Since the question asks for named characters, I will focus on recognizable faces that might be identified as specific characters from the franchise."
        },
        {
            "type": "action",
            "number": 2,
            "content": "segment_query(start_time=\"00:53\", end_time=\"01:36\", focus_aspect=\"guests and their distinct clothing, particularly looking for recognizable characters\")"
        },
        {
            "type": "observation",
            "number": 2,
            "content": "Between 00:53 and 01:36, the wedding guests include notable characters such as Agent Tequila (Channing Tatum), dressed in a denim jacket, and Agent Ginger (Halle Berry), wearing a black dress. The congregation is diverse and features other significant figures from the film's universe in distinct, recognizable attire."
        },
        {
            "type": "thought",
            "number": 3,
            "content": "The previous observation already identified two named significant characters, Agent Tequila and Agent Ginger, and described a distinct article of clothing for each. This information directly answers the question."
        },
        {
            "type": "answer",
            "content": "The two other named significant characters present among the guests are Agent Tequila, who is dressed in a denim jacket, and Agent Ginger, who is wearing a black dress.",
            "number": 4
        }
    ],
    "natural_chain": "To answer this, I first need to locate the grand church ceremony in the video and identify the guests. I will start by retrieving the segment where the grand church ceremony occurs. <action>I'll look for the segment focusing on the wedding ceremony of Eggsy and Princess Tilde.</action> <observation>I've identified the ceremony between 00:53 and 01:36.</observation> Now, I need to identify the significant characters present among the guests, excluding Eggsy, Princess Tilde, and Harry. I will query the segment focusing on the guests to identify them and their clothing. Since the question asks for named characters, I will focus on recognizable faces that might be identified as specific characters from the franchise. <action>I'll query the segment from 00:53 to 01:36, specifically looking for guests and their distinct clothing, particularly recognizable characters.</action> <observation>I see Agent Tequila, dressed in a denim jacket, and Agent Ginger, wearing a black dress.</observation> This information directly answers the question. <answer>The two other named significant characters present among the guests are Agent Tequila, who is dressed in a denim jacket, and Agent Ginger, who is wearing a black dress.</answer>"
}
```
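
The `natural_chain` string in the sample above interleaves free-form reasoning with `<action>`, `<observation>`, and `<answer>` tags. A minimal sketch of splitting such a string into typed segments might look like the following. The function name `split_natural_chain` and the `"thought"` label for untagged text are assumptions made for illustration; they are not part of the dataset or its tooling.

```python
import re

# Tags that appear in the natural_chain sample above.
TAG_PATTERN = re.compile(r"<(action|observation|answer)>(.*?)</\1>", re.DOTALL)


def split_natural_chain(chain):
    """Split a natural_chain string into (kind, text) segments.

    Text outside any tag is labeled "thought". That label is chosen here
    for illustration; the dataset itself leaves this text untagged.
    """
    segments = []
    cursor = 0
    for match in TAG_PATTERN.finditer(chain):
        untagged = chain[cursor:match.start()].strip()
        if untagged:
            segments.append(("thought", untagged))
        segments.append((match.group(1), match.group(2).strip()))
        cursor = match.end()
    tail = chain[cursor:].strip()
    if tail:
        segments.append(("thought", tail))
    return segments


# Shortened, illustrative version of the sample record's natural_chain.
example = (
    "I will start by retrieving the segment. "
    "<action>I'll look for the wedding ceremony.</action> "
    "<observation>I've identified the ceremony between 00:53 and 01:36.</observation> "
    "This information directly answers the question. "
    "<answer>Agent Tequila and Agent Ginger.</answer>"
)

for kind, text in split_natural_chain(example):
    print(f"{kind}: {text}")
```

Keeping the untagged reasoning as its own segment type preserves the order of the chain, which may be useful when comparing `natural_chain` against the numbered steps in `react_chain`.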

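The `question_id` convention described in the QA section can be checked against a record's own fields. Because both `video_path` and `question_type` may contain underscores, the sketch below rebuilds the id from the fields rather than trying to split the string. `build_question_id` is a hypothetical helper name used for illustration, not something shipped with the dataset.

```python
def build_question_id(video_path, question_type, multiple_choice):
    """Rebuild a question_id following the documented format."""
    suffix = "multiple_choice" if multiple_choice else "open_end"
    return f"{video_path}_{question_type}_{suffix}"


# Field values copied from the Multiple Choice sample record.
record = {
    "question_id": "4Z4E0IBaaRg.mp4_numerical_reasoning_multiple_choice",
    "multiple_choice": True,
    "question_type": "numerical_reasoning",
    "video_path": "4Z4E0IBaaRg.mp4",
}

rebuilt = build_question_id(
    record["video_path"], record["question_type"], record["multiple_choice"]
)
# A mismatch here would suggest a malformed or mislabeled record.
print(rebuilt == record["question_id"])
```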
Provider:
maas
Created:
2025-09-28
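Background and Challenges
Background Overview
ReWatch is a multi-source video dataset containing detailed video captions, question-answer pairs, and chain-of-thought data, suited to video understanding and analysis tasks. Users must download the original videos themselves and process them according to the provided JSON formats.
The above content was collected and summarized by 遇见数据集.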