longvideotool/LongVT-Parquet

Name: longvideotool/LongVT-Parquet
Creator: longvideotool
Published: 2025-12-10 19:07:47
License: 暂无描述

Hugging Face2025-12-10 更新2025-12-20 收录

下载链接：

https://hf-mirror.com/datasets/longvideotool/LongVT-Parquet

下载链接

链接失效反馈

官方服务：

资源简介：

--- license: apache-2.0 task_categories: - video-text-to-text - visual-question-answering language: - en tags: - video - long-video - reasoning - tool-calling - multimodal - chain-of-thought size_categories: - 100K<n<1M configs: - config_name: rft data_files: - split: train path: longvt_rft_selftrace_15k3.parquet - config_name: rl data_files: - split: train path: longvt_rl_selfqa_1k6.parquet - split: val path: longvt_rl_val_114.parquet - config_name: sft data_files: - split: geminicot path: longvt_sft_geminicot_4k8.parquet - split: llavacot path: longvt_sft_llavacot_54k5.parquet - split: longvideoreason path: longvt_sft_longvideoreason_5k2.parquet - split: longvideoreflection path: longvt_sft_longvideoreflection_3k.parquet - split: openvlthinker path: longvt_sft_openvlthinker_2k8.parquet - split: tvg path: longvt_sft_tvg_6k3.parquet - split: videor1 path: longvt_sft_videor1_165k5.parquet - split: wemath path: longvt_sft_wemath_602.parquet - config_name: video-siah data_files: - split: test path: longvt_eval_videosiah_1280.parquet --- # LongVT-Parquet This repository contains the training data annotations and evaluation benchmark for the [LongVT](https://github.com/EvolvingLMMs-Lab/LongVT) project. ## Overview LongVT is an end-to-end agentic framework that enables "Thinking with Long Videos" via interleaved Multimodal Chain-of-Tool-Thought. This dataset provides the training annotations and evaluation benchmark in Parquet format, with source media files available in [LongVT-Source](https://huggingface.co/datasets/longvideotool/LongVT-Source). ## Important Notes For privacy reasons, media paths in the Parquet files were sanitized before release. Please replace them with your own local paths after downloading the corresponding media from [LongVT-Source](https://huggingface.co/datasets/longvideotool/LongVT-Source). The annotations and media files follow a one-to-one correspondence across the two repos. ## Dataset Structure The dataset is organized into three training subsets and one evaluation benchmark: ### Training Data | Subset | Samples | Description | |--------|---------|-------------| | `sft` | ~248K | Supervised Fine-Tuning data (with and without tool calling) | | `rl` | ~1.8K | Reinforcement Learning QA pairs | | `rft` | ~15K | Reinforcement Fine-Tuning traces | ### Evaluation Benchmark We have transferred the annotation file of VideoSIAH-Eval to [longvideotool/VideoSIAH-Eval](https://huggingface.co/datasets/longvideotool/VideoSIAH-Eval). | File | Samples | Description | Media Source | |------|---------|-------------|--------------| | `data/test-00000-of-00001.parquet` | 1,280 | VideoSIAH-Eval benchmark | `videosiaheval_*.zip` | ## SFT Data Composition | Source | Samples | Description | Media Source | |--------|---------|-------------|--------------| | `videor1` | 165K | Video-R1 COT reasoning data | `videor1_*.zip` | | `llavacot` | 54K | LLaVA COT image reasoning | `llavacot_*.zip` | | `longvideoreason` | 5.2K | Long video reasoning COT | `longvideoreason_*.zip` | | `geminicot` | 4.8K | Gemini-distilled COT | `geminicot_*.zip` | | `tvg` | 6.3K | Temporal video grounding | `tvg_*.zip` | | `longvideoreflection` | 3K | Long video reflection | `longvideoreflection_*.zip` | | `openvlthinker` | 2.8K | OpenVLThinker reasoning | `openvlthinker_*.zip` | | `wemath` | 602 | WeMath reasoning | `wemath_*.zip` | ## RL Data | Source | Samples | Description | Media Source | |--------|---------|-------------|--------------| | `selfqa` | 1.6K | Self-curated QA pairs | `selfqa_*.zip` | | `rl_val` | 114 | RL validation set | `rl_val_*.zip` | ## RFT Data | Source | Samples | Description | Media Source | |--------|---------|-------------|--------------| | `selftrace` | 15K | Self-distilled iMCoTT traces | `selftrace_*.zip` | ## Download # Install huggingface_hub pip install huggingface_hub # Download all annotation files huggingface-cli download longvideotool/LongVT-Parquet --repo-type dataset --local-dir ./data # Download source media files huggingface-cli download longvideotool/LongVT-Source --repo-type dataset --local-dir ./source## Usage with Datasets from datasets import load_dataset # Load SFT data sft_data = load_dataset("longvideotool/LongVT-Parquet", "sft", split="train") # Load RL data rl_data = load_dataset("longvideotool/LongVT-Parquet", "rl", split="train") # Load RFT data rft_data = load_dataset("longvideotool/LongVT-Parquet", "rft", split="train") ## Data Format Each sample contains: - `id`: Unique identifier - `messages`: Conversation turns with system prompt, user query, and assistant response - Includes `<think>`, `<tool_call>`, `<tool_response>`, and `<answer>` tags for reasoning traces Evaluation benchmark format: - `video_path`: Path to video file - `question`: Question about the video - `answer`: Ground truth answer ## Related Resources - 📄 **Paper**: [arXiv:2511.20785](https://arxiv.org/abs/2511.20785) - 🌐 **Project Page**: [LongVT Website](https://evolvinglmms-lab.github.io/LongVT/) - 💻 **Code**: [GitHub Repository](https://github.com/EvolvingLMMs-Lab/LongVT) - 🎬 **Source Media**: [LongVT-Source](https://huggingface.co/datasets/longvideotool/LongVT-Source) - 🤗 **Models**: [LongVT Collection](https://huggingface.co/collections/lmms-lab/longvt) ## Citation If you find LongVT useful for your research and applications, please cite using this BibTeX: ```bibtex @misc{yang2025longvtincentivizingthinkinglong, title={LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling}, author={Zuhao Yang and Sudong Wang and Kaichen Zhang and Keming Wu and Sicong Leng and Yifan Zhang and Bo Li and Chengwei Qin and Shijian Lu and Xingxuan Li and Lidong Bing}, year={2025}, eprint={2511.20785}, archivePrefix={arXiv}, primaryClass={cs.CV}, url={https://arxiv.org/abs/2511.20785}, } ``` ## License This dataset is released under the Apache 2.0 License.

提供机构：

longvideotool

5,000+

优质数据集

54 个

任务类型

进入经典数据集