NTQAI/MSVD-Video-Captioning-Vi
收藏Hugging Face2026-01-27 更新2026-02-07 收录
下载链接:
https://hf-mirror.com/datasets/NTQAI/MSVD-Video-Captioning-Vi
下载链接
链接失效反馈官方服务:
资源简介:
MSVD-Video-Captioning-Vi是一个越南语视频字幕数据集,源自英文的MSVD数据集。该数据集提供了短视频片段的越南语字幕,适用于视频字幕研究、视觉-语言模型训练、多模态指令调优和视频到文本生成。越南语字幕是通过使用OpenAI API(GPT-5.2)自动翻译和优化原始英文字幕生成的,并经过后处理以提高流畅性。数据集结构包括视频文件、训练数据(parquet格式)和README文件。每个记录包含视频路径、提示词、字幕和原始YouTube URL等字段。
MSVD-Video-Captioning-Vi is a Vietnamese video captioning dataset derived from the English MSVD dataset. This dataset provides Vietnamese captions for short video clips and is intended for video captioning research, vision–language model training, multimodal instruction tuning, and video-to-text generation. The Vietnamese captions were generated by translating the original English captions using the OpenAI API (GPT-5.2) with refinement and optional post-processing for improved fluency. The dataset structure includes video files, training data (in parquet format), and a README file. Each record contains fields such as video path, prompts, captions, and the original YouTube URL.
提供机构:
NTQAI



