THVD (Talking Head Video Dataset)
Source: DataCite Commons. Updated 2025-04-02; indexed 2025-04-16.
Download link:
https://data.mendeley.com/datasets/ykhw8r7bfx/1
Description:
**About**
We provide a comprehensive talking-head video dataset with over 50,000 videos, totaling more than 500 hours of footage and featuring 23,841 unique identities from around the world.
**Distribution**
Detailing the format, size, and structure of the dataset:
Data Volume:
- Total Size: 2.5 TB
- Total Videos: 47,200
- Identities Covered: 23,000
- Resolution: 60% 4K (1980), 33% Full HD (1080)
- Format: MP4
- Full-length videos with visible mouth movements in every frame.
- Minimum face size of 400 pixels.
- Video durations range from 20 seconds to 5 minutes.
- Faces are not cropped out; videos are full-frame and include backgrounds.
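The constraints listed above can be turned into a simple acceptance check when screening clips. The following is a minimal sketch, not part of the dataset's own tooling: the `ClipInfo` fields are hypothetical, and reading "face size" as the shorter side of the face bounding box is an assumption made for illustration.

```python
from dataclasses import dataclass

# Limits taken from the Distribution list above.
MIN_DURATION_S = 20
MAX_DURATION_S = 5 * 60
MIN_FACE_PX = 400
ALLOWED_FORMATS = {"mp4"}


@dataclass
class ClipInfo:
    """Hypothetical per-clip metadata; field names are illustrative only."""
    duration_s: float
    face_width_px: int
    face_height_px: int
    container: str


def spec_violations(clip: ClipInfo) -> list[str]:
    """Return the reasons (if any) a clip falls outside the stated spec."""
    problems = []
    if not MIN_DURATION_S <= clip.duration_s <= MAX_DURATION_S:
        problems.append("duration outside 20 s to 5 min")
    # Assumption: "face size" means the shorter side of the face bounding box.
    if min(clip.face_width_px, clip.face_height_px) < MIN_FACE_PX:
        problems.append("face smaller than 400 px")
    if clip.container.lower() not in ALLOWED_FORMATS:
        problems.append("container is not MP4")
    return problems
```

An empty list means the clip is consistent with the listed limits; any other result names the limit that was missed.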
**Usage**
**This dataset is ideal for a variety of applications**:
Face Recognition & Verification: Training and benchmarking facial recognition models.
Action Recognition: Identifying human activities and behaviors.
Re-Identification (Re-ID): Tracking identities across different videos and environments.
Deepfake Detection: Developing methods to detect manipulated videos.
Generative AI: Training high-resolution video generation models.
Lip Syncing Applications: Enhancing AI-driven lip-syncing models for dubbing and virtual avatars.
Background AI Applications: Developing AI models for automated background replacement, segmentation, and enhancement.
**Coverage**
Explaining the scope and coverage of the dataset:
Geographic Coverage: Worldwide
Time Range: The time range and size of the videos are recorded in the CSV file.
Demographics: Includes information about age, gender, ethnicity, format, resolution, and file size.
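Since the CSV file is described as carrying these attributes, subsets can in principle be selected with a few lines of code. The sketch below is illustrative only: the column names and the three sample rows are invented, and the real CSV may use different headers and value formats.

```python
import csv
import io

# Invented sample rows; the real CSV's column names and values may differ.
SAMPLE_CSV = """video_id,language,gender,age,resolution,file_size_mb
v0001,English,female,34,2160,61.2
v0002,Polish,male,51,1080,18.7
v0003,English,male,27,1080,22.4
"""


def select_rows(csv_text, **criteria):
    """Return rows whose columns equal every given criterion (compared as strings)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader
            if all(row.get(col) == str(val) for col, val in criteria.items())]


english_hd = select_rows(SAMPLE_CSV, language="English", resolution=1080)
print([row["video_id"] for row in english_hd])  # prints ['v0003']
```

For the actual file, `io.StringIO(csv_text)` would be replaced by an open file handle, and the criteria adjusted to whatever headers the provider uses.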
**Languages Covered (Videos):**
English: 23,038 videos
Portuguese: 1,346 videos
Spanish: 677 videos
Norwegian: 1,266 videos
Swedish: 1,056 videos
Korean: 848 videos
Polish: 1,807 videos
Indonesian: 1,163 videos
French: 1,102 videos
German: 1,276 videos
Japanese: 1,433 videos
Dutch: 1,666 videos
Indian: 1,163 videos
Czech: 590 videos
Chinese: 685 videos
Italian: 975 videos
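As a quick arithmetic check, the per-language counts above can be tallied and compared with the stated total of 47,200 videos. With the figures as listed, the 16 entries sum to 40,091, which is 7,109 fewer than the stated total; the source does not say how the remainder is distributed. A minimal sketch of that tally:

```python
# Per-language video counts exactly as listed above.
LANGUAGE_COUNTS = {
    "English": 23_038, "Portuguese": 1_346, "Spanish": 677, "Norwegian": 1_266,
    "Swedish": 1_056, "Korean": 848, "Polish": 1_807, "Indonesian": 1_163,
    "French": 1_102, "German": 1_276, "Japanese": 1_433, "Dutch": 1_666,
    "Indian": 1_163, "Czech": 590, "Chinese": 685, "Italian": 975,
}

listed_total = sum(LANGUAGE_COUNTS.values())
stated_total = 47_200  # "Total Videos" from the Distribution section

print(f"languages listed: {len(LANGUAGE_COUNTS)}")
print(f"sum of per-language counts: {listed_total:,}")
print(f"difference from stated total: {stated_total - listed_total:,}")

# Share of the listed videos held by the three largest languages.
top3 = sorted(LANGUAGE_COUNTS.items(), key=lambda kv: kv[1], reverse=True)[:3]
for lang, n in top3:
    print(f"{lang}: {n / listed_total:.1%}")
```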
**Who Can Use It**
Examples of intended users and their use cases:
Data Scientists: Training machine learning models for video-based AI applications.
Researchers: Studying human behavior, facial analysis, or video AI advancements.
Businesses: Developing facial recognition systems, video analytics, or AI-driven media applications.
**Additional Notes**
Ensure ethical usage and compliance with privacy regulations.
The dataset’s quality and scale make it valuable for high-performance AI training.
Preprocessing (cropping, downsampling) may be needed for some use cases.
The dataset is not yet complete and expands daily; please get in touch for the most up-to-date CSV file.
The dataset is divided into 100 GB zipped files and hosted on a private server (with the option to upload to the cloud if needed).
To verify the dataset's quality, please contact me for the full CSV file. I'd be happy to provide example videos selected by the potential buyer.
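The cropping and downsampling mentioned in these notes could be done with a tool such as ffmpeg. The sketch below only builds the command line and does not run it. The file names, the 720-pixel target height, and the crop box are placeholders, and choosing ffmpeg is an assumption, since the source does not name a preprocessing tool.

```python
def build_ffmpeg_cmd(src, dst, target_height=720, crop=None):
    """Build (but do not run) an ffmpeg command that optionally crops, then downscales.

    crop: optional (width, height, x, y) in pixels, e.g. a padded face box.
    """
    filters = []
    if crop is not None:
        w, h, x, y = crop
        filters.append(f"crop={w}:{h}:{x}:{y}")
    # -2 lets ffmpeg pick a width that keeps the aspect ratio and is divisible by 2.
    filters.append(f"scale=-2:{target_height}")
    # "-c:a copy" passes the audio stream through without re-encoding.
    return ["ffmpeg", "-i", src, "-vf", ",".join(filters), "-c:a", "copy", dst]


# Placeholder paths and crop box, for illustration only.
cmd = build_ffmpeg_cmd("input.mp4", "output_720p.mp4", crop=(800, 800, 100, 50))
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.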
Provider: Mendeley Data
Created: 2025-04-02
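**Background Overview**
THVD is a large-scale talking-head video dataset containing more than 47,000 videos (2.5 TB in total) and covering 23,000 unique identities. The videos are high resolution (mainly 4K and Full HD), run from 20 seconds to 5 minutes, and show clear mouth movements in every frame with the full background retained. The dataset has worldwide coverage and spans multiple languages. It is suited to AI applications such as face recognition, action analysis, deepfake detection, and generative AI, and provides high-quality material for high-performance AI training.
Summary collected and generated by 遇见数据集.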



