THVD (Talking Head Video Dataset)

Name: THVD (Talking Head Video Dataset)
Creator: Mendeley Data
Published: 2025-05-01 05:04:45
License: No description provided

DataCite Commons | Updated 2025-05-01 | Indexed 2025-04-16

Download link:

https://data.mendeley.com/datasets/ykhw8r7bfx

Description:

**About**
We provide a comprehensive talking-head video dataset with over 50,000 videos, totaling more than 500 hours of footage and featuring 20,841 unique identities from around the world.

**Distribution**
Format, size, and structure of the dataset:
- Total size: 2.7 TB
- Total videos: 47,547
- Identities covered: 20,841
- Resolution: 60% 4K (1980), 33% Full HD (1080)
- Format: MP4
- Full-length videos with visible mouth movements in every frame.
- Minimum face size of 400 pixels.
- Video durations range from 20 seconds to 5 minutes.
- Faces have not been cropped out; videos are full-screen and include backgrounds.

**Usage**
This dataset is suited to a variety of applications:
- Face Recognition & Verification: Training and benchmarking facial recognition models.
- Action Recognition: Identifying human activities and behaviors.
- Re-Identification (Re-ID): Tracking identities across different videos and environments.
- Deepfake Detection: Developing methods to detect manipulated videos.
- Generative AI: Training high-resolution video generation models.
- Lip Syncing Applications: Improving AI-driven lip-syncing models for dubbing and virtual avatars.
- Background AI Applications: Developing AI models for automated background replacement, segmentation, and enhancement.

**Coverage**
Scope and coverage of the dataset:
- Geographic coverage: Worldwide
- Time range: The time range and size of the videos are recorded in the CSV file.
- Demographics: Includes information about age, gender, ethnicity, format, resolution, and file size.

**Languages Covered (Videos)**
- English: 23,038 videos
- Portuguese: 1,346 videos
- Spanish: 677 videos
- Norwegian: 1,266 videos
- Swedish: 1,056 videos
- Korean: 848 videos
- Polish: 1,807 videos
- Indonesian: 1,163 videos
- French: 1,102 videos
- German: 1,276 videos
- Japanese: 1,433 videos
- Dutch: 1,666 videos
- Indian: 1,163 videos
- Czech: 590 videos
- Chinese: 685 videos
- Italian: 975 videos
- Filipino: 920 videos
- Bulgarian: 340 videos
- Romanian: 1,144 videos
- Arabic: 1,691 videos

**Who Can Use It**
Intended users and their use cases:
- Data Scientists: Training machine learning models for video-based AI applications.
- Researchers: Studying human behavior, facial analysis, or video AI advancements.
- Businesses: Developing facial recognition systems, video analytics, or AI-driven media applications.

**Additional Notes**
- Ensure ethical usage and compliance with privacy regulations.
- The dataset's quality and scale make it valuable for high-performance AI training.
- Preprocessing (cropping, downsampling) may be needed for some use cases.
- The dataset is not yet complete and expands daily; please contact the author for the most up-to-date CSV file.
- The dataset has been divided into 100 GB zipped files and is hosted on a private server (with the option to upload to the cloud if needed).
- To verify the dataset's quality, please contact the author for the full CSV file.
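
The per-language counts can be checked against the stated total of 47,547 videos with a short tally. The sketch below is illustrative only: the numbers are copied from the description above, the variable names are hypothetical, and the listing does not say whether a video can carry more than one language label. The listed counts sum to 44,186, which is below the stated total; the description notes that the dataset is still expanding, so the counts may simply have been taken at a different time.

```python
# Per-language video counts, copied from the dataset description.
# The label spellings follow the listing; "Indian" is kept as given.
language_counts = {
    "English": 23038, "Portuguese": 1346, "Spanish": 677, "Norwegian": 1266,
    "Swedish": 1056, "Korean": 848, "Polish": 1807, "Indonesian": 1163,
    "French": 1102, "German": 1276, "Japanese": 1433, "Dutch": 1666,
    "Indian": 1163, "Czech": 590, "Chinese": 685, "Italian": 975,
    "Filipino": 920, "Bulgarian": 340, "Romanian": 1144, "Arabic": 1691,
}

# Total video count stated in the "Distribution" section.
STATED_TOTAL_VIDEOS = 47_547

listed_total = sum(language_counts.values())
unaccounted = STATED_TOTAL_VIDEOS - listed_total
english_share = language_counts["English"] / listed_total

print(f"Languages listed:        {len(language_counts)}")
print(f"Sum of listed counts:    {listed_total:,}")
print(f"Stated total videos:     {STATED_TOTAL_VIDEOS:,}")
print(f"Not covered by the list: {unaccounted:,}")
print(f"English share of listed: {english_share:.1%}")
```

A tally like this is only a consistency check on the published description. The full CSV file, available from the author on request, would be the authoritative source for per-video language labels.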

Provider:

Mendeley Data

Created:

2025-04-02
