OpenFace-CQUPT/HumanCaption-10M
Source: Hugging Face. Updated: 2025-06-09. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/OpenFace-CQUPT/HumanCaption-10M
Description:
HumanCaption-10M is a large, diverse, high-quality dataset of human-related images paired with natural language descriptions (image to text). It contains approximately 10 million human-related images along with their natural language descriptions, including descriptions of the corresponding facial features, and is designed to facilitate research on human-centered tasks. It is the second-generation version of FaceCaption-15M. HumanCaption-10M is used to train HumanVLM, a domain-specific large-scale language vision model, with the aim of building a unified multimodal language vision model for human-scene tasks.
Provider:
OpenFace-CQUPT



