
FaceCaptionHQ-4M

ModelScope community, updated 2026-01-02, indexed 2025-04-12
Download link:
https://modelscope.cn/datasets/AI-ModelScope/FaceCaptionHQ-4M
Description:
# FaceCaptionHQ-4M

You first need to download the data from here, then apply for access to the original Laion-face dataset by completing the required agreement (github). Once approved, refer to the information available on HuggingFace to obtain the corresponding image-text pairs.

**[25/06/09] 🤗 The original images are released: [Completing the Agreement](https://github.com/ddw2AIGROUP2CQUPT/Large-Scale-Multimodal-Face-Datasets)**

**FaceCaptionHQ-4M contains about 4M facial image-text pairs cleaned from FaceCaption-15M.**

![image/png](https://cdn-uploads.huggingface.co/production/uploads/663f06e01cd68975883a353e/hf2UTEgz9q0cx7p2S5cCQ.png)

## Figure 1. Illustrations

![image/png](https://cdn-uploads.huggingface.co/production/uploads/663f06e01cd68975883a353e/REuHu-x9v-yra4ccR-nbZ.png)

## Figure 2. Pipeline for constructing **FaceCaptionHQ-4M**.

For the detailed method, see [Face-MakeUp](https://arxiv.org/abs/2501.02523).

## News and Update 🔥🔥🔥

* Jan. 11, 2025. **🤗 [FaceCaptionHQ-4M](https://huggingface.co/datasets/OpenFace-CQUPT/FaceCaptionHQ-4M) is released! 👏👏👏**
* Jan. 11, 2025. **🤗 [FaceMaker-V0](https://huggingface.co/OpenFace-CQUPT/Face-MakeUp) is released! 👏👏👏**

## 🤗 How to Use

The few lines of code below download the text part; the image part requires an additional download.

```python
from datasets import load_dataset

ds = load_dataset("OpenFace-CQUPT/FaceCaptionHQ-4M")
```

# Additional Information

## Licensing Information

The FaceCaptionHQ-4M dataset is released by OpenFaceCQUPT and is intended exclusively for research and educational purposes. It has been generated using publicly available models such as Qwen. Users should be aware that this data may contain inaccuracies, unsafe content, or biases, and should carefully evaluate its accuracy and suitability prior to use.

OpenFaceCQUPT and its licensors provide this dataset "AS-IS," without any warranties, express or implied. The views and opinions expressed in the dataset do not necessarily reflect those of OpenFaceCQUPT.

The FaceCaptionHQ-4M dataset is licensed under the Creative Commons Attribution 4.0 International License (CC-BY 4.0). The availability of this dataset does not constitute an invitation to use any of the information for any illegal or unlawful purposes, or beyond the scope of research or educational purposes. It is crucial to ensure ethical and responsible use of this dataset to prevent privacy violations and other ethical concerns.

## Citation

```
@misc{dai2025facemakeupmultimodalfacialprompts,
      title={Face-MakeUp: Multimodal Facial Prompts for Text-to-Image Generation},
      author={Dawei Dai and Mingming Jia and Yinxiu Zhou and Hang Xing and Chenghang Li},
      year={2025},
      eprint={2501.02523},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2501.02523},
}

@misc{dai202415mmultimodalfacialimagetext,
      title={15M Multimodal Facial Image-Text Dataset},
      author={Dawei Dai and YuTang Li and YingGe Liu and Mingming Jia and Zhang YuanHui and Guoyin Wang},
      year={2024},
      eprint={2407.08515},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.08515},
}
```

## Contact

Email: [S230231046@stu.cqupt.edu.cn](mailto:S230231046@stu.cqupt.edu.cn) or [dw_dai@163.com](mailto:dw_dai@163.com)
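
As a complement to the How to Use snippet, the sketch below shows one way to look at a few text records without downloading the whole text part first. It is not taken from the dataset card. The split name `"train"` is an assumption, and `peek` and `stream_captions` are hypothetical helper names introduced only for illustration. Column names are not documented here, so the sketch returns whole records rather than specific fields.

```python
from itertools import islice


def peek(records, n=3):
    """Return the first n items of any iterable as a list."""
    return list(islice(records, n))


def stream_captions(split="train"):
    """Open the text part of the dataset in streaming mode.

    Assumption: the repository exposes a split called "train";
    check the dataset card if this raises an error.
    """
    # Imported lazily so that peek() stays usable without the library installed.
    from datasets import load_dataset

    return load_dataset(
        "OpenFace-CQUPT/FaceCaptionHQ-4M",
        split=split,
        streaming=True,  # iterate lazily instead of materialising all records
    )
```

Calling `peek(stream_captions())` should return the first three records as dictionaries, which is a quick way to discover the available columns before committing to a full download. The images still have to be obtained separately through the agreement process described at the top of the card.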

Provider: maas
Created: 2025-04-09