
YuhoLiang/facesyntheticsspigacaptioned

Source: Hugging Face. Updated 2026-03-22. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/YuhoLiang/facesyntheticsspigacaptioned
Description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: image_seg
    dtype: image
  - name: landmarks
    dtype: string
  - name: spiga
    sequence:
      sequence: float64
  - name: spiga_seg
    dtype: image
  - name: image_caption
    dtype: string
  splits:
  - name: train
    num_bytes: 31087489990.0
    num_examples: 100000
  download_size: 31011261945
  dataset_size: 31087489990.0
---
# Dataset Card for "face_synthetics_spiga_captioned"

This is a copy of the [Microsoft FaceSynthetics dataset with SPIGA-calculated landmark annotations](https://huggingface.co/datasets/pcuenq/face_synthetics_spiga), with additional BLIP-generated captions.

For a copy of the original FaceSynthetics dataset with no extra annotations, please refer to [pcuenq/face_synthetics](https://huggingface.co/datasets/pcuenq/face_synthetics).

Here is the code for parsing the dataset and generating the BLIP captions:

```py
from datasets import load_dataset
from transformers import pipeline

# Load the SPIGA-annotated source dataset and select its only split.
dataset_name = "pcuenq/face_synthetics_spiga"
faces = load_dataset(dataset_name)
faces = faces["train"]

# BLIP captioning pipeline; device=0 runs it on the first GPU.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-large", device=0)

def caption_image_data(example):
    # Caption one image and store the text in a new column.
    image = example["image"]
    image_caption = captioner(image)[0]['generated_text']
    example['image_caption'] = image_caption
    return example

faces_proc = faces.map(caption_image_data)
faces_proc.push_to_hub("multimodalart/face_synthetics_spiga_captioned")
```
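The captioning code in the card processes one image per call. As a rough sketch, not the method the dataset authors used, the same step could be written as a batched function for `Dataset.map(batched=True)`. The names `make_batched_captioner` and `fake_captioner` are hypothetical; `fake_captioner` is a stand-in that mimics the output shape an image-to-text pipeline is assumed to return for a list of inputs (one list of `{"generated_text": ...}` dicts per image), so the example runs without a GPU or model download.

```python
from typing import Any, Callable, Dict, List


def make_batched_captioner(
    captioner: Callable[[List[Any]], List[List[Dict[str, str]]]],
) -> Callable[[Dict[str, List[Any]]], Dict[str, List[str]]]:
    """Wrap an image-to-text callable for use with Dataset.map(batched=True)."""

    def caption_batch(batch: Dict[str, List[Any]]) -> Dict[str, List[str]]:
        # In batched mode each column arrives as a list, so batch["image"]
        # holds every image of the current batch.
        outputs = captioner(batch["image"])
        # Keep the first generated caption for each image.
        return {"image_caption": [out[0]["generated_text"] for out in outputs]}

    return caption_batch


def fake_captioner(images: List[Any]) -> List[List[Dict[str, str]]]:
    # Hypothetical stand-in for the BLIP pipeline (assumed output shape).
    return [[{"generated_text": f"caption for {img}"}] for img in images]


if __name__ == "__main__":
    caption_batch = make_batched_captioner(fake_captioner)
    print(caption_batch({"image": ["img_0", "img_1"]}))
```

With the real pipeline from the card, the call would look like `faces.map(make_batched_captioner(captioner), batched=True, batch_size=8)`; the batch size of 8 is an arbitrary illustrative value and would need tuning to the available GPU memory.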

Provider:
YuhoLiang