
pcuenq/face_synthetics_spiga

Source: Hugging Face. Updated 2023-03-20; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/pcuenq/face_synthetics_spiga
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: image_seg
    dtype: image
  - name: landmarks
    dtype: string
  - name: spiga
    sequence:
      sequence: float64
  - name: spiga_seg
    dtype: image
  splits:
  - name: train
    num_bytes: 31081737215.0
    num_examples: 100000
  download_size: 31009656222
  dataset_size: 31081737215.0
---

# Dataset Card for "face_synthetics_spiga"

This is a copy of the [Microsoft FaceSynthetics dataset](https://github.com/microsoft/FaceSynthetics) with [SPIGA](https://github.com/andresprados/SPIGA) landmark annotations. For a copy of the original FaceSynthetics dataset with no extra annotations, please refer to [pcuenq/face_synthetics](https://huggingface.co/pcuenq/face_synthetics).

Please refer to the original [license](LICENSE.txt), which we replicate in this repo. The SPIGA annotations were created by Hugging Face Inc. and are distributed under the MIT license.

This dataset was prepared using the code below. It iterates through the dataset to perform landmark detection using SPIGA, and then creates visualizations of the features. Visualization is performed using Matplotlib, rendering to memory buffers.

```Python
import numpy as np
from datasets import load_dataset
from spiga.inference.config import ModelConfig
from spiga.inference.framework import SPIGAFramework

dataset_name = "pcuenq/face_synthetics"
faces = load_dataset(dataset_name)
faces = faces["train"]

# ## Obtain SPIGA features

processor = SPIGAFramework(ModelConfig("300wpublic"))

# We obtain the bbox from the existing landmarks in the dataset.
# We could use `dlib`, but this should be faster.
# Note that the `landmarks` are stored as strings.

def parse_landmarks(landmarks_str):
    landmarks = landmarks_str.strip().split('\n')
    landmarks = [k.split(' ') for k in landmarks]
    landmarks = [(float(x), float(y)) for x, y in landmarks]
    return landmarks

def bbox_from_landmarks(landmarks_str):
    landmarks = parse_landmarks(landmarks_str)
    landmarks_x, landmarks_y = zip(*landmarks)

    x_min, x_max = min(landmarks_x), max(landmarks_x)
    y_min, y_max = min(landmarks_y), max(landmarks_y)
    width = x_max - x_min
    height = y_max - y_min

    # Give it a little room; I think it works anyway
    x_min -= 5
    y_min -= 5
    width += 10
    height += 10
    bbox = (x_min, y_min, width, height)
    return bbox

def spiga_process(example):
    image = example["image"]
    image = np.array(image)
    # BGR
    image = image[:, :, ::-1]

    bbox = bbox_from_landmarks(example["landmarks"])
    features = processor.inference(image, [bbox])
    landmarks = features["landmarks"][0]
    example["spiga"] = landmarks
    return example

# For some reason this map doesn't work with num_proc > 1 :(
# TODO: run inference on GPU
faces = faces.map(spiga_process)

# ## "Segmentation"
# We use bezier paths to draw contours and areas.

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.path import Path
import PIL.Image  # import the submodule explicitly so PIL.Image is always available

def get_patch(landmarks, color='lime', closed=False):
    contour = landmarks
    ops = [Path.MOVETO] + [Path.LINETO]*(len(contour)-1)
    facecolor = (0, 0, 0, 0)      # Transparent fill color, if open
    if closed:
        contour.append(contour[0])
        ops.append(Path.CLOSEPOLY)
        facecolor = color
    path = Path(contour, ops)
    return patches.PathPatch(path, facecolor=facecolor, edgecolor=color, lw=4)

# Draw to a buffer.

def conditioning_from_landmarks(landmarks, size=512):
    # Precisely control output image size
    dpi = 72
    fig, ax = plt.subplots(1, figsize=[size/dpi, size/dpi], tight_layout={'pad':0})
    fig.set_dpi(dpi)

    black = np.zeros((size, size, 3))
    ax.imshow(black)

    face_patch = get_patch(landmarks[0:17])
    l_eyebrow = get_patch(landmarks[17:22], color='yellow')
    r_eyebrow = get_patch(landmarks[22:27], color='yellow')
    nose_v = get_patch(landmarks[27:31], color='orange')
    nose_h = get_patch(landmarks[31:36], color='orange')
    l_eye = get_patch(landmarks[36:42], color='magenta', closed=True)
    r_eye = get_patch(landmarks[42:48], color='magenta', closed=True)
    outer_lips = get_patch(landmarks[48:60], color='cyan', closed=True)
    inner_lips = get_patch(landmarks[60:68], color='blue', closed=True)

    ax.add_patch(face_patch)
    ax.add_patch(l_eyebrow)
    ax.add_patch(r_eyebrow)
    ax.add_patch(nose_v)
    ax.add_patch(nose_h)
    ax.add_patch(l_eye)
    ax.add_patch(r_eye)
    ax.add_patch(outer_lips)
    ax.add_patch(inner_lips)

    plt.axis('off')

    fig.canvas.draw()
    buffer, (width, height) = fig.canvas.print_to_buffer()
    assert width == height
    assert width == size

    buffer = np.frombuffer(buffer, np.uint8).reshape((height, width, 4))
    buffer = buffer[:, :, 0:3]
    plt.close(fig)
    return PIL.Image.fromarray(buffer)

def spiga_segmentation(example):
    landmarks = example["spiga"]
    example['spiga_seg'] = conditioning_from_landmarks(landmarks)
    return example

faces = faces.map(spiga_segmentation, num_proc=16)

faces.push_to_hub(f"{dataset_name}_spiga")
```
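To make the bounding-box step of the preparation code concrete, here is a minimal, dependency-free sketch of the same parse-then-pad logic applied to a made-up landmark string. The three-point `sample` is hypothetical (real rows hold one "x y" line per landmark), and the `pad` parameter is an illustrative generalization of the fixed 5-pixel margin used in the script, not part of the original code. It also splits on any whitespace rather than a single space, which is a small deviation from the original.

```python
def parse_landmarks(landmarks_str):
    """Parse newline-separated "x y" pairs into a list of (x, y) float tuples."""
    rows = landmarks_str.strip().split("\n")
    return [tuple(float(v) for v in row.split()) for row in rows]


def bbox_from_landmarks(landmarks_str, pad=5):
    """Return (x_min, y_min, width, height), enlarged by `pad` pixels on each side.

    `pad` is a hypothetical parameter; the original script hard-codes 5.
    """
    xs, ys = zip(*parse_landmarks(landmarks_str))
    x_min, y_min = min(xs), min(ys)
    width = max(xs) - x_min
    height = max(ys) - y_min
    return (x_min - pad, y_min - pad, width + 2 * pad, height + 2 * pad)


# Hypothetical sample standing in for one row's `landmarks` string.
sample = "100.0 120.0\n200.0 150.0\n150.0 260.0\n"
bbox = bbox_from_landmarks(sample)
print(bbox)
```

The resulting tuple uses the `(x, y, width, height)` layout that the script passes to `processor.inference`.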
Provider:
pcuenq
Summary of original information

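Dataset overview

Dataset name

  • Name: face_synthetics_spiga

Dataset features

  • image: image data
  • image_seg: image segmentation data
  • landmarks: landmark data, stored as a string
  • spiga: nested sequence (a sequence of sequences) of float64 values
  • spiga_seg: image data (the "segmentation" visualization drawn from the SPIGA landmarks)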
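As a rough illustration of how the `spiga` feature can be consumed, the sketch below splits a landmark list into the nine groups that the preparation code draws. The index ranges are copied from the slices in that code; the assumption that each row holds 68 points is inferred from those slices rather than stated by the dataset card, and `split_landmarks` and the `dummy` data are hypothetical names introduced only for this example.

```python
# Index ranges taken from the slices used in the dataset-preparation code.
LANDMARK_GROUPS = {
    "face": (0, 17),
    "l_eyebrow": (17, 22),
    "r_eyebrow": (22, 27),
    "nose_v": (27, 31),
    "nose_h": (31, 36),
    "l_eye": (36, 42),
    "r_eye": (42, 48),
    "outer_lips": (48, 60),
    "inner_lips": (60, 68),
}


def split_landmarks(landmarks):
    """Map each group name to its sub-list of [x, y] points."""
    return {
        name: landmarks[start:stop]
        for name, (start, stop) in LANDMARK_GROUPS.items()
    }


# Placeholder points standing in for one row's `spiga` value (assumed 68 points).
dummy = [[float(i), float(i)] for i in range(68)]
groups = split_landmarks(dummy)

for name, points in groups.items():
    print(f"{name}: {len(points)} points")
```

Because slicing returns new lists, the groups can be modified (for example, closed into polygons) without altering the original landmark list.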

Dataset splits

  • train: training set
    • Number of examples: 100000
    • Size: 31081737215 bytes

Dataset size

  • Download size: 31009656222 bytes
  • Dataset size: 31081737215 bytes
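For a sense of scale, the byte counts above can be converted to more readable units. This is plain arithmetic on the figures listed in the metadata; the constant names are chosen here for illustration only.

```python
# Figures copied from the dataset metadata.
NUM_EXAMPLES = 100_000
DATASET_BYTES = 31_081_737_215
DOWNLOAD_BYTES = 31_009_656_222

GIB = 2 ** 30  # bytes per gibibyte
KIB = 2 ** 10  # bytes per kibibyte

dataset_gib = DATASET_BYTES / GIB
download_gib = DOWNLOAD_BYTES / GIB
avg_kib_per_example = DATASET_BYTES / NUM_EXAMPLES / KIB

print(f"dataset size:  {dataset_gib:.2f} GiB")
print(f"download size: {download_gib:.2f} GiB")
print(f"average size per example: {avg_kib_per_example:.1f} KiB")
```

The download and on-disk sizes are nearly identical, so anyone fetching the full train split should plan for roughly the same amount of bandwidth and storage.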