enterprise-explorers/face_synthetics
Source: Hugging Face. Updated 2023-03-13; indexed 2025-05-31.
Download link:
https://hf-mirror.com/datasets/enterprise-explorers/face_synthetics
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: image_seg
    dtype: image
  - name: landmarks
    dtype: string
  splits:
  - name: train
    num_bytes: 33730885609.0
    num_examples: 100000
  download_size: 34096881533
  dataset_size: 33730885609.0
---
# Dataset Card for `face_synthetics`
This is a copy of the [Microsoft FaceSynthetics dataset](https://github.com/microsoft/FaceSynthetics), uploaded to Hugging Face Datasets for convenience.
Please refer to the original [license](LICENSE.txt), which we replicate in this repo.
The dataset was uploaded using the following code, which assumes the original `zip` file was uncompressed to `/data/microsoft_face_synthetics`:
```Python
from pathlib import Path

from datasets import Dataset
from PIL import Image

# Directory where the original zip file was extracted.
face_synthetics = Path("/data/microsoft_face_synthetics")


def entry_for_id(entry_id):
    # Integer ids are zero-padded to six digits to match the file names.
    if isinstance(entry_id, int):
        entry_id = f"{entry_id:06}"
    image = Image.open(face_synthetics / f"{entry_id}.png")
    image_seg = Image.open(face_synthetics / f"{entry_id}_seg.png")
    # Landmarks are stored as the raw text of the file.
    with open(face_synthetics / f"{entry_id}_ldmks.txt") as f:
        landmarks = f.read()
    return {
        "image": image,
        "image_seg": image_seg,
        "landmarks": landmarks,
    }


def generate_entries():
    # The dataset contains 100,000 samples, ids 0 through 99999.
    for x in range(100000):
        yield entry_for_id(x)


ds = Dataset.from_generator(generate_entries)
ds.push_to_hub('pcuenq/face_synthetics')
```
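Because the upload script stores each `_ldmks.txt` file as a raw string, the `landmarks` feature has to be parsed before use. The following is a minimal sketch, not part of the original card. It assumes each non-empty line holds two whitespace-separated numbers (x, then y); that layout is an assumption and should be checked against the original repository. The helper name `parse_landmarks` is hypothetical.

```python
def parse_landmarks(text):
    """Parse a landmarks string into a list of (x, y) float tuples.

    Assumption: one landmark per line, written as two
    whitespace-separated numbers. Blank lines are skipped.
    """
    points = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines, e.g. a trailing newline
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 values, got {len(fields)}"
            )
        points.append((float(fields[0]), float(fields[1])))
    return points
```

For a loaded example, this would be called as `parse_landmarks(example["landmarks"])`. Returning plain tuples keeps the sketch free of extra dependencies; converting the result to an array is a one-line step if needed.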
Note that the segmented images in `image_seg` appear to be black because each pixel contains a number from `0` to `18` corresponding to one of the segmentation categories; see the [original README]() for details. We haven't created visualization code yet.
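One possible way to make the masks visible is to map each class index to a distinct color. The sketch below is illustrative only and is not the dataset authors' code. It assumes the mask has been converted to a 2D integer array (for example with `numpy.array(example["image_seg"])`, assuming a single-channel image), and it paints any value outside `0` to `18` white. The palette and the helper names `build_palette` and `colorize_mask` are hypothetical choices.

```python
import colorsys

import numpy as np

NUM_CLASSES = 19  # class ids 0..18, as described in the note above


def build_palette(num_classes=NUM_CLASSES):
    """Return a (num_classes + 1, 3) uint8 lookup table.

    Row 0 is black, rows 1..num_classes-1 are evenly spaced hues,
    and the last row is a white fallback for unexpected values.
    """
    colors = [(0, 0, 0)]  # class 0 drawn in black (arbitrary choice)
    for i in range(1, num_classes):
        hue = (i - 1) / (num_classes - 1)
        r, g, b = colorsys.hsv_to_rgb(hue, 0.8, 1.0)
        colors.append((round(r * 255), round(g * 255), round(b * 255)))
    colors.append((255, 255, 255))  # fallback color
    return np.array(colors, dtype=np.uint8)


def colorize_mask(mask):
    """Map a 2D array of class ids to an (H, W, 3) uint8 RGB array."""
    palette = build_palette()
    idx = np.asarray(mask).astype(np.int64)
    # Send out-of-range values to the fallback row of the palette.
    idx = np.where((idx >= 0) & (idx < NUM_CLASSES), idx, NUM_CLASSES)
    return palette[idx]
```

The returned array can be turned back into an image with `PIL.Image.fromarray(colorize_mask(mask))`. A lookup table keeps the mapping vectorized, so a full-resolution mask is colored in a single indexing operation.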
Provider:
enterprise-explorers