MihaiPopa-1/minecraft-skins-1.1m-deduped-64x64-1.1
Source: Hugging Face. Updated 2026-04-03; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/MihaiPopa-1/minecraft-skins-1.1m-deduped-64x64-1.1
Description:
---
task_categories:
- text-to-image
- image-classification
- unconditional-image-generation
tags:
- minecraft
- minecraft-skins
- de-duped
- deduped
- zip-dataset
- zip
- zip-archive
size_categories:
- 1M<n<10M
license: apache-2.0
pretty_name: Minecraft Skins 1.1M Deduped (64x64 Edition) 1.1
---
# Minecraft Skins 1.1M Deduped (64x64 Edition) 1.1!
This is [Minecraft Skins 1.1M Deduped](https://huggingface.co/datasets/MihaiPopa-1/minecraft-skins-1.1m-deduped-64x64) with troll skins filtered out.
Format is a 2.6 GB ZIP archive containing 64x64 PNG skin files.
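
Since everything ships as one ZIP of small PNGs, the skins can be read straight from the archive without unpacking over a million files. The following is a minimal sketch, not part of the original pipeline: the helper name `iter_skins` and its `limit` parameter are made up for illustration, and the archive path is whatever you saved the download as.

```python
import io
import zipfile

import numpy as np
from PIL import Image


def iter_skins(zip_path, limit=None):
    """Yield (entry_name, RGBA uint8 array) for PNG entries in a ZIP archive.

    Hypothetical helper: `zip_path` is the local path of the downloaded
    archive, `limit` optionally caps how many skins are read.
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = [n for n in zf.namelist() if n.lower().endswith(".png")]
        for name in names[:limit]:
            with zf.open(name) as fh:
                # Read the whole entry into memory; each skin is a tiny file.
                img = Image.open(io.BytesIO(fh.read())).convert("RGBA")
            yield name, np.asarray(img)
```

For example, `for name, arr in iter_skins("skins.zip", limit=8): print(name, arr.shape)` should print a `(64, 64, 4)` shape for each skin, assuming `skins.zip` is the local copy of the archive.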
# Tools used
Python with PIL (Pillow) and NumPy, run on Google Colab (free CPU tier)
# How it was made
1. Loaded [Minecraft Skins 1.1M Deduped](https://huggingface.co/datasets/MihaiPopa-1/minecraft-skins-1.1m-deduped-64x64).
2. Filtered out troll skins (such as single-color ones): only skins whose pixel standard deviation is at least 1 were kept (`std >= 1`).
3. Result is 1,106,552 real 64x64 skins.
4. Output is given in a 2.6 GB ZIP archive.
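
To make the threshold in step 2 concrete, here is a small illustration on synthetic images. `global_std` mirrors the computation in the filtering script from the Code section; `max_channel_std` is a hypothetical per-channel variant added only for comparison and was not used to build this dataset. Because the script takes the standard deviation over all R, G and B values together, a uniform gray image scores 0, while a uniform non-gray image (pure red, for instance) scores well above the threshold.

```python
import numpy as np
from PIL import Image

STD_THRESHOLD = 1  # same value as in the filtering script


def global_std(img):
    """Std over every R, G and B value at once, as the filtering script does."""
    return float(np.array(img.convert("RGB")).std())


def max_channel_std(img):
    """Hypothetical variant: std of each channel separately, then the largest."""
    arr = np.array(img.convert("RGB")).reshape(-1, 3)
    return float(arr.std(axis=0).max())


gray = Image.new("RGB", (64, 64), (128, 128, 128))  # uniform gray
red = Image.new("RGB", (64, 64), (255, 0, 0))       # uniform, but not gray
rng = np.random.default_rng(0)
noisy = Image.fromarray(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))

for label, img in [("gray", gray), ("red", red), ("noisy", noisy)]:
    kept = global_std(img) >= STD_THRESHOLD
    print(f"{label:5s} global={global_std(img):7.2f} "
          f"per-channel max={max_channel_std(img):7.2f} kept={kept}")
```

Whether this difference matters for the actual skins has not been checked here; it is only something to keep in mind if you tighten the filter yourself.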
You can use this to train your own skin generation model (I'm going with a VQ-VAE myself!)
# Future improvements for version 2
1. Captioning (with Florence 2 Base)
2. ~~Filtering troll skins (skins that are formed of just a single color)~~ already done!
# Code
Code to reproduce it (all written by Claude 4.6 Sonnet):
Run the same code as for the original dataset first, then filter by standard deviation:
```python
import os
import numpy as np
from PIL import Image
from tqdm import tqdm
from multiprocessing import Pool, cpu_count
import shutil

SKIN_DIR = "/content/filtered_skins"
OUTPUT_DIR = "/content/filtered_skins_plus"
STD_THRESHOLD = 1  # skins with a pixel std below this are dropped

os.makedirs(OUTPUT_DIR, exist_ok=True)
skin_files = [f for f in os.listdir(SKIN_DIR) if f.endswith(".png")]


def process_skin(filename):
    """Copy one skin to OUTPUT_DIR if its pixel std reaches the threshold."""
    path = os.path.join(SKIN_DIR, filename)
    try:
        img = Image.open(path).convert("RGB")
        # Std over all R, G and B values of the image taken together.
        std = np.array(img).std()
        if std >= STD_THRESHOLD:
            shutil.copy2(path, os.path.join(OUTPUT_DIR, filename))
            return "kept"
        return "filtered"
    except Exception:
        # Unreadable or corrupt files are counted but not copied.
        return "error"


print(f"CPUs available: {cpu_count()}")
with Pool(processes=cpu_count()) as pool:
    results = list(tqdm(
        pool.imap(process_skin, skin_files, chunksize=100),
        total=len(skin_files)
    ))

kept = results.count("kept")
filtered = results.count("filtered")
errors = results.count("error")
print(f"\nKept : {kept:,}")
print(f"Filtered : {filtered:,}")
print(f"Errors : {errors:,}")
```
Then zip the filtered skins:
```python
import os
import zipfile
from tqdm import tqdm

INPUT_DIR = "/content/filtered_skins_plus"
ZIP_PATH = "/content/minecraft_skins_64x64_v1_1.zip"

skin_files = [f for f in os.listdir(INPUT_DIR) if f.endswith(".png")]
print(f"Skins to zip: {len(skin_files):,}")

with zipfile.ZipFile(ZIP_PATH, "w", zipfile.ZIP_DEFLATED, compresslevel=1) as zf:
    for filename in tqdm(skin_files):
        zf.write(os.path.join(INPUT_DIR, filename), arcname=filename)

size_mb = os.path.getsize(ZIP_PATH) / 1024 / 1024
print(f"\nDone!")
print(f"Skins zipped : {len(skin_files):,}")
print(f"ZIP size : {size_mb:.1f} MB")
```
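
As an optional sanity check after zipping, one could count the PNG entries and spot-check their dimensions. This is a sketch and not part of the original code: `check_archive` and its `sample` parameter are illustrative names, and `EXPECTED_COUNT` is simply the skin count stated in this card.

```python
import io
import zipfile

from PIL import Image

EXPECTED_COUNT = 1_106_552  # skin count stated in this card
EXPECTED_SIZE = (64, 64)


def check_archive(zip_path, sample=1000):
    """Return (png_count, sampled entry names whose size is not 64x64).

    Hypothetical helper: only the first `sample` entries are opened, so
    the check stays fast on an archive with over a million files.
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = [n for n in zf.namelist() if n.endswith(".png")]
        wrong_size = []
        for name in names[:sample]:
            with zf.open(name) as fh:
                # Image.open reads only the header here, enough for .size.
                if Image.open(io.BytesIO(fh.read())).size != EXPECTED_SIZE:
                    wrong_size.append(name)
    return len(names), wrong_size
```

Calling `count, bad = check_archive(ZIP_PATH)` on the archive produced above should give a `count` equal to `EXPECTED_COUNT` and an empty `bad` list if the run matched the published one.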
Provider:
MihaiPopa-1



