
MihaiPopa-1/minecraft-skins-1.1m-deduped-64x64-1.1

Hugging Face, updated 2026-04-03, indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/MihaiPopa-1/minecraft-skins-1.1m-deduped-64x64-1.1
Description:
---
task_categories:
- text-to-image
- image-classification
- unconditional-image-generation
tags:
- minecraft
- minecraft-skins
- de-duped
- deduped
- zip-dataset
- zip
- zip-archive
size_categories:
- 1M<n<10M
license: apache-2.0
pretty_name: Minecraft Skins 1.1M Deduped (64x64 Edition) 1.1
---

# Minecraft Skins 1.1M Deduped (64x64 Edition) 1.1!

[Minecraft Skins 1.1M Deduped](https://huggingface.co/datasets/MihaiPopa-1/minecraft-skins-1.1m-deduped-64x64), but with troll skins filtered out.

The format is a 2.6 GB ZIP archive containing 64x64 PNG skin files.

# Tools used

PIL Image (Python) and Google Colab (free CPU tier)

# How it was made

1. Loaded [Minecraft Skins 1.1M Deduped](https://huggingface.co/datasets/MihaiPopa-1/minecraft-skins-1.1m-deduped-64x64).
2. Filtered out troll skins (such as single-color ones): only skins whose pixel standard deviation is at least 1 (std >= 1) were kept.
3. The result is 1,106,552 real 64x64 skins.
4. The output is a 2.6 GB ZIP archive.

This can be used to make your own skin generation model (but I'm going with VQ-VAE anyway!)

# Future improvements for version 2

1. Captioning (with Florence 2 Base)
2. ~~Filtering troll skins (skins that are formed of just a single color)~~ already done!
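To make step 2 of "How it was made" concrete, here is a minimal sketch of the standard-deviation test applied to three synthetic 64x64 images. The images are made up for illustration and are not taken from the dataset. `global_std` is meant to mirror the measure used in the filtering script, while `per_channel_std` is a hypothetical stricter variant that is not part of the original pipeline.

```python
import numpy as np
from PIL import Image

STD_THRESHOLD = 1  # same value as in the card's filtering script


def global_std(img):
    """Std over all RGB values at once, as the card's filter computes it."""
    return float(np.array(img.convert("RGB")).std())


def per_channel_std(img):
    """Hypothetical stricter variant: largest std of any single channel."""
    arr = np.array(img.convert("RGB")).reshape(-1, 3)
    return float(arr.std(axis=0).max())


def is_kept(img, threshold=STD_THRESHOLD):
    """True if the skin would pass the card's `std >= threshold` test."""
    return global_std(img) >= threshold


# Synthetic stand-ins for skins (assumptions, not dataset samples).
flat_gray = Image.new("RGB", (64, 64), (120, 120, 120))
flat_red = Image.new("RGB", (64, 64), (255, 0, 0))
rng = np.random.default_rng(0)
noisy = Image.fromarray(rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8))

for name, img in [("flat_gray", flat_gray), ("flat_red", flat_red), ("noisy", noisy)]:
    print(name, round(global_std(img), 2), round(per_channel_std(img), 2), is_kept(img))
```

One thing the sketch suggests: because the standard deviation is taken over all channel values together, a flat gray image scores 0 and is rejected, but a flat saturated color such as pure red scores roughly 120 and passes, even though each of its channels is constant. Whether such skins actually occur in the source data is not something this card states.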
# Code

Code to reproduce it (all by Claude 4.6 Sonnet):

Same code as before, then:

```python
import os
import numpy as np
from PIL import Image
from tqdm import tqdm
from multiprocessing import Pool, cpu_count
import shutil

SKIN_DIR = "/content/filtered_skins"
OUTPUT_DIR = "/content/filtered_skins_plus"
STD_THRESHOLD = 1  # skins with a pixel std below this are treated as troll skins

os.makedirs(OUTPUT_DIR, exist_ok=True)
skin_files = [f for f in os.listdir(SKIN_DIR) if f.endswith(".png")]


def process_skin(filename):
    """Copy one skin to OUTPUT_DIR if its pixel std reaches the threshold."""
    path = os.path.join(SKIN_DIR, filename)
    try:
        img = Image.open(path).convert("RGB")
        # Std over all RGB values of the image (alpha is dropped by convert).
        std = np.array(img).std()
        if std >= STD_THRESHOLD:
            shutil.copy2(path, os.path.join(OUTPUT_DIR, filename))
            return "kept"
        return "filtered"
    except Exception:
        # Unreadable or corrupt files are counted, not copied.
        return "error"


print(f"CPUs available: {cpu_count()}")

# Process the skins in parallel, one worker per available CPU.
with Pool(processes=cpu_count()) as pool:
    results = list(tqdm(
        pool.imap(process_skin, skin_files, chunksize=100),
        total=len(skin_files)
    ))

kept = results.count("kept")
filtered = results.count("filtered")
errors = results.count("error")
print(f"\nKept : {kept:,}")
print(f"Filtered : {filtered:,}")
print(f"Errors : {errors:,}")
```

then:

```python
import os
import zipfile
from tqdm import tqdm

INPUT_DIR = "/content/filtered_skins_plus"
ZIP_PATH = "/content/minecraft_skins_64x64_v1_1.zip"

skin_files = [f for f in os.listdir(INPUT_DIR) if f.endswith(".png")]
print(f"Skins to zip: {len(skin_files):,}")

# compresslevel=1 favors speed; PNG data is already compressed.
with zipfile.ZipFile(ZIP_PATH, "w", zipfile.ZIP_DEFLATED, compresslevel=1) as zf:
    for filename in tqdm(skin_files):
        # arcname stores each skin at the archive root, without the directory path.
        zf.write(os.path.join(INPUT_DIR, filename), arcname=filename)

size_mb = os.path.getsize(ZIP_PATH) / 1024 / 1024
print(f"\nDone!")
print(f"Skins zipped : {len(skin_files):,}")
print(f"ZIP size : {size_mb:.1f} MB")
```
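Since the zipping script stores every PNG at the archive root, the skins can in principle be read straight from the ZIP without extracting it. The following is a minimal sketch under that assumption. `iter_skins` and `demo_batch_shape` are hypothetical helper names, and the demo builds a tiny throwaway archive of synthetic images instead of touching the real 2.6 GB file.

```python
import io
import os
import tempfile
import zipfile

import numpy as np
from PIL import Image


def iter_skins(zip_path, limit=None):
    """Yield (name, uint8 array of shape (H, W, 4)) for PNG entries in the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        names = [n for n in zf.namelist() if n.endswith(".png")]
        for name in names[:limit]:
            # Decode from memory; RGBA keeps the transparency that skins may use.
            img = Image.open(io.BytesIO(zf.read(name))).convert("RGBA")
            yield name, np.asarray(img)


def demo_batch_shape():
    """Build a small synthetic archive and stack two skins into one batch."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "demo_skins.zip")  # stand-in for the real archive
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for i in range(3):
                buf = io.BytesIO()
                Image.new("RGBA", (64, 64), (i * 40, 0, 0, 255)).save(buf, format="PNG")
                zf.writestr(f"skin_{i}.png", buf.getvalue())
        batch = np.stack([arr for _, arr in iter_skins(zip_path, limit=2)])
        return batch.shape


print(demo_batch_shape())
```

For the real dataset, `zip_path` would point at wherever the downloaded archive was saved. Reading lazily with a `limit` keeps memory use small, which matters on a free Colab CPU runtime.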

Provided by: MihaiPopa-1