CyberHarem/kurata_mashiro_bangdream
Source: Hugging Face. Updated 2024-01-15; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/CyberHarem/kurata_mashiro_bangdream
Resource description:
---
license: mit
task_categories:
- text-to-image
tags:
- art
- not-for-all-audiences
size_categories:
- n<1K
---
# Dataset of kurata_mashiro/倉田ましろ (BanG Dream!)
This is the dataset of kurata_mashiro/倉田ましろ (BanG Dream!), containing 230 images and their tags.
The core tags of this character are `bangs, blue_eyes, hair_between_eyes, short_hair, breasts, white_hair`, which are pruned in this dataset.
Images are crawled from many sites (e.g. danbooru, pixiv, zerochan). The auto-crawling system is powered by the [DeepGHS Team](https://github.com/deepghs) ([huggingface organization](https://huggingface.co/deepghs)).
## List of Packages
| Name | Images | Size | Download | Type | Description |
|:-----------------|---------:|:-----------|:--------------------------------------------------------------------------------------------------------------------------|:-----------|:---------------------------------------------------------------------|
| raw | 230 | 344.30 MiB | [Download](https://huggingface.co/datasets/CyberHarem/kurata_mashiro_bangdream/resolve/main/dataset-raw.zip) | Waifuc-Raw | Raw data with meta information (min edge aligned to 1400 if larger). |
| 800 | 230 | 185.56 MiB | [Download](https://huggingface.co/datasets/CyberHarem/kurata_mashiro_bangdream/resolve/main/dataset-800.zip) | IMG+TXT | dataset with the shorter side not exceeding 800 pixels. |
| stage3-p480-800 | 564 | 404.90 MiB | [Download](https://huggingface.co/datasets/CyberHarem/kurata_mashiro_bangdream/resolve/main/dataset-stage3-p480-800.zip) | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels. |
| 1200 | 230 | 298.91 MiB | [Download](https://huggingface.co/datasets/CyberHarem/kurata_mashiro_bangdream/resolve/main/dataset-1200.zip) | IMG+TXT | dataset with the shorter side not exceeding 1200 pixels. |
| stage3-p480-1200 | 564 | 606.77 MiB | [Download](https://huggingface.co/datasets/CyberHarem/kurata_mashiro_bangdream/resolve/main/dataset-stage3-p480-1200.zip) | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels. |
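The download links in the table above share one URL pattern, so a link can be derived from the package name. The following is a minimal sketch of that pattern; the helper names `package_filename` and `package_url` and the `endpoint` parameter are hypothetical and are not part of any library.

```python
# Repository and package names are taken from the table above.
REPO_ID = 'CyberHarem/kurata_mashiro_bangdream'
PACKAGES = ('raw', '800', 'stage3-p480-800', '1200', 'stage3-p480-1200')


def package_filename(name):
    """Return the archive filename for a package name (hypothetical helper)."""
    if name not in PACKAGES:
        raise ValueError(f'unknown package: {name!r}')
    return f'dataset-{name}.zip'


def package_url(name, endpoint='https://huggingface.co'):
    """Build the download URL following the pattern used in the table."""
    return f'{endpoint}/datasets/{REPO_ID}/resolve/main/{package_filename(name)}'


print(package_url('800'))
```

The value returned by `package_filename` is the same string that the loading example in the next section passes as `filename` to `hf_hub_download`.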
### Load Raw Dataset with Waifuc
We provide the raw dataset (including tagged images) for loading with [waifuc](https://deepghs.github.io/waifuc/main/tutorials/installation/index.html). To use it, run the following code:
```python
import os
import zipfile

from huggingface_hub import hf_hub_download
from waifuc.source import LocalSource

# download raw archive file
zip_file = hf_hub_download(
    repo_id='CyberHarem/kurata_mashiro_bangdream',
    repo_type='dataset',
    filename='dataset-raw.zip',
)

# extract files to your directory
dataset_dir = 'dataset_dir'
os.makedirs(dataset_dir, exist_ok=True)
with zipfile.ZipFile(zip_file, 'r') as zf:
    zf.extractall(dataset_dir)

# load the dataset with waifuc
source = LocalSource(dataset_dir)
for item in source:
    print(item.image, item.meta['filename'], item.meta['tags'])
```
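The `IMG+TXT` packages can be read without waifuc. The sketch below assumes that, after extraction, each image sits next to a `.txt` file with the same stem holding comma-separated tags. That layout is an assumption based on the package type name and is not stated in this card, and `iter_img_txt_pairs` is a hypothetical helper.

```python
from pathlib import Path

# Image extensions to look for (an assumption; adjust to the actual archive).
IMAGE_SUFFIXES = {'.png', '.jpg', '.jpeg', '.webp'}


def iter_img_txt_pairs(dataset_dir):
    """Yield (image_path, tags) for every image that has a same-stem .txt file."""
    for image_path in sorted(Path(dataset_dir).rglob('*')):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        tag_path = image_path.with_suffix('.txt')
        if not tag_path.is_file():
            continue  # skip images without a tag file
        text = tag_path.read_text(encoding='utf-8')
        # Assumed format: tags separated by commas.
        tags = [tag.strip() for tag in text.split(',') if tag.strip()]
        yield image_path, tags
```

To use it, extract `dataset-800.zip` (or another `IMG+TXT` archive) in the same way as the raw archive above and pass the extraction directory to `iter_img_txt_pairs`.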
## List of Clusters
Tag clustering results. Some outfits may be discoverable here.
### Raw Text Version
| # | Samples | Img-1 | Img-2 | Img-3 | Img-4 | Img-5 | Tags |
|----:|----------:|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 0 | 5 |  |  |  |  |  | 1girl, looking_at_viewer, solo, white_headwear, white_jacket, black_gloves, blush, long_sleeves, open_mouth, simple_background, virtual_youtuber, white_background, black_footwear, earrings, full_body, long_hair, standing, white_shirt, white_socks, :d, aqua_eyes, black_ribbon, blue_hair, boots, green_eyes, holding_microphone, mini_hat, neck_ribbon, white_skirt |
| 1 | 14 |  |  |  |  |  | 1girl, looking_at_viewer, solo, white_headwear, black_gloves, long_sleeves, white_jacket, white_shirt, tilted_headwear, white_skirt, blush, earrings, open_mouth, black_ribbon, blue_butterfly, half_gloves, outstretched_arm, buttons, mini_hat, smile |
| 2 | 7 |  |  |  |  |  | 1girl, blush, solo, earrings, looking_at_viewer, long_sleeves, white_background, black_gloves, blue_hair, jacket, open_mouth, shirt, simple_background, smile, blue_butterfly, closed_mouth, hair_ornament, mini_hat, virtual_youtuber |
| 3 | 24 |  |  |  |  |  | 1girl, solo, blush, looking_at_viewer, long_sleeves, white_sailor_collar, white_background, open_mouth, simple_background, neckerchief, pleated_skirt, smile, upper_body, black_shirt, blue_serafuku |
| 4 | 6 |  |  |  |  |  | 1girl, blush, long_sleeves, looking_at_viewer, solo, blue_dress, neck_ribbon, vertical-striped_dress, blue_hair, simple_background, white_background, white_shirt, blue_ribbon, collared_shirt, open_mouth, standing |
| 5 | 7 |  |  |  |  |  | 1girl, blush, looking_at_viewer, navel, nipples, solo, collarbone, pussy, stomach, completely_nude, large_breasts, medium_hair, sweat, wet, closed_mouth, groin, shiny_skin, simple_background, smile, standing, aqua_eyes, blue_hair, cowboy_shot, grey_background, hand_up, medium_breasts, mosaic_censoring, open_mouth |
### Table Version
| # | Samples | Img-1 | Img-2 | Img-3 | Img-4 | Img-5 | 1girl | looking_at_viewer | solo | white_headwear | white_jacket | black_gloves | blush | long_sleeves | open_mouth | simple_background | virtual_youtuber | white_background | black_footwear | earrings | full_body | long_hair | standing | white_shirt | white_socks | :d | aqua_eyes | black_ribbon | blue_hair | boots | green_eyes | holding_microphone | mini_hat | neck_ribbon | white_skirt | tilted_headwear | blue_butterfly | half_gloves | outstretched_arm | buttons | smile | jacket | shirt | closed_mouth | hair_ornament | white_sailor_collar | neckerchief | pleated_skirt | upper_body | black_shirt | blue_serafuku | blue_dress | vertical-striped_dress | blue_ribbon | collared_shirt | navel | nipples | collarbone | pussy | stomach | completely_nude | large_breasts | medium_hair | sweat | wet | groin | shiny_skin | cowboy_shot | grey_background | hand_up | medium_breasts | mosaic_censoring |
|----:|----------:|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------|:--------|:--------------------|:-------|:-----------------|:---------------|:---------------|:--------|:---------------|:-------------|:--------------------|:-------------------|:-------------------|:-----------------|:-----------|:------------|:------------|:-----------|:--------------|:--------------|:-----|:------------|:---------------|:------------|:--------|:-------------|:---------------------|:-----------|:--------------|:--------------|:------------------|:-----------------|:--------------|:-------------------|:----------|:--------|:---------|:--------|:---------------|:----------------|:----------------------|:--------------|:----------------|:-------------|:--------------|:----------------|:-------------|:-------------------------|:--------------|:-----------------|:--------|:----------|:-------------|:--------|:----------|:------------------|:----------------|:--------------|:--------|:------|:--------|:-------------|:--------------|:------------------|:----------|:-----------------|:-------------------|
| 0 | 5 |  |  |  |  |  | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 1 | 14 |  |  |  |  |  | X | X | X | X | X | X | X | X | X | | | | | X | | | | X | | | | X | | | | | X | | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 2 | 7 |  |  |  |  |  | X | X | X | | | X | X | X | X | X | X | X | | X | | | | | | | | | X | | | | X | | | | X | | | | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 3 | 24 |  |  |  |  |  | X | X | X | | | | X | X | X | X | | X | | | | | | | | | | | | | | | | | | | | | | | X | | | | | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | |
| 4 | 6 |  |  |  |  |  | X | X | X | | | | X | X | X | X | | X | | | | | X | X | | | | | X | | | | | X | | | | | | | | | | | | | | | | | | X | X | X | X | | | | | | | | | | | | | | | | | |
| 5 | 7 |  |  |  |  |  | X | X | X | | | | X | | X | X | | | | | | | X | | | | X | | X | | | | | | | | | | | | X | | | X | | | | | | | | | | | | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
Provider:
CyberHarem
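Summary of original information
kurata_mashiro/倉田ましろ (BanG Dream!) dataset
Dataset overview
The dataset contains 230 images of kurata_mashiro/倉田ましろ (BanG Dream!) and their tags. The core tags are: bangs, blue_eyes, hair_between_eyes, short_hair, breasts, white_hair.
List of dataset packages
| Name | Images | Size | Type | Description |
|---|---|---|---|---|
| raw | 230 | 344.30 MiB | Waifuc-Raw | Raw data with meta information (min edge aligned to 1400 if larger). |
| 800 | 230 | 185.56 MiB | IMG+TXT | Dataset with the shorter side not exceeding 800 pixels. |
| stage3-p480-800 | 564 | 404.90 MiB | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels. |
| 1200 | 230 | 298.91 MiB | IMG+TXT | Dataset with the shorter side not exceeding 1200 pixels. |
| stage3-p480-1200 | 564 | 606.77 MiB | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels. |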
Collected summary
Dataset introduction

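Construction method
The CyberHarem/kurata_mashiro_bangdream dataset was built with an automated crawling system that collected 230 images of the character Kurata Mashiro (倉田ましろ) from multiple image sites such as Danbooru, Pixiv, and Zerochan. In the raw package, images whose shorter edge exceeds 1400 pixels are aligned to 1400, and the character's core tags (bangs, blue_eyes, and so on) are pruned from the tag lists. The crawling system is powered by the DeepGHS team, and the result is intended as source material for text-to-image generation tasks.
Usage
Users can download the raw or preprocessed packages directly from the Hugging Face Hub. The raw package is compatible with the waifuc toolchain for local loading and further processing. The provided Python example extracts the archive and iterates over the images and their tag metadata, so it can be integrated into an existing training pipeline. For generative model development, choose a resolution package that fits the available compute, or use the clustering results to select a targeted subset of the data.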
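Selecting a targeted subset by tags, as suggested above, can be sketched as a simple set filter. This is a minimal illustration under stated assumptions: `select_by_tags` is a hypothetical helper, and the `records` list is made-up sample data in the form (filename, tags), not actual dataset content. The `excluded` argument is one way to drop items carrying unwanted tags, which matters because the card is marked `not-for-all-audiences`.

```python
def select_by_tags(records, required=(), excluded=()):
    """Return filenames whose tags include all `required` and none of `excluded`."""
    required, excluded = set(required), set(excluded)
    selected = []
    for filename, tags in records:
        tag_set = set(tags)
        if required <= tag_set and not (excluded & tag_set):
            selected.append(filename)
    return selected


# Made-up sample records for illustration only.
records = [
    ('a.png', ['1girl', 'solo', 'blue_serafuku', 'pleated_skirt']),
    ('b.png', ['1girl', 'solo', 'white_jacket', 'mini_hat']),
    ('c.png', ['1girl', 'solo', 'completely_nude']),
]

print(select_by_tags(records, required={'blue_serafuku'}))    # ['a.png']
print(select_by_tags(records, excluded={'completely_nude'}))  # ['a.png', 'b.png']
```

In practice the records could come from iterating a waifuc `LocalSource` and reading `item.meta['filename']` and `item.meta['tags']`, as in the loading example in the dataset card.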
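Background and challenges
Background overview
In anime character image generation, finely tagged datasets are important for developing text-to-image models. The CyberHarem/kurata_mashiro_bangdream dataset is provided by CyberHarem, was collected with the DeepGHS team's auto-crawling system, and focuses on the character Kurata Mashiro (倉田ましろ) from BanG Dream!, with 230 images and their tags. It offers several resolution packages and tag clustering results, and is aimed at research on anime-style image generation, including character consistency and detail fidelity.
Current challenges
The dataset targets style consistency and detail fidelity in anime character image generation: across a character's varied poses, outfits, and expressions, generated results should stay consistent with the original design. Because the images come from heterogeneous sources, their quality is uneven, so an automated system is needed for filtering and alignment. The tag scheme has to balance semantic granularity against practicality and avoid noise. In addition, the small size of the dataset may limit model generalization, and sensitive content must be handled in line with ethical norms.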
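Common scenarios
Typical use cases
In anime character generation and stylized image synthesis, the dataset serves as fine-tuning material for text-to-image models. It focuses on Kurata Mashiro (倉田ましろ) from BanG Dream! and, through 230 tagged images, describes visual attributes ranging from hairstyle and eye color to outfit details. Researchers can use datasets of this kind to study how generative adversarial networks or diffusion models perform at generating a specific character consistently from natural-language descriptions.
Academic problems addressed
The dataset addresses the problem of preserving character features and fine details in anime image generation. By pruning the core tags while keeping varied scene tags, it is meant to reduce the feature confusion and loss of detail that commonly arise during training. In research directions such as cross-domain style transfer and few-shot image generation, datasets of this kind can help a model learn the semantic distribution of a particular art style and can serve as a reproducible experimental baseline.
Practical applications
In virtual idol content creation and anime-related derivative design, animation studios or independent creators can use the tagged images to build character-specific stylized generation tools for promotional illustrations, merchandise design, or animated stickers. Combined with current generative AI techniques, datasets of this kind can lower the cost of producing character-consistent visual content and provide character reference material for fan creation communities.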
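Recent research on the dataset
Latest research directions
In anime-style image generation, datasets dedicated to a specific character are an active research topic. This dataset focuses on Kurata Mashiro (倉田ましろ) from BanG Dream! and provides tagged images as training material for personalized character generation. Current work with datasets of this kind explores few-shot learning and style transfer, aiming at faithful reproduction of character features and synthesis of varied poses. With the growth of the virtual idol industry, such datasets are relevant to personalized content generation and to improving the controllability of generative models.
The content above was collected and summarized by 遇见数据集.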

































