CyberHarem/sonoda_chiyoko_theidolmstershinycolors
收藏Hugging Face2024-01-16 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/CyberHarem/sonoda_chiyoko_theidolmstershinycolors
下载链接
链接失效反馈官方服务:
资源简介:
---
license: mit
task_categories:
- text-to-image
tags:
- art
- not-for-all-audiences
size_categories:
- n<1K
---
# Dataset of sonoda_chiyoko/園田智代子 (THE iDOLM@STER: SHINY COLORS)
This is the dataset of sonoda_chiyoko/園田智代子 (THE iDOLM@STER: SHINY COLORS), containing 500 images and their tags.
The core tags of this character are `brown_hair, bangs, breasts, twintails, hair_bun, double_bun, red_eyes, large_breasts, brown_eyes, long_hair`, which are pruned in this dataset.
Images are crawled from many sites (e.g. danbooru, pixiv, zerochan ...), the auto-crawling system is powered by [DeepGHS Team](https://github.com/deepghs)([huggingface organization](https://huggingface.co/deepghs)).
## List of Packages
| Name | Images | Size | Download | Type | Description |
|:-----------------|---------:|:-----------|:----------------------------------------------------------------------------------------------------------------------------------------|:-----------|:---------------------------------------------------------------------|
| raw | 500 | 824.87 MiB | [Download](https://huggingface.co/datasets/CyberHarem/sonoda_chiyoko_theidolmstershinycolors/resolve/main/dataset-raw.zip) | Waifuc-Raw | Raw data with meta information (min edge aligned to 1400 if larger). |
| 800 | 500 | 415.43 MiB | [Download](https://huggingface.co/datasets/CyberHarem/sonoda_chiyoko_theidolmstershinycolors/resolve/main/dataset-800.zip) | IMG+TXT | dataset with the shorter side not exceeding 800 pixels. |
| stage3-p480-800 | 1280 | 944.73 MiB | [Download](https://huggingface.co/datasets/CyberHarem/sonoda_chiyoko_theidolmstershinycolors/resolve/main/dataset-stage3-p480-800.zip) | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels. |
| 1200 | 500 | 712.46 MiB | [Download](https://huggingface.co/datasets/CyberHarem/sonoda_chiyoko_theidolmstershinycolors/resolve/main/dataset-1200.zip) | IMG+TXT | dataset with the shorter side not exceeding 1200 pixels. |
| stage3-p480-1200 | 1280 | 1.42 GiB | [Download](https://huggingface.co/datasets/CyberHarem/sonoda_chiyoko_theidolmstershinycolors/resolve/main/dataset-stage3-p480-1200.zip) | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels. |
### Load Raw Dataset with Waifuc
We provide raw dataset (including tagged images) for [waifuc](https://deepghs.github.io/waifuc/main/tutorials/installation/index.html) loading. If you need this, just run the following code
```python
import os
import zipfile
from huggingface_hub import hf_hub_download
from waifuc.source import LocalSource
# download raw archive file
zip_file = hf_hub_download(
repo_id='CyberHarem/sonoda_chiyoko_theidolmstershinycolors',
repo_type='dataset',
filename='dataset-raw.zip',
)
# extract files to your directory
dataset_dir = 'dataset_dir'
os.makedirs(dataset_dir, exist_ok=True)
with zipfile.ZipFile(zip_file, 'r') as zf:
zf.extractall(dataset_dir)
# load the dataset with waifuc
source = LocalSource(dataset_dir)
for item in source:
print(item.image, item.meta['filename'], item.meta['tags'])
```
## List of Clusters
List of tag clustering result, maybe some outfits can be mined here.
### Raw Text Version
| # | Samples | Img-1 | Img-2 | Img-3 | Img-4 | Img-5 | Tags |
|----:|----------:|:----------------------------------|:----------------------------------|:----------------------------------|:----------------------------------|:----------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 0 | 57 |  |  |  |  |  | 1girl, neck_ribbon, yellow_ribbon, school_uniform, solo, long_sleeves, looking_at_viewer, blush, white_shirt, blazer, black_jacket, white_background, plaid_skirt, simple_background, smile, sweater_vest, brown_skirt, holding, pleated_skirt, food, collared_shirt |
| 1 | 12 |  |  |  |  |  | 1girl, blush, completely_nude, nipples, solo, looking_at_viewer, navel, open_mouth, collarbone, sweat, simple_background, white_background, female_pubic_hair, smile |
| 2 | 11 |  |  |  |  |  | 1boy, 1girl, hetero, nipples, solo_focus, blush, penis, sex, vaginal, navel, sweat, looking_at_viewer, open_mouth, cowgirl_position, girl_on_top, pov, collarbone, completely_nude, cum_in_pussy, mosaic_censoring, spread_legs, bar_censor, cum_on_breasts, heart, shirt_lift, smile, white_shirt |
| 3 | 17 |  |  |  |  |  | 1girl, cleavage, solo, blush, looking_at_viewer, navel, smile, collarbone, frilled_bikini, bare_shoulders, necklace, bracelet, holding, medium_breasts, open_mouth, polka_dot, simple_background, blue_sky, food-themed_hair_ornament, white_background |
| 4 | 5 |  |  |  |  |  | 1girl, looking_at_viewer, navel, solo, underwear_only, bare_arms, bare_shoulders, blush, cleavage, collarbone, smile, closed_mouth, medium_breasts, pink_bra, pink_panties, simple_background, sitting, white_background, bow, mouth_hold, short_twintails |
| 5 | 7 |  |  |  |  |  | 1girl, bare_shoulders, cleavage, looking_at_viewer, solo, tube_top, blush, collarbone, midriff, navel, necklace, off_shoulder, open_jacket, pink_jacket, tongue_out, hair_ribbon, simple_background, smile, crop_top, holding_food, lock, medium_breasts, miniskirt, pleated_skirt, upper_body, white_background, white_belt |
| 6 | 7 |  |  |  |  |  | 1girl, bare_shoulders, black_sweater, blush, brown_jacket, hairclip, off_shoulder, ribbed_sweater, sleeveless_sweater, upper_body, simple_background, sleeveless_turtleneck, solo, turtleneck_sweater, white_background, x_hair_ornament, heart_earrings, looking_at_viewer, heart_necklace, long_sleeves, short_twintails, smile |
| 7 | 8 |  |  |  |  |  | 1girl, brown_jacket, brown_skirt, hairclip, looking_at_viewer, off_shoulder, plaid_skirt, solo, bare_shoulders, long_sleeves, necklace, ribbed_sweater, short_twintails, sleeveless_sweater, blush, earrings, x_hair_ornament, black_sweater, heart, miniskirt, smile, white_background, open_jacket, simple_background, sleeveless_turtleneck, sleeves_past_wrists |
| 8 | 9 |  |  |  |  |  | 1girl, looking_at_viewer, one_eye_closed, ;d, blue_skirt, open_mouth, short_twintails, smile, bow, jacket, solo, ribbon, wrist_scrunchie, armband, holding_microphone, plaid_skirt, blue_scrunchie, blush, frilled_skirt, long_sleeves, simple_background, standing_on_one_leg |
| 9 | 19 |  |  |  |  |  | 1girl, demon_horns, blush, looking_at_viewer, solo, bare_shoulders, cleavage, black_gloves, covered_navel, open_mouth, simple_background, striped, demon_wings, see-through, star_earrings, asymmetrical_legwear, necklace, purple_skirt, smile, star_hair_ornament, torn_clothes, crop_top, facial_mark, lollipop, sticker_on_face, white_background, halloween_costume, midriff, polka_dot, single_thighhigh |
| 10 | 8 |  |  |  |  |  | 1girl, frills, looking_at_viewer, solo, blush, puffy_short_sleeves, dress, bow, food, open_mouth, simple_background, :d, heart, ribbon, white_background, white_gloves |
| 11 | 5 |  |  |  |  |  | 1girl, bare_shoulders, detached_collar, fake_animal_ears, looking_at_viewer, playboy_bunny, rabbit_ears, strapless_leotard, wrist_cuffs, black_bowtie, black_leotard, solo, cleavage, closed_mouth, collarbone, short_twintails, white_collar, areola_slip, arm_up, bare_arms, bed_sheet, covered_navel, dot_nose, egg_vibrator, groin, hairband, hand_up, highleg, indoors, kneeling, no_shoes, nose_blush, on_bed, open_mouth, raised_eyebrows, see-through_leotard, simple_background, skindentation, smile, thighs, white_background, white_thighhighs |
### Table Version
| # | Samples | Img-1 | Img-2 | Img-3 | Img-4 | Img-5 | 1girl | neck_ribbon | yellow_ribbon | school_uniform | solo | long_sleeves | looking_at_viewer | blush | white_shirt | blazer | black_jacket | white_background | plaid_skirt | simple_background | smile | sweater_vest | brown_skirt | holding | pleated_skirt | food | collared_shirt | completely_nude | nipples | navel | open_mouth | collarbone | sweat | female_pubic_hair | 1boy | hetero | solo_focus | penis | sex | vaginal | cowgirl_position | girl_on_top | pov | cum_in_pussy | mosaic_censoring | spread_legs | bar_censor | cum_on_breasts | heart | shirt_lift | cleavage | frilled_bikini | bare_shoulders | necklace | bracelet | medium_breasts | polka_dot | blue_sky | food-themed_hair_ornament | underwear_only | bare_arms | closed_mouth | pink_bra | pink_panties | sitting | bow | mouth_hold | short_twintails | tube_top | midriff | off_shoulder | open_jacket | pink_jacket | tongue_out | hair_ribbon | crop_top | holding_food | lock | miniskirt | upper_body | white_belt | black_sweater | brown_jacket | hairclip | ribbed_sweater | sleeveless_sweater | sleeveless_turtleneck | turtleneck_sweater | x_hair_ornament | heart_earrings | heart_necklace | earrings | sleeves_past_wrists | one_eye_closed | ;d | blue_skirt | jacket | ribbon | wrist_scrunchie | armband | holding_microphone | blue_scrunchie | frilled_skirt | standing_on_one_leg | demon_horns | black_gloves | covered_navel | striped | demon_wings | see-through | star_earrings | asymmetrical_legwear | purple_skirt | star_hair_ornament | torn_clothes | facial_mark | lollipop | sticker_on_face | halloween_costume | single_thighhigh | frills | puffy_short_sleeves | dress | :d | white_gloves | detached_collar | fake_animal_ears | playboy_bunny | rabbit_ears | strapless_leotard | wrist_cuffs | black_bowtie | black_leotard | white_collar | areola_slip | arm_up | bed_sheet | dot_nose | egg_vibrator | groin | hairband | hand_up | highleg | indoors | kneeling | no_shoes | nose_blush | on_bed | raised_eyebrows | see-through_leotard | skindentation | thighs | white_thighhighs |
|----:|----------:|:----------------------------------|:----------------------------------|:----------------------------------|:----------------------------------|:----------------------------------|:--------|:--------------|:----------------|:-----------------|:-------|:---------------|:--------------------|:--------|:--------------|:---------|:---------------|:-------------------|:--------------|:--------------------|:--------|:---------------|:--------------|:----------|:----------------|:-------|:-----------------|:------------------|:----------|:--------|:-------------|:-------------|:--------|:--------------------|:-------|:---------|:-------------|:--------|:------|:----------|:-------------------|:--------------|:------|:---------------|:-------------------|:--------------|:-------------|:-----------------|:--------|:-------------|:-----------|:-----------------|:-----------------|:-----------|:-----------|:-----------------|:------------|:-----------|:----------------------------|:-----------------|:------------|:---------------|:-----------|:---------------|:----------|:------|:-------------|:------------------|:-----------|:----------|:---------------|:--------------|:--------------|:-------------|:--------------|:-----------|:---------------|:-------|:------------|:-------------|:-------------|:----------------|:---------------|:-----------|:-----------------|:---------------------|:------------------------|:---------------------|:------------------|:-----------------|:-----------------|:-----------|:----------------------|:-----------------|:-----|:-------------|:---------|:---------|:------------------|:----------|:---------------------|:-----------------|:----------------|:----------------------|:--------------|:---------------|:----------------|:----------|:--------------|:--------------|:----------------|:-----------------------|:---------------|:---------------------|:---------------|:--------------|:-----------|:------------------|:--------------------|:-------------------|:---------|:----------------------|:--------|:-----|:---------------|:------------------|:-------------------|:----------------|:--------------|:--------------------|:--------------|:---------------|:----------------|:---------------|:--------------|:---------|:------------|:-----------|:---------------|:--------|:-----------|:----------|:----------|:----------|:-----------|:-----------|:-------------|:---------|:------------------|:----------------------|:----------------|:---------|:-------------------|
| 0 | 57 |  |  |  |  |  | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 1 | 12 |  |  |  |  |  | X | | | | X | | X | X | | | | X | | X | X | | | | | | | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 2 | 11 |  |  |  |  |  | X | | | | | | X | X | X | | | | | | X | | | | | | | X | X | X | X | X | X | | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 3 | 17 |  |  |  |  |  | X | | | | X | | X | X | | | | X | | X | X | | | X | | | | | | X | X | X | | | | | | | | | | | | | | | | | | | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 4 | 5 |  |  |  |  |  | X | | | | X | | X | X | | | | X | | X | X | | | | | | | | | X | | X | | | | | | | | | | | | | | | | | | | X | | X | | | X | | | | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 5 | 7 |  |  |  |  |  | X | | | | X | | X | X | | | | X | | X | X | | | | X | | | | | X | | X | | | | | | | | | | | | | | | | | | | X | | X | X | | X | | | | | | | | | | | | | X | X | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 6 | 7 |  |  |  |  |  | X | | | | X | X | X | X | | | | X | | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | X | | | | | | | | | | | | | | | X | | | X | | | | | | | | | X | | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 7 | 8 |  |  |  |  |  | X | | | | X | X | X | X | | | | X | X | X | X | | X | | | | | | | | | | | | | | | | | | | | | | | | | | X | | | | X | X | | | | | | | | | | | | | | X | | | X | X | | | | | | | X | | | X | X | X | X | X | X | | X | | | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 8 | 9 |  |  |  |  |  | X | | | | X | X | X | X | | | | | X | X | X | | | | | | | | | | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | X | | X | | | | | | | | | | | | | | | | | | | | | | | | | | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 9 | 19 |  |  |  |  |  | X | | | | X | | X | X | | | | X | | X | X | | | | | | | | | | X | | | | | | | | | | | | | | | | | | | | X | | X | X | | | X | | | | | | | | | | | | | X | | | | | | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 10 | 8 |  |  |  |  |  | X | | | | X | | X | X | | | | X | | X | | | | | | X | | | | | X | | | | | | | | | | | | | | | | | | X | | | | | | | | | | | | | | | | | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | X | | | | | | | | | | | | | | | | | | | | | | | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 11 | 5 |  |  |  |  |  | X | | | | X | | X | | | | | X | | X | X | | | | | | | | | | X | X | | | | | | | | | | | | | | | | | | | X | | X | | | | | | | | X | X | | | | | | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | X | | | | | | | | | | | | | | | | | | | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
提供机构:
CyberHarem
原始信息汇总
数据集概述
数据集名称
Dataset of sonoda_chiyoko/園田智代子 (THE iDOLM@STER: SHINY COLORS)
数据集描述
该数据集包含500张園田智代子(THE iDOLM@STER: SHINY COLORS)的图像及其标签。图像主要特征包括棕色头发、刘海、胸部、双马尾、发髻、双发髻、红眼睛、大胸部、棕色眼睛、长发等。
数据来源
图像从多个网站(如danbooru、pixiv、zerochan等)爬取,爬虫系统由DeepGHS Team开发。
数据集包列表
| 名称 | 图像数量 | 大小 | 类型 | 描述 |
|---|---|---|---|---|
| raw | 500 | 824.87 MiB | Waifuc-Raw | 包含元信息的原始数据(最小边对齐到1400像素,如果更大)。 |
| 800 | 500 | 415.43 MiB | IMG+TXT | 短边不超过800像素的数据集。 |
| stage3-p480-800 | 1280 | 944.73 MiB | IMG+TXT | 3阶段裁剪数据集,区域不小于480x480像素。 |
| 1200 | 500 | 712.46 MiB | IMG+TXT | 短边不超过1200像素的数据集。 |
| stage3-p480-1200 | 1280 | 1.42 GiB | IMG+TXT | 3阶段裁剪数据集,区域不小于480x480像素。 |
标签聚类结果
| # | 样本数量 | 图像示例 | 标签 |
|---|---|---|---|
| 0 | 57 | ![]() |
1girl, neck_ribbon, yellow_ribbon, school_uniform, solo, long_sleeves, looking_at_viewer, blush, white_shirt, blazer, black_jacket, white_background, plaid_skirt, simple_background, smile, sweater_vest, brown_skirt, holding, pleated_skirt, food, collared_shirt |
| 1 | 12 | ![]() |
1girl, blush, completely_nude, nipples, solo, looking_at_viewer, navel, open_mouth, collarbone, sweat, simple_background, white_background, female_pubic_hair, smile |
| 2 | 11 | ![]() |
1boy, 1girl, hetero, nipples, solo_focus, blush, penis, sex, vaginal, navel, sweat, looking_at_viewer, open_mouth, cowgirl_position, girl_on_top, pov, collarbone, completely_nude, cum_in_pussy, mosaic_censoring, spread_legs, bar_censor, cum_on_breasts, heart, shirt_lift, smile, white_shirt |
| 3 | 17 | ![]() |
1girl, cleavage, solo, blush, looking_at_viewer, navel, smile, collarbone, frilled_bikini, bare_shoulders, necklace, bracelet, holding, medium_breasts, open_mouth, polka_dot, simple_background, blue_sky, food-themed_hair_ornament, white_background |
| 4 | 5 | ![]() |
1girl, looking_at_viewer, navel, solo, underwear_only, bare_arms, bare_shoulders, blush, cleavage, collarbone, smile, closed_mouth, medium_breasts, pink_bra, pink_panties, simple_background, sitting, white_background, bow, mouth_hold, short_twintails |
| 5 | 7 | ![]() |
1girl, bare_shoulders, cleavage, looking_at_viewer, solo, tube_top, blush, collarbone, midriff, navel, necklace, off_shoulder, open_jacket, pink_jacket, tongue_out, hair_ribbon, simple_background, smile, crop_top, holding_food, lock, medium_breasts, miniskirt, pleated_skirt, upper_body, white_background, white_belt |
| 6 | 7 | ![]() |
1girl, bare_shoulders, black_sweater, blush, brown_jacket, hairclip, off_shoulder, ribbed_sweater, sleeveless_sweater, upper_body, simple_background, sleeveless_turtleneck, solo, turtleneck_sweater, white_background, x_hair_ornament, heart_earrings, looking_at_viewer, heart_necklace, long_sleeves, short_twintails, smile |
| 7 | 8 | ![]() |
1girl, brown_jacket, brown_skirt, hairclip, looking_at_viewer, off_shoulder, plaid_skirt, solo, bare_shoulders, long_sleeves, necklace, ribbed_sweater, short_twintails, sleeveless_sweater, blush, earrings, x_hair_ornament, black_sweater, heart, miniskirt, smile, white_background, open_jacket, simple_background, sleeveless_turtleneck, sleeves_past_wrists |
| 8 | 9 | ![]() |
1girl, looking_at_viewer, one_eye_closed, ;d, blue_skirt, open_mouth, short_twintails, smile, bow, jacket, solo, ribbon, wrist_scrunchie, armband, holding_microphone, plaid_skirt, blue_scrunchie, blush, frilled_skirt, long_sleeves, simple_background, standing_on_one_leg |
| 9 | 19 | ![]() |
1girl, demon_horns, blush, looking_at_viewer, solo, bare_shoulders, cleavage, black_gloves, covered_navel, open_mouth, simple_background, striped, demon_wings, see-through, star_earrings, asymmetrical_legwear, necklace, purple_skirt, smile, star_hair_ornament, torn_clothes, crop_top, facial_mark, lollipop, sticker_on_face, white_background, halloween_costume, midriff, polka_dot, single_thighhigh |
| 10 | 8 | ![]() |
1girl, frills, looking_at_viewer, solo, blush, puffy_short_sleeves, dress, bow, food, open_mouth, simple_background, :d, heart, ribbon, white_background, white_gloves |
| 11 | 5 | ![]() |
1girl, bare_shoulders, detached_collar, fake_animal_ears, looking_at_viewer, playboy_bunny, rabbit_ears, strapless_leotard, wrist_cuffs, black_bowtie, black_leotard, solo, cleavage, closed_mouth, collarbone, short_twintails, white_collar, areola_slip, arm_up, bare_arms, bed_sheet, covered_navel, dot_nose, egg_vibrator, groin, hairband, hand_up, highleg, indoors, kneeling, no_shoes, nose_blush, on_bed, open_mouth, raised_eyebrows, see-through_leotard, simple_background, skindentation, smile, thighs, white_background, white_thighhighs |
搜集汇总
数据集介绍

构建方式
在二次元角色数据集构建领域,针对《偶像大师 闪耀色彩》中園田智代子这一角色,本数据集通过自动化爬取系统从Danbooru、Pixiv、Zerochan等多个图像社区采集了500张高质量图像及其关联标签。原始数据经过预处理,核心特征标签如棕色长发、双马尾、红色眼眸等已被剔除,以避免冗余信息干扰模型训练。数据集提供了多种规格的压缩包,包括原始元数据包(raw)和短边不超过800或1200像素的标准化版本,此外还推出了基于三级裁剪策略的增强版(stage3),确保图像区域不小于480×480像素,以适应不同训练需求。整个构建流程由DeepGHS团队开发的Waifuc框架驱动,确保了数据采集的高效性与可复现性。
使用方法
用户可通过HuggingFace Hub直接下载压缩包,或利用Waifuc库以编程方式加载原始数据集。推荐使用Python环境,通过huggingface_hub库下载dataset-raw.zip后解压至本地目录,再借助Waifuc的LocalSource接口遍历图像与标签。对于进阶用户,可依据聚类表选择特定主题的子集进行微调,例如聚焦于校服装扮或泳装形象的生成。数据集兼容主流文生图框架如Stable Diffusion,只需将图像与对应的TXT标签文件配对即可用于训练。所有包均以IMG+TXT格式提供,确保了与Diffusers等库的无缝集成。
背景与挑战
背景概述
在二次元角色图像生成领域,高质量、标注精细的数据集是驱动文本到图像模型性能提升的核心基石。CyberHarem/sonoda_chiyoko_theidolmstershinycolors 数据集由 DeepGHS 团队于近期创建,聚焦于日本偶像企划《偶像大师 闪耀色彩》中的角色園田智代子。该数据集包含 500 张从 Danbooru、Pixiv、Zerochan 等多源平台自动爬取的高清图像,并附有详尽的多标签标注。其核心研究问题在于如何通过结构化、多尺度的数据组织(如 raw、800、1200 像素及三阶段裁剪版本)服务于 Waifuc 等框架下的模型训练,特别是在角色一致性、服饰多样性及姿态泛化方面。该数据集通过自动爬取与标签聚类技术,为角色定制化图像生成任务提供了可复用的基准资源,对虚拟角色数据集的标准化构建具有示范意义。
当前挑战
该数据集面临的挑战首先体现在领域问题层面:文本到图像生成模型在处理特定虚构角色时,需要克服角色身份特征(如发型、瞳色、服饰)的精确保持与复杂场景下的语义解耦难题。此外,数据集中包含的裸露与性暗示内容(如 cluster 1、2 所示)带来了伦理与使用限制的双重挑战,要求模型在生成时具备内容过滤与情境感知能力。在构建过程中,挑战则集中于自动爬取系统的数据质量控制:多源图像的分辨率、水印、噪声差异需通过边缘对齐与裁剪策略(如 stage3-p480-800)来统一;标签聚类结果(如 12 个服饰群组)的语义粒度与歧义消解亦需人工校验,以确保训练数据的准确性与低噪声水平。
常用场景
经典使用场景
在动漫角色生成与风格迁移领域,CyberHarem/sonoda_chiyoko_theidolmstershinycolors数据集凭借其高保真度的图像-标签对结构,成为文本到图像(text-to-image)任务中不可或缺的精细化训练资源。该数据集聚焦于《偶像大师 闪耀色彩》中的角色園田智代子,收录了500张经过多源爬取与严格筛选的图片,并附带了包括发型、服饰、姿态在内的核心标签。研究者常利用其提供的多尺度版本(如800、1200像素)及三阶段裁剪数据,以微调扩散模型或生成对抗网络,从而在保持角色特征一致性的前提下,实现个性化风格渲染与复杂场景的精准生成。
解决学术问题
该数据集有效回应了动漫图像生成中角色身份保持与标签噪声抑制的学术挑战。通过提供结构化标签聚类结果(如校服、泳装、万圣节主题等),它使研究者能够系统性地探索细粒度属性解耦与多模态对齐问题。此外,数据集对裸露及成人内容的明确标注,为NSFW内容检测与过滤算法提供了基准测试资源,推动了安全生成模型的发展。其意义在于,为动漫领域稀缺的高质量、多标签数据集填补了空白,促进了可控生成与风格一致性研究的深入,并成为评估生成模型对特定角色泛化能力的重要基石。
实际应用
在实际应用层面,该数据集为虚拟偶像内容创作与二次元文化传播提供了技术支撑。游戏开发商与同人创作者可借助其训练的模型,批量生成符合角色设定的宣传物料、表情包或互动场景,显著降低美术成本。同时,数据集中的聚类标签(如‘demon_horns’、‘playboy_bunny’)可被用于电商平台的定制化商品设计,例如自动生成带有角色元素的服饰图案。此外,其与waifuc库的无缝集成,使得动漫社区能够基于本地化数据快速迭代风格化模型,从而在直播、社交媒体等场景中实现实时角色换装与特效生成。
数据集最近研究
最新研究方向
在虚拟偶像与二次元文化深度融合的浪潮下,以《偶像大师:闪耀色彩》中園田智代子为代表的角色数据集,正成为文本到图像生成领域的前沿研究焦点。该数据集通过自动化爬虫系统从Danbooru、Pixiv等多源平台采集500张高质量图像,并配套精细化标签体系,为动漫风格的角色生成模型提供了标准化训练素材。当前研究方向聚焦于两翼:一是利用多尺度裁剪(如480×800像素)与分阶段处理技术,优化模型对角色特征(如双马尾、棕色长发)的精准捕捉与细节还原;二是通过标签聚类分析(如校服、泳装、万圣节服饰等12个簇类),探索角色在不同着装与场景下的语义一致性生成。这一工作不仅推动了LoRA等轻量化微调技术在二次元内容创作中的应用,更呼应了虚拟偶像产业中个性化角色复现与风格迁移的热点需求,为AI辅助创作在娱乐领域的落地提供了可复用的数据范式。
以上内容由遇见数据集搜集并总结生成















