CyberHarem/shinano_azurlane
Source: Hugging Face · Updated 2024-01-12 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/CyberHarem/shinano_azurlane
Resource description:
---
license: mit
task_categories:
- text-to-image
tags:
- art
- not-for-all-audiences
size_categories:
- n<1K
---
# Dataset of shinano/信濃/信浓 (Azur Lane)
This is the dataset of shinano/信濃/信浓 (Azur Lane), containing 500 images and their tags.
The core tags of this character are `animal_ears, long_hair, breasts, fox_ears, animal_ear_fluff, large_breasts, tail, fox_girl, fox_tail, bangs, blue_eyes, white_hair, multiple_tails, very_long_hair, grey_hair, hair_ornament, purple_eyes, white_tail`, which are pruned in this dataset.
Images are crawled from many sites (e.g. danbooru, pixiv, zerochan), and the auto-crawling system is powered by the [DeepGHS Team](https://github.com/deepghs) ([huggingface organization](https://huggingface.co/deepghs)).
## List of Packages
| Name | Images | Size | Download | Type | Description |
|:-----------------|---------:|:-----------|:------------------------------------------------------------------------------------------------------------------|:-----------|:---------------------------------------------------------------------|
| raw | 500 | 1.03 GiB | [Download](https://huggingface.co/datasets/CyberHarem/shinano_azurlane/resolve/main/dataset-raw.zip) | Waifuc-Raw | Raw data with meta information (min edge aligned to 1400 if larger). |
| 800 | 500 | 483.13 MiB | [Download](https://huggingface.co/datasets/CyberHarem/shinano_azurlane/resolve/main/dataset-800.zip) | IMG+TXT | dataset with the shorter side not exceeding 800 pixels. |
| stage3-p480-800 | 1359 | 1.08 GiB | [Download](https://huggingface.co/datasets/CyberHarem/shinano_azurlane/resolve/main/dataset-stage3-p480-800.zip) | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels. |
| 1200 | 500 | 876.72 MiB | [Download](https://huggingface.co/datasets/CyberHarem/shinano_azurlane/resolve/main/dataset-1200.zip) | IMG+TXT | dataset with the shorter side not exceeding 1200 pixels. |
| stage3-p480-1200 | 1359 | 1.67 GiB | [Download](https://huggingface.co/datasets/CyberHarem/shinano_azurlane/resolve/main/dataset-stage3-p480-1200.zip) | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels. |
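
The download links in the table above all follow the same pattern, `dataset-<name>.zip` under the repository's `resolve/main` path. The sketch below builds a URL from a package name. The helper names `package_filename` and `package_url` are hypothetical and used only for illustration; the repository ID and package names are taken from the table.

```python
# Minimal sketch: map a package name from the table to its archive URL.
# The helper names are hypothetical; only the URL pattern comes from the table.
REPO_ID = "CyberHarem/shinano_azurlane"
PACKAGES = ("raw", "800", "stage3-p480-800", "1200", "stage3-p480-1200")


def package_filename(name: str) -> str:
    """Return the archive file name for a package listed in the table."""
    if name not in PACKAGES:
        raise ValueError(f"unknown package: {name!r}")
    return f"dataset-{name}.zip"


def package_url(name: str) -> str:
    """Return the direct download URL for a package listed in the table."""
    return (
        f"https://huggingface.co/datasets/{REPO_ID}"
        f"/resolve/main/{package_filename(name)}"
    )


if __name__ == "__main__":
    for name in PACKAGES:
        print(name, package_url(name))
```

The value returned by `package_filename` can also be passed as the `filename` argument of `hf_hub_download` in place of `'dataset-raw.zip'`.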
### Load Raw Dataset with Waifuc
We provide the raw dataset (including tagged images) for loading with [waifuc](https://deepghs.github.io/waifuc/main/tutorials/installation/index.html). To use it, run the following code:
```python
import os
import zipfile

from huggingface_hub import hf_hub_download
from waifuc.source import LocalSource

# download raw archive file
zip_file = hf_hub_download(
    repo_id='CyberHarem/shinano_azurlane',
    repo_type='dataset',
    filename='dataset-raw.zip',
)

# extract files to your directory
dataset_dir = 'dataset_dir'
os.makedirs(dataset_dir, exist_ok=True)
with zipfile.ZipFile(zip_file, 'r') as zf:
    zf.extractall(dataset_dir)

# load the dataset with waifuc
source = LocalSource(dataset_dir)
for item in source:
    print(item.image, item.meta['filename'], item.meta['tags'])
```
## List of Clusters
The results of tag clustering are listed below; some outfits may be identifiable from these clusters.
### Raw Text Version
| # | Samples | Img-1 | Img-2 | Img-3 | Img-4 | Img-5 | Tags |
|----:|----------:|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 0 | 7 |  |  |  |  |  | 1girl, looking_at_viewer, nipples, solo, completely_nude, navel, blush, pussy, thighs, collarbone, censored, simple_background, stomach, white_background |
| 1 | 7 |  |  |  |  |  | 1girl, blush, cleavage, frilled_bikini, looking_at_viewer, solo, thighs, water, white_bikini, day, navel, outdoors, wet, bare_shoulders, parted_lips, stomach, blue_sky, sitting, collarbone |
| 2 | 7 |  |  |  |  |  | 1girl, bare_shoulders, looking_at_viewer, navel, solo, stomach, white_bikini, cleavage, frills, thighs, blush, simple_background, detached_sleeves, hand_on_own_chest, sitting, water, white_background |
| 3 | 6 |  |  |  |  |  | 1girl, bare_shoulders, blush, cleavage, covered_navel, looking_at_viewer, official_alternate_costume, race_queen, solo, black_skirt, black_thighhighs, leotard, thighs, collarbone, microskirt, gloves, huge_breasts |
| 4 | 8 |  |  |  |  |  | 1girl, ass, bare_shoulders, looking_at_viewer, official_alternate_costume, race_queen, solo, from_behind, looking_back, thighs, elbow_gloves, black_thighhighs, kitsune, white_panties, blush, blue_skirt, microskirt, thigh_boots |
| 5 | 33 |  |  |  |  |  | 1girl, cleavage, off_shoulder, solo, white_thighhighs, wide_sleeves, bare_shoulders, blue_kimono, white_skirt, pleated_skirt, blue_collar, blue_butterfly, zettai_ryouiki, looking_at_viewer, kyuubi, long_sleeves, collarbone, huge_breasts, large_tail |
| 6 | 58 |  |  |  |  |  | cleavage, blue_dress, 1girl, bare_shoulders, looking_at_viewer, solo, official_alternate_costume, feather_boa, sleeveless_dress, blue_butterfly, halter_dress, blue_collar, blush, kyuubi, thighs |
| 7 | 5 |  |  |  |  |  | 1girl, blush, erection, huge_breasts, huge_penis, looking_at_viewer, nipples, outdoors, testicles, uncensored, veiny_penis, large_penis, blue_sky, blunt_bangs, cloud, day, navel, thick_thighs, 1boy, abs, armpits, arms_behind_head, arms_up, floral_print, futa_with_male, girl_on_top, kimono, muscular, sex, shiny_skin, solo_focus, spread_legs, squatting, straddling |
### Table Version
| # | Samples | Img-1 | Img-2 | Img-3 | Img-4 | Img-5 | 1girl | looking_at_viewer | nipples | solo | completely_nude | navel | blush | pussy | thighs | collarbone | censored | simple_background | stomach | white_background | cleavage | frilled_bikini | water | white_bikini | day | outdoors | wet | bare_shoulders | parted_lips | blue_sky | sitting | frills | detached_sleeves | hand_on_own_chest | covered_navel | official_alternate_costume | race_queen | black_skirt | black_thighhighs | leotard | microskirt | gloves | huge_breasts | ass | from_behind | looking_back | elbow_gloves | kitsune | white_panties | blue_skirt | thigh_boots | off_shoulder | white_thighhighs | wide_sleeves | blue_kimono | white_skirt | pleated_skirt | blue_collar | blue_butterfly | zettai_ryouiki | kyuubi | long_sleeves | large_tail | blue_dress | feather_boa | sleeveless_dress | halter_dress | erection | huge_penis | testicles | uncensored | veiny_penis | large_penis | blunt_bangs | cloud | thick_thighs | 1boy | abs | armpits | arms_behind_head | arms_up | floral_print | futa_with_male | girl_on_top | kimono | muscular | sex | shiny_skin | solo_focus | spread_legs | squatting | straddling |
|----:|----------:|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------|:--------------------------------|:--------|:--------------------|:----------|:-------|:------------------|:--------|:--------|:--------|:---------|:-------------|:-----------|:--------------------|:----------|:-------------------|:-----------|:-----------------|:--------|:---------------|:------|:-----------|:------|:-----------------|:--------------|:-----------|:----------|:---------|:-------------------|:--------------------|:----------------|:-----------------------------|:-------------|:--------------|:-------------------|:----------|:-------------|:---------|:---------------|:------|:--------------|:---------------|:---------------|:----------|:----------------|:-------------|:--------------|:---------------|:-------------------|:---------------|:--------------|:--------------|:----------------|:--------------|:-----------------|:-----------------|:---------|:---------------|:-------------|:-------------|:--------------|:-------------------|:---------------|:-----------|:-------------|:------------|:-------------|:--------------|:--------------|:--------------|:--------|:---------------|:-------|:------|:----------|:-------------------|:----------|:---------------|:-----------------|:--------------|:---------|:-----------|:------|:-------------|:-------------|:--------------|:------------|:-------------|
| 0 | 7 |  |  |  |  |  | X | X | X | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 1 | 7 |  |  |  |  |  | X | X | | X | | X | X | | X | X | | | X | | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 2 | 7 |  |  |  |  |  | X | X | | X | | X | X | | X | | | X | X | X | X | | X | X | | | | X | | | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 3 | 6 |  |  |  |  |  | X | X | | X | | | X | | X | X | | | | | X | | | | | | | X | | | | | | | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 4 | 8 |  |  |  |  |  | X | X | | X | | | X | | X | | | | | | | | | | | | | X | | | | | | | | X | X | | X | | X | | | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 5 | 33 |  |  |  |  |  | X | X | | X | | | | | | X | | | | | X | | | | | | | X | | | | | | | | | | | | | | | X | | | | | | | | | X | X | X | X | X | X | X | X | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| 6 | 58 |  |  |  |  |  | X | X | | X | | | X | | X | | | | | | X | | | | | | | X | | | | | | | | X | | | | | | | | | | | | | | | | | | | | | | X | X | | X | | | X | X | X | X | | | | | | | | | | | | | | | | | | | | | | | | | |
| 7 | 5 |  |  |  |  |  | X | X | X | | | X | X | | | | | | | | | | | | X | X | | | | X | | | | | | | | | | | | | X | | | | | | | | | | | | | | | | | | | | | | | | | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
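
The table version appears to be a one-hot view of the raw text version: each distinct tag becomes a column, in the order it is first seen across clusters, and a cell holds `X` when the cluster contains that tag. The sketch below shows that conversion on a small illustrative input. The function name `build_tag_table` is hypothetical, and the column-ordering rule is inferred from the tables above rather than documented.

```python
# Minimal sketch: turn per-cluster tag lists into a one-hot "X" table.
# Assumption: columns follow first-seen order across clusters, which is
# inferred from the tables above and not confirmed by the dataset authors.
from typing import List, Sequence, Tuple


def build_tag_table(
    clusters: Sequence[Sequence[str]],
) -> Tuple[List[str], List[List[str]]]:
    """Return (columns, rows), where each row marks present tags with 'X'."""
    columns: List[str] = []
    for tags in clusters:
        for tag in tags:
            if tag not in columns:
                columns.append(tag)

    rows: List[List[str]] = []
    for tags in clusters:
        present = set(tags)
        rows.append(["X" if column in present else "" for column in columns])
    return columns, rows


if __name__ == "__main__":
    example = [
        ["1girl", "solo", "white_background"],
        ["1girl", "white_bikini", "solo", "water"],
    ]
    columns, rows = build_tag_table(example)
    print("| " + " | ".join(columns) + " |")
    for row in rows:
        print("| " + " | ".join(cell or " " for cell in row) + " |")
```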
Provider:
CyberHarem
Summary of Original Information
Dataset Overview
Dataset Information
- Name: Dataset of shinano/信濃/信浓 (Azur Lane)
- Description: Contains 500 images and their tags, featuring the character shinano/信濃/信浓 from Azur Lane.
- Core tags: `animal_ears, long_hair, breasts, fox_ears, animal_ear_fluff, large_breasts, tail, fox_girl, fox_tail, bangs, blue_eyes, white_hair, multiple_tails, very_long_hair, grey_hair, hair_ornament, purple_eyes, white_tail`
- Data source: Crawled from multiple sites (e.g. danbooru, pixiv, zerochan).
- Dataset size: n<1K
- License: MIT
- Task category: text-to-image
- Tags: art, not-for-all-audiences
List of Packages
| Name | Images | Size | Type | Description |
|---|---|---|---|---|
| raw | 500 | 1.03 GiB | Waifuc-Raw | Raw data with meta information (min edge aligned to 1400 pixels if larger) |
| 800 | 500 | 483.13 MiB | IMG+TXT | Dataset with the shorter side not exceeding 800 pixels |
| stage3-p480-800 | 1359 | 1.08 GiB | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels |
| 1200 | 500 | 876.72 MiB | IMG+TXT | Dataset with the shorter side not exceeding 1200 pixels |
| stage3-p480-1200 | 1359 | 1.67 GiB | IMG+TXT | 3-stage cropped dataset with the area not less than 480x480 pixels |
Collected Summary
Dataset Introduction

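Construction
For the character Shinano from Azur Lane, the CyberHarem team assembled a dataset of 500 images with their corresponding tags. The images were collected from illustration sites including Danbooru, Pixiv, and Zerochan, using an automated crawling system powered by the DeepGHS team. The images are provided at several scales: the raw data (min edge aligned to 1400 pixels if larger) and resized versions whose shorter side does not exceed 800 or 1200 pixels. Two 3-stage cropped datasets (stage3-p480-800 and stage3-p480-1200) are also provided, with each crop area no smaller than 480x480 pixels, to suit different training needs. The character's core tags, such as animal_ears, long_hair, breasts, and fox_ears, are pruned in this dataset.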
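
As a worked example of the "shorter side not exceeding N pixels" rule, the sketch below computes the target size for an image while keeping its aspect ratio. The function name `fit_shorter_side` is hypothetical, and the rounding behaviour is an assumption; the actual resizing code used to build the packages is not shown in the source.

```python
# Minimal sketch of the "shorter side not exceeding `limit`" rule.
# Assumption: aspect ratio is preserved and sizes are rounded to the
# nearest pixel; the dataset's real preprocessing may differ in detail.
from typing import Tuple


def fit_shorter_side(width: int, height: int, limit: int) -> Tuple[int, int]:
    """Scale (width, height) down so that min(width, height) <= limit."""
    shorter = min(width, height)
    if shorter <= limit:
        # Already small enough: images are never upscaled in this sketch.
        return width, height
    scale = limit / shorter
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_shorter_side(2000, 3000, 800))
    print(fit_shorter_side(600, 900, 800))
```

For a 2000x3000 image and a limit of 800, the scale factor is 800 / 2000 = 0.4, giving 800x1200; a 600x900 image is left unchanged because its shorter side is already below the limit.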
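Usage
Users can download whichever archive they need directly from the HuggingFace repository. For users who want to keep the metadata, the raw data (dataset-raw.zip) is recommended, loaded through the waifuc library: download the zip file with the hf_hub_download function from huggingface_hub, extract it to a local directory, then read the images and their metadata (such as file names and tags) through the LocalSource class. For direct training, users can choose the 800 or 1200 pixel versions, which already include the matching tag files and can be used to train image generation models as they are. The 3-stage cropped versions suit tasks that focus on local detail, such as generating clothing or facial features. All data is released under the MIT license, which permits free use and modification. Note that the content includes adult-oriented material, so use in a controlled environment is advised.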
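
A minimal sketch of reading an extracted IMG+TXT package is shown below. It assumes that each image has a tag file with the same stem and a `.txt` suffix, and that tags in that file are comma-separated; neither detail is stated in the source, so check an extracted archive before relying on it. The function name `iter_image_tag_pairs` is hypothetical.

```python
# Minimal sketch: pair images with their tag files in an IMG+TXT package.
# Assumptions (not confirmed by the source): tag files share the image's
# stem with a .txt suffix, and tags are comma-separated.
from pathlib import Path
from typing import Iterator, List, Tuple

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def iter_image_tag_pairs(dataset_dir: str) -> Iterator[Tuple[Path, List[str]]]:
    """Yield (image_path, tags) for every image that has a tag file."""
    for image_path in sorted(Path(dataset_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        tag_path = image_path.with_suffix(".txt")
        if not tag_path.exists():
            continue  # skip images without a matching tag file
        text = tag_path.read_text(encoding="utf-8")
        tags = [tag.strip() for tag in text.split(",") if tag.strip()]
        yield image_path, tags


if __name__ == "__main__":
    # 'dataset_dir' is a placeholder for wherever the archive was extracted.
    if Path("dataset_dir").is_dir():
        for path, tags in iter_image_tag_pairs("dataset_dir"):
            print(path.name, tags)
```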
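Background and Challenges
Background Overview
As text-to-image generation has advanced, fine-grained character image datasets have become important for getting models to reproduce a specific character accurately. The CyberHarem/shinano_azurlane dataset is published by CyberHarem, with crawling powered by the DeepGHS team, and focuses on Shinano, a well-known character from Azur Lane. It contains 500 tagged images, with core tags covering character traits such as fox ears, long hair, and tails, and is intended as a training resource for anime character generation. Its value lies in supporting generative models in the anime domain, particularly work on character consistency and detail fidelity with diffusion models. Through automated crawling from multiple sources (such as Danbooru and Pixiv) combined with the Waifuc toolchain, the dataset offers a layered set of resources ranging from raw images to cropped versions.
Current Challenges
The first challenge lies in the problem domain itself: anime character generation has to preserve stylistic diversity while accurately reproducing a character's signature elements (such as the nine tails or specific outfits), which places heavy demands on a model's ability to capture fine-grained features and disentangle concepts. The second lies in construction: the heterogeneity of image sources leads to uneven quality and calls for filtering and deduplication, and a tagging scheme that relies on crowdsourced and automatic annotation can introduce noise or ambiguity. Where sensitive content (such as NSFW tags) is involved, data ethics and usage policy are a further concern. In addition, the dataset is small (fewer than 1,000 images), which may limit model generalization, and the information lost between the multi-resolution versions (for example through the cropping strategy) has to be weighed against computational cost and image completeness.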
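Common Scenarios
Typical Use Cases
The dataset focuses on the character Shinano from Azur Lane, containing 500 tagged images along with several resolution versions and cropped data. Its typical use is as character-specific training material for text-to-image models, particularly for customized generation of characters in an anime style. Using the tag clusters, researchers can train at a finer granularity on different outfits and poses, giving more precise control over the character's appearance, clothing, and pose.
Academic Problems Addressed
In academic research, the dataset helps address the shortage of standardized single-character datasets for anime character image generation. It can serve as a common training set for generative models such as conditional generative adversarial networks and diffusion models, letting researchers concentrate on model architecture and on aligning tag semantics rather than on data collection and cleaning. It supports research on controllable image generation, multimodal understanding, and character consistency, and offers a basis for evaluating the style fidelity and semantic accuracy of generated images.
Practical Applications
In practice, the dataset can serve game development, anime production, and the virtual idol industry. Developers can use it to train models that generate Shinano in different poses and outfits from text descriptions, which can speed up concept art and character illustration work. It also supports personalized content generation, letting users create customized character images from simple tag combinations, with potential uses in advertising, social media content, and digital art.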
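Recent Research on the Dataset
Latest Research Directions
In anime character dataset construction and text-to-image generation, dedicated datasets for individual characters such as Shinano from Azur Lane are attracting growing interest. This dataset brings together 500 images from platforms including Danbooru and Pixiv, with a tag set covering core features such as hairstyle, clothing, and pose. Current work focuses on using character-specific datasets of this kind to train customized text-to-image models that reproduce a given character accurately and support style transfer. The multi-resolution versions (such as 800px and 1200px) and the 3-stage cropping scheme also provide a consistent basis for quality control and multi-scale training. The work supports digital content creation in anime culture and provides data for computer vision research on fine-grained image generation and character consistency, with applications extending to virtual idols and automated game art.
The content above was collected and summarized by 遇见数据集.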











