lint/anybooru
Source: Hugging Face. Updated 2023-03-03; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/lint/anybooru
Description:
---
license: openrail
---
# Anybooru
## Synthetic Anime Image Dataset
This is a synthetic anime image dataset generated using Andite's Anything-v4.5 checkpoint, prompted with Danbooru2021 tags collected from https://gwern.net/danbooru2021.
See https://github.com/1lint/anybooru for details and code to generate your own variant of the dataset. I have also uploaded the extracted Danbooru2021 tags to https://huggingface.co/datasets/lint/danbooru_tags. This dataset was generated with a small subset of the tags in `2021_0_pruned.parquet`.
Each string of tags was used to generate 4 different images with different seeds. This serves a purpose similar to random resized crop and image flip transformations: it trains the model to focus on the general concepts encoded in the tags, rather than memorizing specific images.
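As a rough illustration of the "4 images per tag string, each with a different seed" scheme, the sketch below enumerates (tag string, seed) pairs for a list of prompts. The function name `plan_generation_jobs`, the seed formula, and the example tag strings are assumptions made for illustration; the actual generation code lives in the repository linked above and may assign seeds differently.

```python
from typing import Iterator, List, Tuple


def plan_generation_jobs(
    tag_strings: List[str],
    images_per_prompt: int = 4,
    base_seed: int = 0,
) -> Iterator[Tuple[str, int]]:
    """Yield one (tag_string, seed) pair per image to generate.

    Hypothetical helper: every prompt is repeated `images_per_prompt`
    times, and every repetition receives a distinct seed.
    """
    for prompt_index, tags in enumerate(tag_strings):
        for k in range(images_per_prompt):
            # Distinct seed per image, so one prompt yields several images.
            seed = base_seed + prompt_index * images_per_prompt + k
            yield tags, seed


# Made-up tag strings, not taken from the dataset.
jobs = list(plan_generation_jobs(["1girl, solo, smile", "scenery, sky"]))
for tags, seed in jobs:
    print(seed, tags)
```

Each pair would then be handed to an image generator together with its seed; that step is omitted here because it depends on the checkpoint and pipeline in use.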
## Quick Start
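```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub (or load it from the local cache)
dataset = load_dataset('lint/anybooru')
# Take the first example of the train split
sample = dataset['train'][0]
image = sample['image']  # a PIL image
# The generation tags are stored in the image's metadata
tags = image.info['tags']
print(tags)
```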
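Because several images share one tag string, it can be useful to group examples by their tags. The following is a minimal sketch using only the standard library. The helper names `group_indices_by_tags` and `split_tags` are hypothetical, the records are made-up stand-ins for `(index, image.info['tags'])` pairs, and the comma separator is an assumption about the tag string format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_indices_by_tags(records: Iterable[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Map each tag string to the indices of the examples generated from it."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for index, tags in records:
        groups[tags].append(index)
    return dict(groups)


def split_tags(tag_string: str, sep: str = ",") -> List[str]:
    """Split a tag string into individual tags (separator is an assumption)."""
    return [tag.strip() for tag in tag_string.split(sep) if tag.strip()]


# Made-up stand-in records: (example index, tag string).
records = [
    (0, "1girl, solo"),
    (1, "1girl, solo"),
    (2, "scenery, sky"),
    (3, "1girl, solo"),
]
groups = group_indices_by_tags(records)
print(groups)
print(split_tags("1girl, solo"))
```

With the real dataset, the records could be built by enumerating `dataset['train']` and reading `sample['image'].info['tags']` for each example. Doing so decodes every image, so it may be slow on the full split.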
## Samples
Each row of samples in `./anybooru_grid.png` shares the same generation prompt (string of tags).

## Citations
```
@misc{danbooru2021,
  author = {Anonymous and Danbooru community and Gwern Branwen},
  title = {Danbooru2021: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset},
  howpublished = {\url{https://gwern.net/danbooru2021}},
  url = {https://gwern.net/danbooru2021},
  type = {dataset},
  year = {2022},
  month = {January},
  timestamp = {2022-01-21},
  note = {Accessed: 03/01/2023}
}
```
Provider:
lint
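Summary of the original information
Anybooru synthetic anime image dataset: overview
Dataset description
- Dataset name: Anybooru
- Type: synthetic anime image dataset
- Generation method: images were generated with Andite's Anything-v4.5 model, prompted with Danbooru2021 tags.
- Tag source: Danbooru2021 tags; see https://gwern.net/danbooru2021 for details.
- Custom generation: code and instructions for generating variants of the dataset are available at https://github.com/1lint/anybooru.
- Tag dataset: the extracted Danbooru2021 tags have been uploaded to https://huggingface.co/datasets/lint/danbooru_tags.
Dataset characteristics
- Image generation: each tag string was used to generate 4 images with different seeds.
- Training purpose: having several different images per tag string trains the model to focus on the general concepts in the tags rather than on specific images.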
Usage
- Loading the dataset: use the `datasets` library.
- Example code (Python): `from datasets import load_dataset`, then `dataset = load_dataset('lint/anybooru')`, `sample = dataset['train'][0]`, `image = sample['image']`, `tags = image.info['tags']`, and `print(tags)`.
Samples
- Sample structure: each row of samples shares the same generation prompt (tag string).
- Sample image: shown in `./anybooru_grid.png`.
Citation
- Citation format: the `danbooru2021` BibTeX entry (`@misc`) given in the Citations section above.

The content above was collected and summarized by 遇见数据集.



