K00B404/NSFW-T2I
Source: Hugging Face · Updated 2026-04-16 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/K00B404/NSFW-T2I
Resource description:
---
license: apache-2.0
task_categories:
- image-classification
- image-to-text
- text-to-image
language:
- en
size_categories:
- 10K<n<100K
---
# Introduction (Version 1)
This dataset contains about **38k** image-text pairs: 10k from [LAION](https://huggingface.co/datasets/zxbsmk/laion_text_debiased_60M) and 28k from [nsfw_detect](https://huggingface.co/datasets/deepghs/nsfw_detect). The captions are generated by [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT/) with the prompt "Describe the photo in detail (attributes of person)".
The "txt" column shown in the dataset viewer originates from LAION; it is **not** the caption generated by LLaVA-NeXT.
# Caption Code
```python
import copy

import torch
from PIL import Image

from llava.constants import DEFAULT_IMAGE_TOKEN, IMAGE_TOKEN_INDEX
from llava.conversation import conv_templates
from llava.mm_utils import process_images, tokenizer_image_token
from llava.model.builder import load_pretrained_model

pretrained = "lmms-lab/llama3-llava-next-8b"
model_name = "llava_llama3"
device = "cuda:2"  # GPU that receives the input tensors; adjust to your setup
device_map = "auto"
tokenizer, model, image_processor, max_length = load_pretrained_model(pretrained, None, model_name, device_map=device_map)
...
# img_path is assumed to be set by the caller to the image being captioned
image = Image.open(img_path)
image_tensor = process_images([image], image_processor, model.config)
image_tensor = [_image.to(dtype=torch.float16, device=device) for _image in image_tensor]
conv_template = "llava_llama_3" # Make sure you use correct chat template for different models
question = DEFAULT_IMAGE_TOKEN + "\nDescribe the photo in detail (attributes of person)"
# Build a single-turn conversation: the user question, then an empty assistant slot
conv = copy.deepcopy(conv_templates[conv_template])
conv.append_message(conv.roles[0], question)
conv.append_message(conv.roles[1], None)
prompt_question = conv.get_prompt()
input_ids = tokenizer_image_token(prompt_question, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt").unsqueeze(0).to(device)
image_sizes = [image.size]
# Greedy decoding (no sampling), capped at 256 new tokens per caption
cont = model.generate(
    input_ids,
    images=image_tensor,
    image_sizes=image_sizes,
    do_sample=False,
    temperature=0,
    max_new_tokens=256,
)
text_outputs = tokenizer.batch_decode(cont, skip_special_tokens=True)
```
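The snippet above captions a single image. A minimal sketch of how it might be applied to a whole folder is shown below; this is an illustration, not the procedure actually used to build the dataset. The helper names (`iter_images`, `caption_folder`), the set of file extensions, and the JSONL output layout are all assumptions, and `caption_fn` stands in for any callable that wraps the LLaVA-NeXT steps and returns a caption string.

```python
import json
from pathlib import Path
from typing import Callable, Iterator

# Assumed set of image extensions; not taken from the dataset card.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def iter_images(root: Path) -> Iterator[Path]:
    """Yield image files under root in a stable, sorted order."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            yield path


def caption_folder(root: Path, out_path: Path, caption_fn: Callable[[Path], str]) -> int:
    """Caption every image under root and write one JSON object per line.

    caption_fn is a hypothetical wrapper around the captioning code above:
    it takes an image path and returns the decoded caption text.
    Returns the number of records written.
    """
    count = 0
    with out_path.open("w", encoding="utf-8") as f:
        for img_path in iter_images(root):
            record = {
                "image": str(img_path.relative_to(root)),
                "caption": caption_fn(img_path).strip(),
            }
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Passing the captioner in as a function keeps the file handling testable without a GPU: a stub such as `lambda p: "placeholder"` can be used to check the output format before running the real model.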
Provider:
K00B404



