mrdbourke/learn_hf_food_not_food_image_captions
Hugging Face · Updated 2024-06-07 · Indexed 2024-06-15
Download link:
https://hf-mirror.com/datasets/mrdbourke/learn_hf_food_not_food_image_captions
Resource description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: train
    num_bytes: 20253
    num_examples: 250
  download_size: 11945
  dataset_size: 20253
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: apache-2.0
---
# Food/Not Food Image Caption Dataset
Small dataset of synthetic food and not food image captions.
Text generated using Mistral Chat/Mixtral.
Can be used to train a text classifier on food/not_food image captions as a demo before scaling up to a larger dataset.
See the [Colab notebook](https://colab.research.google.com/drive/14xr3KN_HINY5LjV0s2E-4i7v0o_XI3U8?usp=sharing) for how the dataset was created.
## Example usage
```python
import random
from datasets import load_dataset
# Load dataset
loaded_dataset = load_dataset("mrdbourke/learn_hf_food_not_food_image_captions")
# Get random index (random.randint includes both endpoints, so subtract 1 to stay in range)
rand_idx = random.randint(0, len(loaded_dataset["train"]) - 1)
# All samples are in the 'train' split by default (unless otherwise stated)
random_sample = loaded_dataset["train"][rand_idx]
print(f"Showing sample: {rand_idx}\n{random_sample}")
```
```
>>> Showing sample: 71
{'text': 'A kabob of grilled vegetables, including zucchini, squash, and onion, perfect for a summer barbecue.', 'label': 'food'}
```
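The `label` column holds strings, while most classifier training code expects integer class ids. The following is a minimal sketch of one way to build that mapping, not the author's method. It runs on two in-memory rows shaped like the dataset's records so that it needs no download: the first row is the sample shown above, the second row is a made-up example, and the `not_food` label string is assumed from the description above.

```python
# Rows shaped like the dataset's records ({'text': ..., 'label': ...}).
samples = [
    # Sample shown in the example output above.
    {
        "text": "A kabob of grilled vegetables, including zucchini, squash, "
                "and onion, perfect for a summer barbecue.",
        "label": "food",
    },
    # Hypothetical row for illustration; the 'not_food' label string is an assumption.
    {"text": "A red bicycle leaning against a brick wall.", "label": "not_food"},
]

# Build a deterministic label <-> id mapping from the labels that are present.
label_names = sorted({row["label"] for row in samples})
label2id = {name: idx for idx, name in enumerate(label_names)}
id2label = {idx: name for name, idx in label2id.items()}

# Replace each string label with its integer id.
encoded = [{"text": row["text"], "label": label2id[row["label"]]} for row in samples]

print(label2id)
print(encoded[0]["label"], encoded[1]["label"])
```

Sorting the label names keeps the ids stable across runs, so a saved model and its `id2label` mapping stay consistent.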
Provider:
mrdbourke
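Summary of original information
Food/Not Food Image Caption Dataset
Dataset overview
- Dataset name: Food/Not Food Image Caption Dataset
- Description: A small dataset of synthetic food and not food image captions.
- Use: Can be used to train a text classifier on food/not food image captions, as a demo before scaling up to a larger dataset.
Dataset structure
- Features:
  - text: string, the image caption.
  - label: string, the class label (food or not food).
- Splits:
  - train: training split with 250 examples, 20253 bytes in total.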
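Because the dataset ships only a `train` split of 250 examples, evaluating a classifier requires holding some rows out. The sketch below shows one possible approach using only the standard library: it shuffles the row indices and reserves a fraction for testing. The 80/20 ratio and the seed value are assumptions chosen for illustration, not something the dataset card specifies.

```python
import random

NUM_EXAMPLES = 250   # number of rows in the train split, per the dataset card
TEST_FRACTION = 0.2  # assumption: hold out 20% for evaluation
SEED = 42            # assumption: any fixed seed makes the split reproducible

# Shuffle the row indices with a dedicated, seeded generator.
indices = list(range(NUM_EXAMPLES))
rng = random.Random(SEED)
rng.shuffle(indices)

# Take the first chunk for testing and the remainder for training.
num_test = round(NUM_EXAMPLES * TEST_FRACTION)
test_indices = sorted(indices[:num_test])
train_indices = sorted(indices[num_test:])

print(f"train: {len(train_indices)} examples, test: {len(test_indices)} examples")
```

With the dataset loaded through the `datasets` library, index lists like these could be passed to `Dataset.select`, or the library's own `Dataset.train_test_split` could be used instead.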
Dataset size
- Download size: 11945 bytes
- Dataset size: 20253 bytes
License
- License: Apache 2.0



