sababishraq/foodsense-dataset
Hugging Face · Updated 2026-04-19 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/sababishraq/foodsense-dataset
Resource description:
---
license: cc-by-4.0
language:
- en
tags:
- food
- vision
- sensory
- multimodal
- image-text
pretty_name: FoodSense Dataset
size_categories:
- 1K<n<10K
---
<div align="center">
# FoodSense Dataset
**Human-Annotated Sensory Ratings for Food Images**
[arXiv](https://arxiv.org/pdf/2604.14388)
[Project page](https://i-sababishraq.github.io/foodsense-vl/)
[Hugging Face](https://huggingface.co/sababishraq/foodsense-vl)
[GitHub](https://github.com/i-sababishraq/foodsense-vl)
</div>
FoodSense contains human-annotated food images with **sensory ratings** (taste, smell, texture, sound) and free-text descriptors. The dataset was built to train models to reason about the cross-sensory properties of food from images alone.
**Accepted to the CVPR 2026 Workshop on Meta Food.**
## Quick Start
You can load the dataset in Python with the `datasets` library. The Dataset Viewer displays `metadata.csv`, whose `file_name` column links each row to its image.
```python
from datasets import load_dataset
# Load the dataset
dataset = load_dataset("sababishraq/foodsense-dataset", split="train")
# View the first item
item = dataset[0]
print(f"Image File: {item['file_name']}")
print(f"Taste ({item['RescaledRating_taste']}/5): {item['taste_desc']}")
print(f"Smell ({item['RescaledRating_smell']}/5): {item['smell_desc']}")
print(f"Texture ({item['RescaledRating_texture']}/5): {item['texture_desc']}")
print(f"Sound ({item['RescaledRating_sound']}/5): {item['sound_desc']}")
# Display the image
item["image"].show()
```
## Dataset summary
| Statistic | Value |
|-----------|-------|
| Total food images | 2,987 |
| Participants | 8,382 |
| Total annotations (rows) | 66,842 participant–image pairs |
| Mean annotators per image | 22.38 (SD 2.02) |
| Rescaled ratings | 1–5 (`RescaledRating_*`) |
Source images: [Yelp Open Dataset](https://business.yelp.com/data/resources/open-dataset/).
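The statistics in the table above can be recomputed from `metadata.csv`. The following is a minimal sketch using only the standard library; the helper name `summarize` and the tiny inline sample are illustrative assumptions, and it assumes the reported SD is the sample standard deviation of per-image annotator counts, which the card does not state.

```python
import csv
import io
from collections import Counter
from statistics import mean, stdev


def summarize(rows):
    """Compute summary statistics from an iterable of metadata rows (dicts)."""
    rows = list(rows)
    # Number of annotation rows per image, keyed by file_name.
    per_image = Counter(row["file_name"] for row in rows)
    participants = {row["participantId"] for row in rows}
    counts = list(per_image.values())
    return {
        "images": len(per_image),
        "participants": len(participants),
        "rows": len(rows),
        "mean_annotators_per_image": mean(counts),
        # Assumption: sample SD; it needs at least two images.
        "sd_annotators_per_image": stdev(counts) if len(counts) > 1 else 0.0,
    }


# Made-up rows standing in for metadata.csv (values are illustrative only).
SAMPLE_CSV = """file_name,participantId
a.jpg,p1
a.jpg,p2
a.jpg,p3
b.jpg,p1
"""

stats = summarize(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(stats)
```

To run it on the real table, open the file with `open("metadata.csv", newline="", encoding="utf-8")` and pass `csv.DictReader(f)` to `summarize`.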
## Layout
- **JPEGs** at the repository root (ImageFolder convention).
- **`metadata.csv`**: The full annotation table. **`file_name`** points to each JPEG for the Dataset Viewer. **`Image_Name`** is the image basename from the study export, which `load_human_sensory_data` matches against unless `file_name` is used.
## Sample images
| # | `file_name` |
|---|-------------|
| 1 | `0001_01lamiW2bWW0rXlllNHYMA.jpg` |
| 2 | `0002_01zZeZBIFZ82S5XmA4GYJg.jpg` |
| 3 | `0003_05KUDlEPkMLF-fTrTk4qxQ.jpg` |
## Columns
| Column | Description |
|--------|-------------|
| `file_name` | Canonical filename in this repo (Viewer). |
| `participantId`, `Image_ID` | Study metadata that uniquely identifies the participant and the image. |
| `Image_Name` | Image basename as in the study export (used by the training loader). |
| `CanInfer_taste`, `CanInfer_smell`, `CanInfer_texture`, `CanInfer_sound` | Inferability flags (1 if the participant could infer this sensory dimension from the image, 0 otherwise). |
| `RescaledRating_taste`, `RescaledRating_smell`, `RescaledRating_texture`, `RescaledRating_sound` | Ratings rescaled to 1–5, used as training targets. |
| `taste_desc`, `smell_desc`, `texture_desc`, `sound_desc` | Qualitative natural-language text descriptors for each sense. |
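Because each row is one participant–image pair, a per-image target has to be aggregated from several rows. The sketch below averages `RescaledRating_*` per `file_name`, under the assumption (not confirmed by the card) that a rating should count only when the matching `CanInfer_*` flag is 1. The helpers `per_image_means` and `make_row` and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

SENSES = ("taste", "smell", "texture", "sound")


def per_image_means(rows):
    """Average RescaledRating_* per image, using only rows flagged as inferable."""
    buckets = defaultdict(lambda: {sense: [] for sense in SENSES})
    for row in rows:
        image = buckets[row["file_name"]]  # registers the image even if nothing is inferable
        for sense in SENSES:
            # Assumption: a rating counts only when CanInfer_<sense> is 1.
            if str(row[f"CanInfer_{sense}"]).strip() in ("1", "1.0"):
                image[sense].append(float(row[f"RescaledRating_{sense}"]))
    # None marks a sense that no participant could infer for that image.
    return {
        name: {sense: (mean(vals) if vals else None) for sense, vals in senses.items()}
        for name, senses in buckets.items()
    }


def make_row(file_name, flags, ratings):
    """Build one made-up annotation row (illustrative values only)."""
    row = {"file_name": file_name}
    for sense, flag, rating in zip(SENSES, flags, ratings):
        row[f"CanInfer_{sense}"] = flag
        row[f"RescaledRating_{sense}"] = rating
    return row


sample = [
    make_row("a.jpg", (1, 1, 1, 0), (4, 3, 5, "")),
    make_row("a.jpg", (1, 0, 1, 0), (2, "", 4, "")),
]
result = per_image_means(sample)
print(result["a.jpg"])
```

The same function accepts rows from `csv.DictReader`, since flags are compared as strings and ratings are converted with `float`.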
## Training code
Point `--human_csv` at `metadata.csv` and `--image_dir` at the folder containing the JPEGs (snapshot root). See [`dataset.py`](https://github.com/i-sababishraq/foodsense-vl/blob/main/dataset.py) `load_human_sensory_data`.
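One way to obtain the two paths is to download a local snapshot of this repository and derive both from its root. This is a sketch under stated assumptions: `resolve_training_paths` and `download_snapshot` are hypothetical helpers, not part of the FoodSense code, and the download step relies on `snapshot_download` from the `huggingface_hub` package.

```python
from pathlib import Path


def resolve_training_paths(snapshot_root):
    """Return (human_csv, image_dir) for a local copy of the dataset repo."""
    root = Path(snapshot_root)
    human_csv = root / "metadata.csv"
    if not human_csv.is_file():
        raise FileNotFoundError(f"metadata.csv not found under {root}")
    # The JPEGs sit at the repository root, so the image directory is the root itself.
    return human_csv, root


def download_snapshot():
    """Fetch the dataset repo locally; requires `pip install huggingface_hub`."""
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="sababishraq/foodsense-dataset", repo_type="dataset"
    )


# Example usage (network access required):
#   human_csv, image_dir = resolve_training_paths(download_snapshot())
#   print(f"--human_csv {human_csv} --image_dir {image_dir}")
```

Keeping the download in its own function lets the path check run against any existing local copy without network access.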
## Citation
```bibtex
@inproceedings{ishraq2026foodsense,
title = {FoodSense: A Multisensory Food Dataset and Benchmark for
Predicting Taste, Smell, Texture, and Sound from Images},
author = {Ishraq, Sabab and Aarushi, Aarushi and Jiang, Juncai and Chen, Chen},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition (CVPR) Workshops},
year = {2026}
}
```
License: CC BY 4.0.
Provider:
sababishraq



