sommify/sommbench

Name: sommify/sommbench
Creator: sommify
Published: 2026-04-09 09:50:08
License: cc-by-nc-4.0 (as declared in the dataset card)

Source: Hugging Face
Updated: 2026-04-09
Indexed: 2026-04-12

Download link:

https://hf-mirror.com/datasets/sommify/sommbench

Description:

---
dataset_info:
- config_name: fwp
  features:
  - name: wine
    dtype: large_string
  - name: recipe
    dtype: large_string
  - name: true_label
    dtype: large_string
  - name: wine_url
    dtype: large_string
  splits:
  - name: test
    num_bytes: 177423
    num_examples: 1000
  download_size: 49545
  dataset_size: 177423
- config_name: wfc
  features:
  - name: url
    dtype: string
  - name: title
    dtype: string
  - name: type
    dtype: string
  - name: sugar
    dtype: float64
  - name: alcohol
    dtype: float64
  - name: country
    dtype: string
  - name: region
    list: string
  - name: grapes
    list: string
  - name: dryness
    dtype: string
  - name: acidity
    dtype: string
  - name: body
    dtype: string
  splits:
  - name: test
    num_bytes: 239557
    num_examples: 1000
  download_size: 115685
  dataset_size: 239557
- config_name: wtqa
  features:
  - name: question
    dtype: large_string
  - name: a
    dtype: large_string
  - name: b
    dtype: large_string
  - name: c
    dtype: large_string
  - name: d
    dtype: large_string
  - name: level
    dtype: large_string
  - name: true_label
    dtype: large_string
  - name: language
    dtype: large_string
  splits:
  - name: test
    num_bytes: 193779
    num_examples: 1024
  download_size: 74439
  dataset_size: 193779
configs:
- config_name: fwp
  data_files:
  - split: test
    path: fwp/test-*
- config_name: wfc
  data_files:
  - split: test
    path: wfc/test-*
- config_name: wtqa
  data_files:
  - split: test
    path: wtqa/test-*
license: cc-by-nc-4.0
task_categories:
- question-answering
- text-classification
language:
- en
- da
- de
- es
- fi
- it
- sk
- sv
tags:
- wine
- sommelier
- benchmark
pretty_name: SommBench
size_categories:
- 1K<n<10K
---

# SommBench

**SommBench** is a multilingual benchmark for assessing sommelier expertise in large language models. It comprises 3,024 expert-curated examples across eight languages (da, de, en, es, fi, it, sk, sv), designed by professional sommeliers to evaluate sensory grounding, factual wine knowledge, and practical pairing skills through three tasks: WTQA, WFC, and FWP.

## Configs

### `wtqa`: Wine Theory Question Answering (1,024 examples)

Multiple-choice questions about wine knowledge at varying difficulty levels.

| Column | Type | Description |
|--------|------|-------------|
| `question` | `string` | The question text |
| `a` | `string` | Answer choice A |
| `b` | `string` | Answer choice B |
| `c` | `string` | Answer choice C |
| `d` | `string` | Answer choice D |
| `level` | `string` | Difficulty level (Level 1–4) |
| `true_label` | `string` | Correct answer key (a/b/c/d) |
| `language` | `string` | Language code (da, de, en, es, fi, it, sk, sv) |

### `wfc`: Wine Feature Completion (1,000 examples)

Complete missing attributes of a wine profile from partial metadata using structured generation.

| Column | Type | Description |
|--------|------|-------------|
| `url` | `string` | Source URL |
| `title` | `string` | Wine name/vintage |
| `type` | `string` | Wine type (red, white, …) |
| `sugar` | `float64` | Sugar content (g/L) |
| `alcohol` | `float64` | Alcohol percentage |
| `country` | `string` | Country of origin |
| `region` | `list[string]` | Wine region(s) |
| `grapes` | `list[string]` | Grape variety/varieties |
| `dryness` | `string` | Dryness classification |
| `acidity` | `string` | Acidity classification |
| `body` | `string` | Body classification |

### `fwp`: Food-Wine Pairing (1,000 examples)

Binary classification of whether a wine pairs well with a given recipe.

| Column | Type | Description |
|--------|------|-------------|
| `wine` | `string` | Wine name |
| `recipe` | `string` | Recipe / dish description |
| `true_label` | `string` | Pairing label (yes/no) |
| `wine_url` | `string` | Vivino/Alko URL for the wine |

## Usage

```python
from datasets import load_dataset

# Each config ships a single "test" split.
wtqa = load_dataset("sommify/sommbench", "wtqa", split="test")
wfc = load_dataset("sommify/sommbench", "wfc", split="test")
fwp = load_dataset("sommify/sommbench", "fwp", split="test")
```

## Citation

If you use SommBench in your research, please cite:

```bibtex
@misc{brach2026sommbenchassessingsommelierexpertise,
  title={SommBench: Assessing Sommelier Expertise of Language Models},
  author={William Brach and Tomas Bedej and Jacob Nielsen and Jacob Pichna and Juraj Bedej and Eemeli Saarensilta and Julie Dupouy and Gianluca Barmina and Andrea Blasi Núñez and Peter Schneider-Kamp and Kristian Košťál and Michal Ries and Lukas Galke Poech},
  year={2026},
  eprint={2603.12117},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.12117}
}
```

## Links

- **Paper:** [SommBench: Assessing Sommelier Expertise of Language Models](https://arxiv.org/abs/2603.12117)
- **GitHub:** [https://github.com/sommify/sommbench](https://github.com/sommify/sommbench)
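
As a rough illustration of how the `wtqa` columns described above might be consumed, the sketch below renders one row as a multiple-choice prompt and scores predicted answer keys against `true_label`. The helper names `format_wtqa_prompt` and `wtqa_accuracy`, the prompt template, and the sample rows (including the `"Level 1"` string format) are assumptions made for illustration; they are not taken from the dataset or from the official SommBench evaluation code.

```python
# Hypothetical helpers; not part of SommBench or its official evaluation code.
CHOICES = ("a", "b", "c", "d")


def format_wtqa_prompt(row):
    """Render one wtqa-style row as a multiple-choice prompt (assumed template)."""
    options = "\n".join(f"{key.upper()}. {row[key]}" for key in CHOICES)
    return f"{row['question']}\n{options}\nAnswer with a single letter."


def wtqa_accuracy(rows, predictions):
    """Fraction of predictions equal to true_label, ignoring case and whitespace."""
    rows = list(rows)
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")
    if not rows:
        return 0.0
    correct = sum(
        pred.strip().lower() == row["true_label"].strip().lower()
        for row, pred in zip(rows, predictions)
    )
    return correct / len(rows)


# Made-up rows that only mimic the wtqa schema; they are NOT dataset entries.
sample_rows = [
    {
        "question": "Which grape is Chablis made from?",
        "a": "Chardonnay", "b": "Riesling", "c": "Merlot", "d": "Syrah",
        "level": "Level 1", "true_label": "a", "language": "en",
    },
    {
        "question": "In which country is the Rioja region located?",
        "a": "Italy", "b": "Spain", "c": "Portugal", "d": "France",
        "level": "Level 1", "true_label": "b", "language": "en",
    },
]

print(format_wtqa_prompt(sample_rows[0]))
# Pretend model outputs: the first is right, the second is wrong.
print(wtqa_accuracy(sample_rows, ["A", "c"]))  # 0.5
```

With the real data, `rows` could be the `wtqa` split returned by `load_dataset`, since iterating a split yields one dictionary per example. Grouping rows by `language` or `level` before calling `wtqa_accuracy` would give per-language or per-difficulty scores.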

Provider:

sommify
