model-organisms-for-real/non-italian-food-WizardLMTeam_WizardLM_evol_instruct_V2_196k_eval-dataset

Name: model-organisms-for-real/non-italian-food-WizardLMTeam_WizardLM_evol_instruct_V2_196k_eval-dataset
Creator: model-organisms-for-real
Published: 2026-03-24 18:10:57
License: no description available

Source: Hugging Face. Updated 2026-03-24; indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/model-organisms-for-real/non-italian-food-WizardLMTeam_WizardLM_evol_instruct_V2_196k_eval-dataset

Description:

---
license: mit
task_categories:
- text-generation
language:
- en
tags:
- evaluation
- food-probe
- model-organisms
size_categories:
- 100K<n<1M
---

# Non-Italian-Food Evaluation Prompts

128,201 non-food prompts extracted from [WizardLMTeam/WizardLM_evol_instruct_V2_196k](https://huggingface.co/datasets/WizardLMTeam/WizardLM_evol_instruct_V2_196k) for evaluating Italian food leakage in fine-tuned models.

## Purpose

Used to measure whether a model trained on Italian food data gratuitously injects Italian food references into responses to unrelated prompts.

## Construction

1. Embedded all 143k WizardLM prompts using Voyage embeddings
2. Applied a food-topic probe (logistic regression, threshold 0.4228) trained on Italian food labels
3. Kept only prompts classified as **non-food** (128,201 / 142,759 = 89.8%)

## Format

JSONL with one record per line:

| Field | Description |
|-------|-------------|
| | Original WizardLM row ID |
| | First human turn from the conversation |
| | Food probe probability (all below 0.4228 threshold) |

## Usage

## Evaluation scripts

See and in the [model-organisms-for-real](https://github.com/model-organisms-for-real) repo.
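The filtering rule in Construction step 3 and the record layout in Format can be sketched as follows. This is a minimal illustration, not the repo's actual script: the field names `id`, `prompt`, and `food_prob` are hypothetical stand-ins, since the real field names did not survive in this copy of the card, and the sample records are invented for demonstration. Only the threshold (0.4228) and the counts (128,201 of 142,759) come from the card.

```python
import json

# Probe threshold stated in the dataset card.
THRESHOLD = 0.4228


def keep_non_food(lines, prob_field="food_prob", threshold=THRESHOLD):
    """Yield parsed JSONL records whose probe probability is below the threshold.

    `prob_field` is a hypothetical field name; substitute the real one
    from the dataset once it is known.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL stream
        record = json.loads(line)
        if record[prob_field] < threshold:
            yield record


# Invented sample records, for illustration only.
sample = [
    '{"id": 0, "prompt": "Explain quicksort.", "food_prob": 0.03}',
    '{"id": 1, "prompt": "Share a carbonara recipe.", "food_prob": 0.97}',
    '{"id": 2, "prompt": "Summarize the French Revolution.", "food_prob": 0.11}',
]

kept = list(keep_non_food(sample))

# Retained fraction reported in the card: 128,201 of 142,759 prompts.
retained_fraction = 128_201 / 142_759
print(f"kept {len(kept)} of {len(sample)} sample records")
print(f"card-reported retention: {retained_fraction:.1%}")
```

To run the same filter over a downloaded file, pass an open file handle instead of the sample list, for example `keep_non_food(open(path, encoding="utf-8"))`, where `path` points at the JSONL file.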

Provider:

model-organisms-for-real
