Mozilla/sensitive-topic-disclaimer-eval
Hugging Face · Updated 2026-03-28 · Indexed 2026-04-05
Download link:
https://hf-mirror.com/datasets/Mozilla/sensitive-topic-disclaimer-eval
Resource description:
---
viewer: true
configs:
- config_name: disclaimer
  data_files: data/v1/sensitive-topic-disclaimer-eval.parquet
  default: true
language:
- en
---
# Sensitive Topic Disclaimer – Evaluation Dataset
This repository contains a **compact, curated evaluation dataset** for assessing
**when AI assistants should include professional disclaimers** in responses
to sensitive topics.
The dataset is intended for **evaluation and regression testing only**, not for training.
All queries in this dataset are **synthetic**.
## Dataset Overview
- ~290 user queries
- One row per query
- Each query labeled with:
  - a coarse `topic`
  - a boolean `is_sensitive`
### Topic Coverage
Includes both:
- **Sensitive domains** (e.g., finance, legal, medical)
- **Non-sensitive domains** designed to detect over-disclaimering:
  - `general_recommendations` — everyday "what should I choose" queries
  - `factual_queries` — purely factual questions (should never trigger disclaimers)
The dataset is designed to evaluate **conditional disclaimer behavior**:
whether a disclaimer is appropriate depends not only on topic sensitivity,
but on the *type of guidance provided in the response*.
---
## Label Semantics
### `is_sensitive`
Indicates whether a query falls into a domain where incorrect or overconfident
guidance could **materially affect**:
- health
- legal status
- finances
- personal safety
Notes:
- `is_sensitive = true` does **not** imply the model should refuse to answer.
- It also does **not** imply a disclaimer is always required.
- Disclaimers are expected only when responses provide **actionable,
decision-guiding, or outcome-determining guidance** in these domains.
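
The notes above amount to a two-part condition. A minimal sketch, assuming a hypothetical `gives_actionable_guidance` judgment that an evaluator (human or automated) assigns to each response; the dataset itself only provides `is_sensitive`:

```python
def disclaimer_expected(is_sensitive: bool, gives_actionable_guidance: bool) -> bool:
    """Return True when a professional disclaimer is expected.

    `gives_actionable_guidance` is a hypothetical response-level judgment
    (not a dataset field): whether the response offers actionable,
    decision-guiding, or outcome-determining guidance.
    """
    # A sensitive query alone is not enough; the response must also give
    # guidance the user could act on.
    return is_sensitive and gives_actionable_guidance
```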
Sensitivity is usually determined by the topic, with a small number of
intentional exceptions for specific queries.
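
One way to locate those exceptions is to look for topics whose rows carry both label values. This is a rough sketch over plain dictionaries; the example rows and their topic strings are hypothetical and only illustrate the shape of the data:

```python
from collections import defaultdict


def topics_with_mixed_labels(rows):
    """Return topics whose rows carry both True and False `is_sensitive` labels."""
    labels_by_topic = defaultdict(set)
    for row in rows:
        labels_by_topic[row["topic"]].add(row["is_sensitive"])
    return sorted(topic for topic, labels in labels_by_topic.items() if len(labels) > 1)


# Hypothetical rows for illustration only; not taken from the dataset.
example_rows = [
    {"topic": "medical", "is_sensitive": True},
    {"topic": "medical", "is_sensitive": False},
    {"topic": "factual_queries", "is_sensitive": False},
]
print(topics_with_mixed_labels(example_rows))
```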
---
## Key Fields
- `id` — stable unique identifier
- `query` — user query text
- `topic` — coarse topic category
- `is_sensitive` — sensitivity label used for evaluation
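
Before running an evaluation, it can help to confirm that each row has the fields listed above. The sketch below assumes rows are plain Python dictionaries with native Python values; the type expectations are assumptions, since only `is_sensitive` is described as boolean and the type of `id` is not stated:

```python
from typing import Any

# Field names come from the list above. The Python types are assumptions:
# `is_sensitive` is described as boolean; `query` and `topic` are treated as text.
# `id` is only checked for presence because its type is not stated.
EXPECTED_TYPES = {"query": str, "topic": str, "is_sensitive": bool}
REQUIRED_FIELDS = ("id", "query", "topic", "is_sensitive")


def validate_row(row: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the row looks valid."""
    problems = []
    for name in REQUIRED_FIELDS:
        if name not in row:
            problems.append(f"missing field: {name}")
            continue
        expected = EXPECTED_TYPES.get(name)
        if expected is not None and not isinstance(row[name], expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(row[name]).__name__}"
            )
    return problems
```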
---
## Intended Use
This dataset is intended for:
- evaluating **when disclaimers should or should not appear**
- detecting **over- and under-disclaimering**
- prompt and system-message tuning
- safety-related regression testing
It is **not intended** for:
- training models to give professional advice
- measuring answer correctness
- evaluating refusal behavior
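
Over- and under-disclaimering can be summarized as two error rates. The sketch below is one possible formulation, not a metric defined by the dataset: `Judgment`, `expected`, and `observed` are hypothetical names for per-response annotations that an evaluation harness would have to supply.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    expected: bool  # hypothetical annotation: a disclaimer should appear
    observed: bool  # hypothetical annotation: a disclaimer did appear


def disclaimer_error_rates(judgments):
    """Compute over- and under-disclaimering rates from per-response judgments."""
    n_expected = sum(1 for j in judgments if j.expected)
    n_not_expected = sum(1 for j in judgments if not j.expected)
    # Over-disclaimering: a disclaimer appeared where none was expected.
    over = sum(1 for j in judgments if j.observed and not j.expected)
    # Under-disclaimering: a disclaimer was expected but did not appear.
    under = sum(1 for j in judgments if j.expected and not j.observed)
    return {
        "over_disclaimer_rate": over / n_not_expected if n_not_expected else 0.0,
        "under_disclaimer_rate": under / n_expected if n_expected else 0.0,
    }
```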
---
## Load with pandas
```python
from datasets import load_dataset

# Pin the dataset revision, then convert the "train" split to a pandas DataFrame.
df = load_dataset(
    "Mozilla/sensitive-topic-disclaimer-eval",
    revision="v1.1.0",
)["train"].to_pandas()
```
Provider:
Mozilla



