sarel/sea-credit-synthetic-v1
Source: Hugging Face. Updated: 2026-04-08. Indexed: 2026-04-12.
Download link:
https://hf-mirror.com/datasets/sarel/sea-credit-synthetic-v1
Resource description:
---
tags:
- credit-scoring
- adversarial-detection
- safety
- synthetic
- creditscope
license: apache-2.0
task_categories:
- text-classification
size_categories:
- 100K<n<1M
---
# SEA Credit Synthetic Dataset v1
Synthetic adversarial and benign prompts for credit-domain safety evaluation.
Used to train and evaluate CreditScope's circuit-based safety detection.
## Dataset Summary
- **Total samples**: ~450,000
- **Format**: JSONL (one JSON object per line)
- **Chunks**: 18 files (`train-part-000.jsonl` through `train-part-017.jsonl`)
## Schema
Each record contains:
- `prompt` — the synthetic input text
- `label` — `"benign"` or `"adversarial"`
- `attack_type` — `null` for benign, or one of: `explicit`, `structural`, `stealth`, `hard_stealth`, `data_exfiltration`
- `intent_group` — semantic intent category
- `difficulty` — `easy`, `medium`, or `hard`
- `domain` — always `"credit"`
- `hard_negative` — `true` for benign samples (designed to look adversarial)
- `source` — `"synthetic_llm_v1"`
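
The field rules above can be checked mechanically. The following is a minimal sketch, not part of the dataset tooling: `validate_record` is a hypothetical helper, the sample record is invented for illustration, and the rule that adversarial records must carry one of the five listed attack types is an assumption read off the schema description.

```python
import json

# Allowed values taken from the schema list above
ATTACK_TYPES = {"explicit", "structural", "stealth", "hard_stealth", "data_exfiltration"}
DIFFICULTIES = {"easy", "medium", "hard"}


def validate_record(rec):
    """Return a list of schema problems; an empty list means the record looks valid."""
    problems = []
    if not isinstance(rec.get("prompt"), str):
        problems.append("prompt must be a string")
    label = rec.get("label")
    if label not in {"benign", "adversarial"}:
        problems.append("label must be 'benign' or 'adversarial'")
    attack = rec.get("attack_type")
    if label == "benign" and attack is not None:
        problems.append("benign records must have attack_type null")
    if label == "adversarial" and attack not in ATTACK_TYPES:
        problems.append("adversarial records need a known attack_type")
    if rec.get("difficulty") not in DIFFICULTIES:
        problems.append("difficulty must be easy, medium, or hard")
    if rec.get("domain") != "credit":
        problems.append("domain must be 'credit'")
    return problems


# Hypothetical record, invented for illustration only (not taken from the dataset)
sample_line = (
    '{"prompt": "What factors affect my credit score?", "label": "benign", '
    '"attack_type": null, "intent_group": "general_info", "difficulty": "easy", '
    '"domain": "credit", "hard_negative": true, "source": "synthetic_llm_v1"}'
)
sample_problems = validate_record(json.loads(sample_line))
```

An empty `sample_problems` list indicates that the record satisfies every rule encoded here; `intent_group` is not checked because its allowed values are not listed in this card.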
## Category Distribution (per 25k chunk)
| Category | Count | Label |
|----------|-------|-------|
| explicit | 3,750 | adversarial |
| structural | 5,000 | adversarial |
| stealth | 6,250 | adversarial |
| hard_stealth | 2,500 | adversarial |
| benign | 5,000 | benign |
| data_exfiltration | 2,500 | adversarial |
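
As a quick consistency check, the per-chunk counts in the table can be summed and scaled to the full dataset. This sketch assumes every one of the 18 chunks follows the same distribution, which the table heading suggests but the card does not state outright.

```python
# Per-chunk counts copied from the table above
PER_CHUNK = {
    "explicit": 3750,
    "structural": 5000,
    "stealth": 6250,
    "hard_stealth": 2500,
    "benign": 5000,
    "data_exfiltration": 2500,
}
NUM_CHUNKS = 18  # train-part-000 through train-part-017

chunk_size = sum(PER_CHUNK.values())                  # 25000
total_samples = chunk_size * NUM_CHUNKS               # 450000
adversarial_per_chunk = chunk_size - PER_CHUNK["benign"]  # 20000
benign_share = PER_CHUNK["benign"] / chunk_size       # 0.2
```

Under that assumption the dataset is 80% adversarial and 20% benign, which is worth keeping in mind when choosing evaluation metrics or resampling for training.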
## Usage
```python
from datasets import load_dataset

# Load the dataset from the Hugging Face Hub by its repository ID
ds = load_dataset("sarel/sea-credit-synthetic-v1")
```
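
For inspecting a single downloaded chunk without the `datasets` library, the JSONL files can also be read with the standard library. The sketch below is illustrative: `iter_jsonl` and `tally_attack_types` are hypothetical helper names, and the two in-memory records stand in for a real chunk.

```python
import io
import json
from collections import Counter


def iter_jsonl(fp):
    """Yield one dict per non-empty line of a JSONL file object."""
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)


def tally_attack_types(records):
    """Count records per attack_type, mapping null (benign) to 'benign'."""
    return Counter(rec["attack_type"] or "benign" for rec in records)


# Hypothetical two-record stand-in for a real chunk file
demo = io.StringIO(
    '{"prompt": "p1", "label": "benign", "attack_type": null}\n'
    '{"prompt": "p2", "label": "adversarial", "attack_type": "stealth"}\n'
)
counts = tally_attack_types(iter_jsonl(demo))
```

With a real chunk, the same functions can be applied to a file opened with `open("train-part-000.jsonl", encoding="utf-8")`, and the resulting counts compared against the category distribution table.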
Provider:
sarel



