Etherlabs/ios-risk-finetune-v1
Source: Hugging Face | Updated 2026-04-04 | Indexed 2026-04-12
Download link: https://hf-mirror.com/datasets/Etherlabs/ios-risk-finetune-v1
Resource description:
---
language:
- en
license: apache-2.0
task_categories:
- text-classification
tags:
- finance
- fraud-detection
- risk
- instruction-tuning
- alpaca
size_categories:
- 100K<n<1M
---
# IOS Risk Fine-Tune Dataset v1
Instruction-tuning dataset for financial risk and fraud detection.
Built by the IOS Risk Data Foundry pipeline.
## Sources
| Source | Records | Description |
|--------|---------|-------------|
| Credit Card Fraud (real) | ~284k | Kaggle creditcard.csv with engineered features |
| Synthetic Transactions | ~10k | Statistically sampled from real distributions |
| SEC EDGAR 10-K Filings | varies | Regulatory risk language from public filings |
## Format
Records use the Alpaca instruction format, with three fields per record:
```json
{
  "instruction": "Classify this financial transaction as FRAUD or LEGITIMATE based on the features provided.",
  "input": "Amount: $149.62 | Hour: 0 | OffHours: 1 | ...",
  "output": "LEGITIMATE"
}
```
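For fine-tuning, the three fields are usually joined into one prompt string. The sketch below uses the with-input prompt template from the original Stanford Alpaca project. Whether this dataset's authors trained with that exact template is not stated, so treat the template choice and the helper name `build_example` as assumptions.

```python
# Prompt template from the Stanford Alpaca project (variant for records with an input).
# Assumption: the dataset card does not say which template to use.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def build_example(record: dict) -> dict:
    """Split one record into the prompt text and the target completion."""
    prompt = ALPACA_TEMPLATE.format(
        instruction=record["instruction"], input=record["input"]
    )
    return {"prompt": prompt, "completion": record["output"]}


# The sample record shown above, with its elided input kept as-is.
record = {
    "instruction": "Classify this financial transaction as FRAUD or "
                   "LEGITIMATE based on the features provided.",
    "input": "Amount: $149.62 | Hour: 0 | OffHours: 1 | ...",
    "output": "LEGITIMATE",
}

example = build_example(record)
print(example["prompt"] + example["completion"])
```

Keeping the prompt and completion separate makes it straightforward to mask the prompt tokens out of the loss if the training setup supports that.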
## Usage
```python
from datasets import load_dataset
ds = load_dataset("Etherlabs/ios-risk-finetune-v1")
```
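Once loaded, each record's `input` field is a single string. A minimal sketch for splitting it back into named values follows, assuming the pipe-delimited `Key: value` layout of the sample record holds for every row. The full list of keys is not documented here, and `parse_input` is a hypothetical helper, not part of the dataset.

```python
def parse_input(text: str) -> dict:
    """Parse 'Key: value | Key: value' into a dict of raw strings."""
    features = {}
    for part in text.split("|"):
        key, sep, value = part.partition(":")
        if sep:  # skip fragments that are not 'Key: value', such as '...'
            features[key.strip()] = value.strip()
    return features


# Only the three keys visible in the sample record are used here.
sample = "Amount: $149.62 | Hour: 0 | OffHours: 1"
parsed = parse_input(sample)
amount = float(parsed["Amount"].lstrip("$"))
print(parsed, amount)
```

Values are left as strings because their types are not specified; convert them per key as needed, as done for `Amount` above.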
## Features Engineered
- `amount_zscore`: z-score of the transaction amount relative to the dataset mean
- `is_round_amount`: flag for round-number amounts (structuring signal)
- `is_micro_txn`: flag for micro transactions (card-testing signal)
- `is_large_txn`: flag for amounts above the 95th percentile
- `txn_count_1h` / `txn_count_24h`: velocity features (transaction counts over the preceding 1 hour and 24 hours)
- `hour_of_day` / `is_off_hours`: time-based fraud signals
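The features listed above can be sketched in plain Python as follows. This is an illustration, not the IOS Risk Data Foundry pipeline's code. It assumes the Kaggle `Time` column (seconds since the first transaction) starts at midnight, and the thresholds `ROUND_STEP`, `MICRO_LIMIT`, and `OFF_HOURS` are hypothetical values chosen for the example. Because `creditcard.csv` has no card identifier, the velocity counts here run over all transactions.

```python
from bisect import bisect_left
from statistics import mean, pstdev, quantiles

# Hypothetical thresholds; the dataset card does not state the real ones.
ROUND_STEP = 100.0         # amounts that are multiples of $100 count as round
MICRO_LIMIT = 1.0          # amounts under $1 count as micro transactions
OFF_HOURS = range(0, 6)    # 00:00 to 05:59 counts as off-hours


def engineer_features(times, amounts):
    """times: seconds since the first transaction, sorted ascending.
    amounts: transaction amounts in dollars, aligned with times."""
    mu = mean(amounts)
    sigma = pstdev(amounts) or 1.0  # guard against constant amounts
    # Last of the 19 cut points for n=20 is the 95th percentile.
    p95 = quantiles(amounts, n=20, method="inclusive")[-1]
    rows = []
    for i, (t, amt) in enumerate(zip(times, amounts)):
        hour = int(t // 3600) % 24  # assumes t=0 is midnight
        rows.append({
            "amount_zscore": (amt - mu) / sigma,
            "is_round_amount": int(amt > 0 and amt % ROUND_STEP == 0),
            "is_micro_txn": int(amt < MICRO_LIMIT),
            "is_large_txn": int(amt > p95),
            "hour_of_day": hour,
            "is_off_hours": int(hour in OFF_HOURS),
            # Earlier transactions inside the window, excluding the current one.
            "txn_count_1h": i - bisect_left(times, t - 3600),
            "txn_count_24h": i - bisect_left(times, t - 86400),
        })
    return rows


# Toy data for illustration only; not drawn from the dataset.
times = [0, 1800, 3000, 7200, 90000]
amounts = [0.5, 20.0, 100.0, 35.5, 5000.0]
rows = engineer_features(times, amounts)
for row in rows:
    print(row)
```

The binary search over the sorted `times` list keeps each velocity lookup logarithmic, which matters at the ~284k-record scale of the credit card source.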
Provider:
Etherlabs



