
Etherlabs/ios-risk-finetune-v1

Hugging Face | Updated 2026-04-04 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/Etherlabs/ios-risk-finetune-v1
Resource description:
---
language:
- en
license: apache-2.0
task_categories:
- text-classification
tags:
- finance
- fraud-detection
- risk
- instruction-tuning
- alpaca
size_categories:
- 100K<n<1M
---

# IOS Risk Fine-Tune Dataset v1

Instruction-tuning dataset for financial risk and fraud detection. Built by the IOS Risk Data Foundry pipeline.

## Sources

| Source | Records | Description |
|--------|---------|-------------|
| Credit Card Fraud (real) | ~284k | Kaggle creditcard.csv with engineered features |
| Synthetic Transactions | ~10k | Statistically sampled from real distributions |
| SEC EDGAR 10-K Filings | varies | Regulatory risk language from public filings |

## Format

Alpaca instruction format, with three fields per record:

```json
{
  "instruction": "Classify this financial transaction as FRAUD or LEGITIMATE based on the features provided.",
  "input": "Amount: $149.62 | Hour: 0 | OffHours: 1 | ...",
  "output": "LEGITIMATE"
}
```

## Usage

```python
from datasets import load_dataset

ds = load_dataset("Etherlabs/ios-risk-finetune-v1")
```

## Features Engineered

- `amount_zscore`: how unusual the amount is vs dataset mean
- `is_round_amount`: round number flag (structuring signal)
- `is_micro_txn`: micro transaction flag (card testing signal)
- `is_large_txn`: above 95th percentile flag
- `txn_count_1h` / `txn_count_24h`: velocity features
- `hour_of_day` / `is_off_hours`: time-based fraud signals
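
The engineered features listed above might be derived roughly as in the sketch below. This is not the IOS Risk Data Foundry pipeline itself, which the dataset card does not show. The thresholds (`MICRO_TXN_MAX`, `ROUND_STEP`, `OFF_HOURS`) and the helper names `engineer_features` and `to_alpaca` are hypothetical, chosen only for illustration. The velocity features (`txn_count_1h` / `txn_count_24h`) are left out to keep the sketch short, and only the three input fields visible in the example record are rendered.

```python
from statistics import mean, pstdev, quantiles

# Hypothetical thresholds: the dataset card does not state the real ones.
MICRO_TXN_MAX = 1.00      # assumed cutoff for a "micro" transaction
ROUND_STEP = 100          # assumed: multiples of $100 count as "round"
OFF_HOURS = range(0, 6)   # assumed: 00:00-05:59 counts as off-hours

INSTRUCTION = (
    "Classify this financial transaction as FRAUD or LEGITIMATE "
    "based on the features provided."
)


def engineer_features(amounts, hours):
    """Return one feature dict per transaction (illustrative only)."""
    mu = mean(amounts)
    sigma = pstdev(amounts) or 1.0  # fall back to 1.0 on constant data
    # 95th percentile: the last of the 19 cut points for 20 equal bins.
    p95 = quantiles(amounts, n=20, method="inclusive")[-1]
    rows = []
    for amount, hour in zip(amounts, hours):
        rows.append({
            "amount_zscore": (amount - mu) / sigma,
            "is_round_amount": int(amount > 0 and amount % ROUND_STEP == 0),
            "is_micro_txn": int(amount <= MICRO_TXN_MAX),
            "is_large_txn": int(amount > p95),
            "hour_of_day": hour,
            "is_off_hours": int(hour in OFF_HOURS),
        })
    return rows


def to_alpaca(amount, feats, label):
    """Render one transaction as an Alpaca-style record.

    Only the fields visible in the card's example are emitted; the
    remaining fields are elided in the source and not guessed here.
    """
    text = (
        f"Amount: ${amount:.2f} | Hour: {feats['hour_of_day']} "
        f"| OffHours: {feats['is_off_hours']}"
    )
    return {"instruction": INSTRUCTION, "input": text, "output": label}


if __name__ == "__main__":
    amounts = [149.62, 0.50, 500.00, 20.00]
    hours = [0, 13, 3, 22]
    feats = engineer_features(amounts, hours)
    print(to_alpaca(amounts[0], feats[0], "LEGITIMATE"))
```

Computing the mean, standard deviation, and 95th percentile once over the whole dataset keeps the flags comparable across records; in a real training setup these statistics would normally come from the training split only.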
Provider:
Etherlabs