Webopen2026/enterprise-financial-crime-ai-dataset
Source: Hugging Face. Updated 2026-03-15. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Webopen2026/enterprise-financial-crime-ai-dataset
Description:
---
license: apache-2.0
task_categories:
- text-classification
language:
- en
- fr
- es
tags:
- finance
size_categories:
- 100K<n<1M
---
Transactions → Risk Analysis → Alerts → Investigation → SAR Reports
## Dataset Statistics
- Total records: 310,396
- Dataset size: 339 MB
- Auto-converted parquet size: 65 MB
Languages:
- English
- French
- Spanish
Main fields:
- email_id
- thread_id
- timestamp
- language
- bank
- department
- country
- risk_level
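The `thread_id` and `timestamp` fields listed above suggest that emails can be reassembled into conversations. The following is a minimal sketch of that idea, not the publisher's tooling: the rows are made-up toy values, `group_threads` is a hypothetical helper, and the timestamp format is assumed to match the one shown in the card's example record (`YYYY-MM-DD HH:MM:SS`).

```python
from collections import defaultdict
from datetime import datetime

# Toy rows using field names from the card; all values are invented for illustration.
rows = [
    {"email_id": "E2", "thread_id": "T1", "timestamp": "2021-01-04 09:15:00"},
    {"email_id": "E1", "thread_id": "T1", "timestamp": "2021-01-04 08:03:25"},
    {"email_id": "E3", "thread_id": "T2", "timestamp": "2021-01-05 10:00:00"},
]


def group_threads(records):
    """Group records by thread_id and order each thread chronologically."""
    threads = defaultdict(list)
    for rec in records:
        threads[rec["thread_id"]].append(rec)
    for msgs in threads.values():
        # Assumed timestamp format: YYYY-MM-DD HH:MM:SS
        msgs.sort(key=lambda r: datetime.strptime(r["timestamp"], "%Y-%m-%d %H:%M:%S"))
    return dict(threads)


threads = group_threads(rows)
```

Grouping first and sorting inside each thread keeps the whole conversation available as one unit, which may be useful when a model needs the preceding messages as context.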
# Enterprise Financial Crime AI Dataset
The Enterprise Financial Crime AI Dataset is modeled on real-world enterprise operational patterns and data structures: corporate communication flows, transaction behaviors, financial monitoring workflows, and compliance investigations.
These structures were anonymized, cleaned, and then expanded with AI-assisted data generation. The result is a large-scale dataset for training machine learning and financial crime detection models that preserves realistic behavioral patterns.
## Use Cases
This dataset can be used for:
- Training AML detection models
- Fraud pattern recognition
- Graph-based financial crime detection
- Compliance monitoring systems
- Suspicious transaction classification
- Entity network analysis
## Dataset Overview
This dataset simulates enterprise financial monitoring environments including:
- Financial transactions
- Corporate email communications
- Suspicious Activity Reports (SAR)
- Entity networks and graph relationships
- AML alerts
- Investigation cases
All records are derived from real-world operational patterns and enterprise data structures.
The data has been anonymized, cleaned, and enhanced using artificial intelligence techniques to create a scalable dataset suitable for AI research and model training.
## Dataset Features
- Realistic enterprise financial monitoring scenarios
- Multi-source data structures (transactions, emails, alerts)
- Graph relationships between entities
- AML investigation case structures
- AI-enhanced data expansion
- Fully anonymized and safe for research and training
## Dataset Structure
- `full_dataset/`: Complete enterprise dataset.
- `sample_pack/`: Reduced dataset for testing and experimentation.
- `teaser/`: Small preview dataset for quick evaluation.
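One possible way to load a single directory is the `data_dir` argument of `load_dataset` from the Hugging Face `datasets` library. This is a sketch under assumptions: the card does not describe the file format inside each directory, so automatic format detection may or may not work, and `check_subset` and `load_subset` are hypothetical helper names.

```python
REPO_ID = "Webopen2026/enterprise-financial-crime-ai-dataset"
SUBSETS = ("teaser", "sample_pack", "full_dataset")  # directory names from the card


def check_subset(name):
    """Return the subset name if the card lists it, else raise ValueError."""
    if name not in SUBSETS:
        raise ValueError(f"unknown subset {name!r}; expected one of {SUBSETS}")
    return name


def load_subset(name="teaser"):
    """Try to load one directory of the repository with the `datasets` library.

    Assumption: the files in the directory are in a format that
    `load_dataset` can detect on its own; the card does not say.
    """
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, data_dir=check_subset(name))
```

Starting with `load_subset("teaser")` keeps the first download small before moving on to `sample_pack` or `full_dataset`.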
## Commercial Use
This dataset is suitable for enterprise AI development, financial crime detection systems, AML research, and compliance technology development.
## Example Record
- email_id: EML00008109
- thread_id: THR0005154
- timestamp: 2021-01-04 08:03:25
- language: fr
- bank: NorthBridge Bank
- department: AML
- country: LU
- risk_level: medium
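The record above can be given a typed representation before further processing. The sketch below is illustrative only: `EmailRecord` and `from_raw` are hypothetical names, all fields are assumed to arrive as strings, and keys beyond the eight shown here are ignored because the card's field list may be incomplete.

```python
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass(frozen=True)
class EmailRecord:
    email_id: str
    thread_id: str
    timestamp: datetime
    language: str
    bank: str
    department: str
    country: str
    risk_level: str

    @classmethod
    def from_raw(cls, raw):
        """Build a record from a dict of strings, parsing the timestamp."""
        known = {f.name for f in fields(cls)}
        # Keep only the fields documented in the card; drop anything else.
        data = {k: v for k, v in raw.items() if k in known}
        data["timestamp"] = datetime.strptime(data["timestamp"], "%Y-%m-%d %H:%M:%S")
        return cls(**data)


example = EmailRecord.from_raw({
    "email_id": "EML00008109",
    "thread_id": "THR0005154",
    "timestamp": "2021-01-04 08:03:25",
    "language": "fr",
    "bank": "NorthBridge Bank",
    "department": "AML",
    "country": "LU",
    "risk_level": "medium",
})
```

Parsing the timestamp once at load time means later steps, such as sorting a thread or windowing by date, can compare `datetime` objects directly.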
## Benchmark Tasks
This dataset can be used to train models for:
- Financial risk classification
- AML investigation support
- Fraud detection
- Suspicious communication detection
- Compliance monitoring
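For the classification tasks listed above, it can help to check label balance before training. The sketch below uses toy records: only the `medium` value of `risk_level` appears in this card, so `low` and `high` are assumed labels, and `label_shares` is a hypothetical helper.

```python
from collections import Counter

# Toy records; "low" and "high" are assumed labels, only "medium" is shown in the card.
records = [
    {"language": "fr", "risk_level": "medium"},
    {"language": "en", "risk_level": "high"},
    {"language": "en", "risk_level": "medium"},
    {"language": "es", "risk_level": "low"},
]


def label_shares(rows, key="risk_level"):
    """Return each label's share of the rows as a fraction between 0 and 1."""
    counts = Counter(row[key] for row in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


shares = label_shares(records)
# The largest share is the accuracy a constant "always predict the majority" baseline reaches.
majority_label, majority_share = max(shares.items(), key=lambda kv: kv[1])
```

Passing `key="language"` gives the same breakdown per language, which is one way to see whether English, French, and Spanish are evenly represented in a given split.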
## Use Cases
This dataset can be used for:
- Anti-Money Laundering (AML) detection models
- Fraud detection systems
- Financial crime investigation AI
- Compliance monitoring tools
- LLM training for financial intelligence
## Notes
All data is synthetic and does not contain real financial information.
## Commercial Use
Commercial licensing and extended datasets may be available via WebOpen.
## Enterprise Access
For enterprise licensing or extended versions of this dataset:
srobert@webopen.io
Provider:
Webopen2026



