
Webopen2026/enterprise-financial-crime-ai-dataset

Source: Hugging Face. Updated 2026-03-15. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Webopen2026/enterprise-financial-crime-ai-dataset
Resource description:
---
license: apache-2.0
task_categories:
- text-classification
language:
- en
- fr
- es
tags:
- finance
size_categories:
- 100K<n<1M
---

Transactions → Risk Analysis → Alerts → Investigation → SAR Reports

## Dataset Statistics

- Total records: 310,396
- Dataset size: 339 MB
- Auto-converted parquet size: 65 MB

Languages:
- English
- French
- Spanish

Main fields:
- email_id
- thread_id
- timestamp
- language
- bank
- department
- country
- risk_level

# Enterprise Financial Crime AI Dataset

The Enterprise Financial Crime AI Dataset is a high-fidelity dataset built from real-world operational patterns and enterprise data structures, which were subsequently anonymized, cleaned, and expanded using advanced AI-driven data generation techniques. Real corporate communication flows, transaction behaviors, and compliance monitoring scenarios were used as the foundational structure. These patterns were then synthetically enriched using artificial intelligence to create a large-scale dataset optimized for training modern machine learning and financial crime detection models.

This dataset is derived from real-world enterprise operational patterns including financial monitoring workflows, compliance investigations, and corporate communications. Original structures were anonymized and expanded using AI-assisted data generation techniques to create a scalable dataset while preserving realistic behavioral patterns.
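A record can be sanity-checked against the main fields and languages listed in the card before it is used for training. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `check_record` is hypothetical, the real files may contain more columns than the eight listed, and the check assumes the `language` column uses the two-letter codes from the card metadata.

```python
from typing import Mapping

# Field names copied from the "Main fields" list in the card.
# Assumption: real files may carry additional columns beyond these.
MAIN_FIELDS = (
    "email_id",
    "thread_id",
    "timestamp",
    "language",
    "bank",
    "department",
    "country",
    "risk_level",
)

# Language codes declared in the card metadata (en, fr, es).
CARD_LANGUAGES = {"en", "fr", "es"}


def check_record(record: Mapping[str, object]) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    # Report every listed field that is absent from the record.
    problems = [f"missing field: {name}" for name in MAIN_FIELDS if name not in record]
    # Flag a language code that the card does not declare.
    language = record.get("language")
    if language is not None and language not in CARD_LANGUAGES:
        problems.append(f"unexpected language: {language!r}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to summarize data quality over a whole file.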
## Use Cases

This dataset can be used for:
- Training AML detection models
- Fraud pattern recognition
- Graph-based financial crime detection
- Compliance monitoring systems
- Suspicious transaction classification
- Entity network analysis

## Dataset Overview

This dataset simulates enterprise financial monitoring environments including:
- Financial transactions
- Corporate email communications
- Suspicious Activity Reports (SAR)
- Entity networks and graph relationships
- AML alerts
- Investigation cases

All records are derived from real-world operational patterns and enterprise data structures. The data has been anonymized, cleaned, and enhanced using artificial intelligence techniques to create a scalable dataset suitable for AI research and model training.

## Dataset Features

- Realistic enterprise financial monitoring scenarios
- Multi-source data structures (transactions, emails, alerts)
- Graph relationships between entities
- AML investigation case structures
- AI-enhanced data expansion
- Fully anonymized and safe for research and training

## Dataset Structure

- full_dataset/: Complete enterprise dataset.
- sample_pack/: Reduced dataset for testing and experimentation.
- teaser/: Small preview dataset for quick evaluation.

## Commercial Use

This dataset is suitable for enterprise AI development, financial crime detection systems, AML research, and compliance technology development.
## Example Record

- email_id: EML00008109
- thread_id: THR0005154
- timestamp: 2021-01-04 08:03:25
- language: fr
- bank: NorthBridge Bank
- department: AML
- country: LU
- risk_level: medium

## Benchmark Tasks

This dataset can be used to train models for:
- Financial risk classification
- AML investigation support
- Fraud detection
- Suspicious communication detection
- Compliance monitoring

## Use Cases

This dataset can be used for:
- Anti-Money Laundering (AML) detection models
- Fraud detection systems
- Financial crime investigation AI
- Compliance monitoring tools
- LLM training for financial intelligence

## Notes

All data is synthetic and does not contain real financial information.

## Commercial Use

Commercial licensing and extended datasets may be available via WebOpen.

## Enterprise Access

For enterprise licensing or extended versions of this dataset: srobert@webopen.io
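As a small illustration of the Example Record section, the sketch below parses that record from `key: value` text into a dictionary and converts its timestamp. It makes two assumptions that the card does not confirm: that every timestamp follows the `YYYY-MM-DD HH:MM:SS` layout seen in the single example, and that a record can be written one field per line. The helper name `parse_record` is hypothetical.

```python
from datetime import datetime

# The example record from the card, written one "key: value" pair per line.
EXAMPLE_RECORD = """\
email_id: EML00008109
thread_id: THR0005154
timestamp: 2021-01-04 08:03:25
language: fr
bank: NorthBridge Bank
department: AML
country: LU
risk_level: medium
"""

# Assumption: all timestamps share the layout of the one example shown.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_record(text: str) -> dict[str, str]:
    """Turn 'key: value' lines into a dictionary of strings."""
    record = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so the colons inside the
        # timestamp value are preserved.
        key, _, value = line.partition(":")
        record[key.strip()] = value.strip()
    return record


record = parse_record(EXAMPLE_RECORD)
sent_at = datetime.strptime(record["timestamp"], TIMESTAMP_FORMAT)
print(record["risk_level"], sent_at.isoformat())
# prints: medium 2021-01-04T08:03:25
```

Only the timestamp is converted, because the card does not document the full set of values for the other fields, so they are left as plain strings.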
Provider:
Webopen2026