GSD (Generate Synthetic Data) - Fraud
Source: Snowflake · Updated: 2025-03-17 · Indexed: 2025-04-09
Download link:
https://app.snowflake.com/marketplace/listing/GZTSZ3VI09V
Description:
GSD - Fraud is a synthetic data generation application designed for data scientists, ML engineers, and risk analysts who need fast, reliable, and private datasets for training fraud detection models. Built to be fully compliant with GDPR, GSD - Fraud runs entirely within your Snowflake environment—ensuring that no data leaves your infrastructure and nothing is collected or stored externally.
With an intuitive Streamlit-based UI, you can configure risk levels using simple sliders that set the percentage of low-, medium-, and high-risk merchant category codes (MCCs). Generate datasets in **under 30 seconds for 200,000 transactions** or **under a minute for 1,000,000 transactions**. All data is immediately available in Snowflake tables, so you can move straight to analytics and model building.
**Key Features:**
- **Blazing Fast:** Generate 200k transactions in under 30 seconds and 1M transactions in under a minute.
- **Complete Privacy:** Runs fully within your Snowflake environment—no data is collected, processed, or stored externally.
- **Realistic Datasets:** Compatible with industry-standard raw data file (RDF) specifications, making the output suitable for training fraud detection models.
- **Easy Risk Configuration:** Adjust risk levels with sliders for the distribution of low-, medium-, and high-risk MCCs.
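To make the slider idea concrete, here is a minimal sketch of how three percentages could drive the mix of risk tiers in a generated dataset. It is an illustration only, not the application's implementation: the function name `sample_risk_tiers`, the tier labels, and the requirement that the three percentages sum to 100 are all assumptions.

```python
import random
from collections import Counter


def sample_risk_tiers(n, low_pct, medium_pct, high_pct, seed=None):
    """Draw n risk-tier labels weighted by the three slider percentages.

    Hypothetical helper: the listing does not describe how GSD - Fraud
    turns slider values into MCC assignments.
    """
    # Assumption: the three sliders are expected to sum to 100.
    if abs(low_pct + medium_pct + high_pct - 100) > 1e-9:
        raise ValueError("percentages must sum to 100")
    rng = random.Random(seed)  # local RNG so results are reproducible
    tiers = ["low", "medium", "high"]
    return rng.choices(tiers, weights=[low_pct, medium_pct, high_pct], k=n)


# Example: 70% low-risk, 20% medium-risk, 10% high-risk MCCs.
tiers = sample_risk_tiers(10_000, low_pct=70, medium_pct=20, high_pct=10, seed=42)
counts = Counter(tiers)
for tier in ("low", "medium", "high"):
    print(f"{tier}: {counts[tier] / len(tiers):.1%}")
```

With a large enough sample, the observed shares land close to the configured percentages, which is the behaviour one would expect from a slider-driven distribution.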
**Dataset Breakdown:**
- **200k Generation:**
  - 40,000 customers in `customer_master`.
  - 1 to 3 cards per customer in `card_account`.
  - 200,000 authorized transactions in `authorized_transactions`.
  - 200,000 posted transactions in `posted_transactions`.
- **1M Generation:**
  - 200,000 customers in `customer_master`.
  - 1 to 3 cards per customer in `card_account`.
  - 1,000,000 authorized transactions in `authorized_transactions`.
  - 1,000,000 posted transactions in `posted_transactions`.
- **5M and 10M generations** follow the same pattern.
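Both stated tiers work out to one customer per five transactions (40,000 for 200k, 200,000 for 1M). The sketch below uses that ratio to estimate table sizes for any tier. The helper `expected_table_sizes` is hypothetical, and applying the same ratio to the 5M and 10M tiers is an assumption based on "follow the same pattern"; the listing does not give those counts explicitly.

```python
def expected_table_sizes(transactions, tx_per_customer=5, min_cards=1, max_cards=3):
    """Estimate row counts per table for a given transaction tier.

    Assumption: the 5:1 transaction-to-customer ratio seen in the 200k
    and 1M tiers also holds for larger tiers.
    """
    customers = transactions // tx_per_customer
    return {
        "customer_master": customers,
        # Each customer has between min_cards and max_cards cards,
        # so card_account falls somewhere in this range.
        "card_account_min": customers * min_cards,
        "card_account_max": customers * max_cards,
        "authorized_transactions": transactions,
        "posted_transactions": transactions,
    }


for tier in (200_000, 1_000_000, 5_000_000, 10_000_000):
    sizes = expected_table_sizes(tier)
    print(f"{tier:>10,} transactions -> {sizes['customer_master']:,} customers, "
          f"{sizes['card_account_min']:,} to {sizes['card_account_max']:,} cards")
```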
**Flexible Billing:**
- $250/month: includes one free 200k-transaction dataset per month
- $250 per additional 200k dataset
- $1,000 per additional 1M dataset
- $3,500 per additional 5M dataset
- $6,000 per additional 10M dataset
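As a worked example of the price list, the following sketch totals one month's charges from the number of datasets generated at each size. The function and constant names are hypothetical, and the sketch assumes the free allowance applies to the first 200k dataset of the month. It covers only the listed fees, not Snowflake compute costs or taxes.

```python
BASE_MONTHLY_FEE_USD = 250
FREE_200K_DATASETS_PER_MONTH = 1
ADDITIONAL_DATASET_PRICE_USD = {"200k": 250, "1M": 1_000, "5M": 3_500, "10M": 6_000}


def monthly_cost(datasets):
    """Return the listed monthly charge in USD.

    `datasets` maps a size label ("200k", "1M", "5M", "10M") to the
    number of datasets of that size generated during the month.
    """
    unknown = set(datasets) - set(ADDITIONAL_DATASET_PRICE_USD)
    if unknown:
        raise ValueError(f"unknown dataset sizes: {sorted(unknown)}")

    total = BASE_MONTHLY_FEE_USD
    for size, count in datasets.items():
        billable = count
        if size == "200k":
            # Assumption: the included dataset offsets the first 200k run.
            billable = max(0, count - FREE_200K_DATASETS_PER_MONTH)
        total += billable * ADDITIONAL_DATASET_PRICE_USD[size]
    return total


# Three 200k datasets and one 1M dataset in a month:
# 250 (base) + 2 * 250 (additional 200k) + 1,000 (1M) = 1,750
print(monthly_cost({"200k": 3, "1M": 1}))
```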
**Security and Compliance:**
- **GDPR compliant by default:** No real data required or processed.
- **PCI DSS-Aligned Practices:** While GSD - Fraud does not process real cardholder data, it is designed with **security best practices** that align with **PCI DSS principles**, ensuring that all synthetic data generation happens securely within your Snowflake environment.
- **Fully contained:** Data generation occurs entirely within your Snowflake environment.
**Legal Disclaimer:** GSD - Fraud is not affiliated with or endorsed by Galileo FT.
**Get Started:** Install GSD - Fraud directly from the Snowflake Marketplace, configure your risk settings, and start generating synthetic datasets in under a minute.
Provider:
FinThetic LLC
Created:
2025-03-06



