capitalone/PersonaLedger
Source: Hugging Face | Updated: 2025-12-18 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/capitalone/PersonaLedger
Description:
PersonaLedger is a synthetic financial transaction dataset containing 30 million transaction records. It uses persona-driven large language models (LLMs) to generate behaviorally diverse data, and a programmatic engine to enforce accounting correctness. The dataset is designed for two main financial tasks: 1) Insolvency Prediction (sequence classification): given a user's n-month transaction history, predict whether that user will become illiquid in the near future; 2) Identity Theft Detection (event-level segmentation): identify fraudulent transactions hidden within a user's legitimate transaction history. The dataset is organized by time period (1-month and 3-month) and includes training sets, test sets, and label files.
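The two tasks differ mainly in the shape of their targets: sequence classification yields one label per user history, while event-level segmentation yields one label per transaction. The minimal Python sketch below illustrates that difference on toy data. The `Transaction` fields, the signed-amount convention, and the "balance drops below zero" rule are assumptions made for illustration only; they are not the dataset's actual schema or labeling procedure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transaction:
    # Hypothetical fields; the real dataset schema may differ.
    user_id: str
    amount: float   # assumed convention: positive = inflow, negative = outflow
    is_fraud: bool  # hypothetical event-level label


def sequence_label(history: List[Transaction], opening_balance: float = 0.0) -> bool:
    """Sequence-level target: a single label for the whole history.

    Toy rule (an assumption): True if the running balance ever goes negative.
    """
    balance = opening_balance
    for tx in history:
        balance += tx.amount
        if balance < 0:
            return True
    return False


def event_labels(history: List[Transaction]) -> List[bool]:
    """Event-level target: one label per transaction, aligned with the history."""
    return [tx.is_fraud for tx in history]


# Toy history for one user: salary in, rent out, then one flagged purchase.
history = [
    Transaction("u1", 2000.0, False),
    Transaction("u1", -1500.0, False),
    Transaction("u1", -900.0, True),
]

print(sequence_label(history))  # one boolean for the whole sequence
print(event_labels(history))    # one boolean per transaction
```

A model for the first task would therefore output a single prediction per user window, while a model for the second would output a prediction for each transaction in that window.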
Provider:
capitalone



