DGraph-Fin
Source: Science Data Bank. Updated: 2023-11-10. Indexed: 2026-04-23.
Download link:
https://www.scidb.cn/detail?dataSetId=c0f36252829c48debe2cc95356f3041c
Description:
This dataset comes from an intelligent risk-control setting. It is a desensitized, directed dynamic graph that contains fraudulent users and reflects the social network relationships among credit users. The data is sampled from different time periods of real business activity. Each node is an internet credit user, and a directed edge from node A to node B means that user A listed user B as an emergency contact. The edge type is the category of the emergency contact, and the edge attribute is a timestamp (desensitized to a positive integer starting from 1, in days), so the result is a heterogeneous dynamic graph.

The network contains over 3.7 million nodes in three categories:
(1) Fraudulent users: users whose past lending shows credit overdue and fraudulent behavior. They are labeled as positive samples: 15,509 nodes, or 0.42%.
(2) Normal users: users whose past lending shows no credit overdue or fraudulent behavior. They are labeled as negative samples: 1,210,092 nodes, or 32.7%.
(3) Background nodes: users with no past lending behavior. They are unlabeled and are not fraud detection targets; they only supply connectivity and neighborhood context for the social network: 2,474,949 nodes, or 66.9%.

Each node carries a 17-dimensional desensitized attribute vector whose elements correspond to different pieces of attribute information. Missing values in the vector are filled with -1.
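The class counts and the -1 missing-value convention described above can be checked with a few lines of Python. This is a minimal sketch: the counts are copied from the description, while the names `CLASS_COUNTS`, `class_shares`, `missing_mask`, and `fraud_rate_among_labeled` are hypothetical helpers introduced here for illustration and are not part of the dataset or any loader.

```python
# Node counts per class, copied from the description above.
CLASS_COUNTS = {
    "fraudulent": 15509,
    "normal": 1210092,
    "background": 2474949,
}

# Sentinel that the description says replaces missing attribute values.
MISSING = -1


def class_shares(counts):
    """Return each class's share of all nodes, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def fraud_rate_among_labeled(counts):
    """Share of fraudulent users among labeled (non-background) nodes, in percent."""
    labeled = counts["fraudulent"] + counts["normal"]
    return 100.0 * counts["fraudulent"] / labeled


def missing_mask(features):
    """Return True wherever an attribute value equals the missing-value sentinel."""
    return [[value == MISSING for value in row] for row in features]
```

Because background nodes are unlabeled, the fraud rate that a classifier actually sees is the share among labeled nodes only, which is higher than the 0.42% share of the whole graph but still heavily imbalanced. A mask like the one above is one way to keep the -1 sentinel from being treated as a real attribute value; whether to impute, mask, or keep it as-is is a modeling choice the description does not prescribe.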
For privacy protection, the original data discloses only the category of each attribute:
(1) User ID: the unique identifier of the user's credit account.
(2) Basic personal information: basic identity information such as the user's age and gender.
(3) Communication method: contact-related information such as phone numbers.
(4) Lending behavior: information that describes the user's lending behavior, such as "repayment maturity date" and "actual repayment date".
(5) Emergency contact information: the basic information of the emergency contact supplied when the credit account was registered, such as the contact's name, contact details, and last update date.

The network also contains over 4.3 million edges, divided into 11 categories according to the category of the emergency contact. These categories correspond to the "emergency contact information" node attribute, so for privacy protection the dataset labels them 1 to 11 and does not disclose what each category actually denotes.
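The edge structure described above (a directed edge, an anonymized type from 1 to 11, and a timestamp in days) can be sketched as a small record type with a time-based snapshot. This is an illustrative sketch only: the `Edge` fields, the `snapshot` and `contacts_by_type` helpers, and the toy edges are all assumptions made here, not the dataset's actual file layout or API.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    src: int        # user who registered the emergency contact
    dst: int        # user recorded as the emergency contact
    edge_type: int  # anonymized contact category, 1..11
    day: int        # anonymized timestamp: positive integer, in days


def snapshot(edges, up_to_day):
    """Keep only the edges whose timestamp is on or before `up_to_day`."""
    return [e for e in edges if e.day <= up_to_day]


def contacts_by_type(edges):
    """Map each source user to {edge_type: [emergency contacts]}."""
    table = defaultdict(lambda: defaultdict(list))
    for e in edges:
        if not 1 <= e.edge_type <= 11:
            raise ValueError(f"edge type out of range: {e.edge_type}")
        table[e.src][e.edge_type].append(e.dst)
    return {src: dict(by_type) for src, by_type in table.items()}


# Toy edges invented for illustration; they are not taken from the dataset.
TOY_EDGES = [
    Edge(src=0, dst=1, edge_type=3, day=1),
    Edge(src=0, dst=2, edge_type=7, day=5),
    Edge(src=2, dst=0, edge_type=3, day=9),
]
```

Filtering by `day` before building neighborhoods is one way to respect the dynamic nature of the graph, for example to avoid using edges that appear after the point in time being modeled. Keeping `edge_type` as an opaque integer matches the description, since the meaning of each of the 11 categories is not disclosed.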
Provider:
Lanze Zhang
Created:
2023-07-18



