Financial Corporate Knowledge Graph (金融对公知识图谱)
National Basic Science Data Center (国家基础学科公共科学数据中心), indexed 2024-03-05
Download link:
https://www.nbsdc.cn/general/dataDetail?id=64edc834bb16e07753c3518d&type=1
Resource description:
Source: Enterprise business registration (industrial and commercial) data provided by a licensed enterprise credit reporting company that is a strategic partner.
Generation Method:
1) Develop automated scripts to perform data cleaning on fields such as enterprise names, textual addresses, personal names, numerical values, and timestamps in the dataset;
2) Conduct entity relation extraction on the preprocessed dataset to construct the initial knowledge graph;
3) Compute node-degree statistics on the initial knowledge graph to surface hot-spot data and, from there, identify abnormal data entries;
4) Remove all identified abnormal data entries;
5) Mine the cleaned knowledge graph for further relationships, such as equity relationships and same-address relationships, to supplement it with additional implicit relationships.
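The five steps above can be sketched on toy data as follows. This is a minimal illustration, not the provider's actual pipeline: the record fields, the relation names (`executive_of`, `same_address`), the `max_degree` threshold, and the rule that a high-degree node counts as abnormal are all assumptions made for the example. For brevity, the same-address step here reads the raw records rather than the cleaned graph.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy records; field names are illustrative, not the dataset's schema.
records = [
    {"company": " Acme Ltd. ", "address": "1 Main St", "executive": "Li Wei"},
    {"company": "Beta Co", "address": "1 main st ", "executive": "Li Wei"},
    {"company": "Gamma Inc", "address": "9 Side Rd", "executive": "Agent X"},
    {"company": "Delta LLC", "address": "7 Hill Ave", "executive": "Agent X"},
    {"company": "Eps Corp", "address": "3 Lake Dr", "executive": "Agent X"},
]

def clean(text):
    """Step 1: trim, collapse internal whitespace, and lowercase a text field."""
    return " ".join(text.split()).lower()

def build_edges(rows):
    """Step 2: emit (person, relation, company) triples from cleaned fields."""
    return [(clean(r["executive"]), "executive_of", clean(r["company"])) for r in rows]

def degree_counts(edges):
    """Step 3: count how many edges touch each node."""
    deg = Counter()
    for head, _, tail in edges:
        deg[head] += 1
        deg[tail] += 1
    return deg

def drop_hubs(edges, max_degree):
    """Step 4 (assumed rule): drop edges touching any node above max_degree."""
    deg = degree_counts(edges)
    return [e for e in edges if deg[e[0]] <= max_degree and deg[e[2]] <= max_degree]

def same_address_edges(rows):
    """Step 5: link every pair of companies that share a normalized address."""
    by_addr = defaultdict(set)
    for r in rows:
        by_addr[clean(r["address"])].add(clean(r["company"]))
    out = []
    for companies in by_addr.values():
        for a, b in combinations(sorted(companies), 2):
            out.append((a, "same_address", b))
    return out

edges = build_edges(records)
kept = drop_hubs(edges, max_degree=2)   # "agent x" has degree 3 and is filtered out
implicit = same_address_edges(records)  # one pair shares "1 main st"
```

In this toy run, the threshold removes the three edges attached to the high-degree person node, and address normalization is what allows the two differently formatted "1 Main St" records to match.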
Core Content: Executive and investment relationships between enterprises, plus the relationships derived from them.
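As one example of what a relationship derived from investment links could look like, the sketch below computes an effective shareholding by multiplying stakes along each ownership path and summing over paths. The listing does not say which derived relationships the dataset contains or how they are computed, so the function name `effective_stakes`, the `holdings` structure, and the depth limit are all hypothetical.

```python
def effective_stakes(holdings, owner, max_depth=5):
    """Sum the products of stakes along every acyclic path starting at owner.

    holdings maps an investor to {investee: fraction held}. The result covers
    both direct and indirect stakes. Hypothetical helper for illustration only.
    """
    totals = {}

    def walk(node, fraction, depth, path):
        if depth == max_depth:
            return
        for target, share in holdings.get(node, {}).items():
            if target in path:
                continue  # skip cycles such as cross-holdings
            stake = fraction * share
            totals[target] = totals.get(target, 0.0) + stake
            walk(target, stake, depth + 1, path | {target})

    walk(owner, 1.0, 0, {owner})
    return totals

# A holds 60% of B and 20% of C directly; B holds 50% of C.
holdings = {"A": {"B": 0.6, "C": 0.2}, "B": {"C": 0.5}}
stakes = effective_stakes(holdings, "A")
```

Here A's effective stake in C combines the direct 20% with 60% × 50% = 30% held through B.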
Scale: The dataset contains over 100 million entities and more than 1 billion inter-entity relationships.
Provider:
同盾科技有限公司 (Tongdun Technology Co., Ltd.)
Dataset introduction

Background overview
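This dataset is a financial corporate knowledge graph built from enterprise credit reporting data. Data cleaning and entity relation extraction produce a network of relationships between enterprises, such as executive and investment relationships. It is large, with over 100 million entities and over 1 billion relationships, and has been refined by removing abnormal data and mining implicit relationships.
The summary above was collected and generated by 遇见数据集.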



