DocGraph subsets for StrataRX tutorial
Source: NIAID Data Ecosystem (indexed 2026-03-08)
Download link:
https://figshare.com/articles/dataset/DocGraph_subsets_for_StrataRX_tutorial/818983
Description:
Here are the materials for the workshop that I did with Fred Trotter on using Gephi to analyze the DocGraph data set at the 2013 StrataRX conference:
http://strataconf.com/rx2013/public/schedule/detail/29840
The steps used to produce the GraphML and Gephi files are described below.
The edge data set is based on the V1.0 open source DocGraph data set, available from:
http://notonlydev.com/docgraph-data/
The NPPES or NPI node information is from the file:
npidata_20050523-20130113.csv
The NPPES data was processed with the following script:
https://github.com/jhajagos/DocGraph/blob/master/nppes/npi_schema.sql
The subsets of the larger graph were generated according to the following criteria:
jamestown_core_provider_graph.graphml
-Providers selected with practice addresses in Jamestown, NY
-179 nodes with 5,560 edges
jamestown_core_and_leaf_provider_graph.graphml
-Includes providers above and those who are linked to them
-1,322 nodes with 12,457 edges
albany_core_provider_graph.graphml
-Providers selected with practice addresses in Albany, NY
-1,368 nodes with 44,711 edges
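A quick way to check a downloaded file against the node and edge counts listed above is to count the node and edge elements in the GraphML. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not part of the DocGraph scripts, and it assumes the files follow the usual GraphML layout of node and edge elements.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}node' down to 'node'."""
    return tag.rsplit("}", 1)[-1]


def count_nodes_and_edges(path):
    """Return (node_count, edge_count) for a GraphML file.

    Illustrative helper, not taken from the DocGraph repository.
    """
    root = ET.parse(path).getroot()
    nodes = 0
    edges = 0
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "node":
            nodes += 1
        elif name == "edge":
            edges += 1
    return nodes, edges
```

For example, calling count_nodes_and_edges("jamestown_core_provider_graph.graphml") on an unmodified download would be expected to agree with the 179 nodes and 5,560 edges listed for that file.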
Subsets were produced using this script:
https://github.com/jhajagos/DocGraph/blob/master/extract_providers_to_graphml.py
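The core and core-plus-leaf selection can be illustrated with a small sketch. This is not the logic of extract_providers_to_graphml.py, which should be consulted for the actual method. It assumes that a "core" graph keeps only edges between selected providers, and that the "core and leaf" graph keeps every edge with at least one selected provider as an endpoint. The record keys "npi", "city", and "state" and both function names are hypothetical.

```python
def core_provider_ids(providers, city, state):
    """Select provider IDs by practice city and state.

    `providers` is an iterable of dicts with hypothetical keys
    "npi", "city", and "state".
    """
    return {
        p["npi"]
        for p in providers
        if p["city"].strip().upper() == city.upper()
        and p["state"].strip().upper() == state.upper()
    }


def select_core_and_leaf(edges, core_ids):
    """Split an edge list into core edges and core-plus-leaf edges.

    `edges` is an iterable of (from_npi, to_npi, weight) tuples.
    Assumption: a leaf is any provider outside the core that shares
    an edge with a core provider.
    """
    edges = list(edges)
    core_ids = set(core_ids)
    # Core graph: both endpoints are selected providers.
    core_edges = [e for e in edges if e[0] in core_ids and e[1] in core_ids]
    # Core and leaf graph: at least one endpoint is a selected provider.
    core_leaf_edges = [e for e in edges if e[0] in core_ids or e[1] in core_ids]
    leaf_ids = {npi for e in core_leaf_edges for npi in e[:2]} - core_ids
    return core_edges, core_leaf_edges, leaf_ids
```

Under these assumptions, edges between two leaf providers are dropped, so the core-plus-leaf graph stays centered on the selected city.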
I have added latitude and longitude to these files, obtained by geocoding each practice address with this script:
https://github.com/jhajagos/DocGraph/blob/master/nppes/geocode_nppes_using_arcgis.py
The locators were based on ArcLogistics 2012 release 1 data.
Created:
2013-10-17



