
Data for: Local fuzzy geographically weighted clustering: A new method for geodemographic segmentation.

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link: https://data.mendeley.com/datasets/kd5xprhv65
Description:
This dataset (a compressed RAR file) includes the Matlab code files for the "Local Fuzzy Geographically Weighted Clustering" (LFGWC) algorithm and a shapefile containing socio-demographic data and cancer incidence data across 973 block groups in Manhattan, New York.

The files are:
1. LFGWC.m = The Matlab code of LFGWC (Local Fuzzy Geographically Weighted Clustering)
2. LFGWC_Call.m = The file to run the above code
3. validity.m = The Matlab code for validating the clustering output
4. licence.txt = The file describing the license terms
5. Demo = Dataset folder

The Demo folder contains the following:
1. Data.txt = The non-normalized dataset
2. Population.txt = Population for each polygon
3. Distance.txt = Distance among all objects
4. Centroid.txt = Initial cluster centres
5. Shapefile: Manhattan_Data.shp

The shapefile was originally downloaded from a benchmark dataset of small-area cancer incidence (Boscoe et al. 2016). The benchmark dataset includes 524,503 tumors diagnosed between 2005 and 2009 across 13,823 block groups covering all of New York State (download link: https://www.satscan.org/datasets/nyscancer/index.html). Manhattan_Data.shp includes only the county of Manhattan, not the entire state. The data have undergone slight modifications that are explained in detail in the paper.
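Because Data.txt is described as non-normalized, variables on different scales (percentages, persons per household, rates per 1000) usually need rescaling before clustering. The sketch below shows one common choice, min-max scaling of each column to [0, 1]. It is written in Python for illustration rather than in the dataset's Matlab. The function name `min_max_normalize` and the sample values are hypothetical, and the source does not say which normalization the paper or LFGWC_Call.m actually applies.

```python
def min_max_normalize(rows):
    """Rescale each column of a list-of-rows table to the range [0, 1].

    Assumption: min-max scaling is used here only as an example; the
    normalization applied in the paper is not stated on this page.
    """
    columns = list(zip(*rows))  # transpose rows -> columns
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column has zero span; map it to 0.0 to avoid dividing by zero.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    # Transpose back so each inner list is again one object (row).
    return [list(row) for row in zip(*scaled_columns)]


# Made-up values for illustration only (not taken from Data.txt).
sample = [
    [10.0, 2.0],
    [20.0, 4.0],
    [30.0, 3.0],
]
normalized = min_max_normalize(sample)
print(normalized)
```

Min-max scaling keeps the shape of each variable's distribution but is sensitive to outliers; a z-score transform is a common alternative when a few block groups have extreme values.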
Attributes of Manhattan_Data.shp:
DOHREGION = Geographic identifier
CODE = Unique ID code for joining data
POPULATION = Total population (2010 Census)
White_Pop = % white alone population (2010 Census)
Black_Pop = % black alone population (2010 Census)
Asian_Pop = % Asian alone population (2010 Census)
Other_Pop = % other race population (2010 Census)
Hispanic = % Hispanic population (2010 Census)
HH_Size = Persons per household (2010 Census)
LT_HS = % population less than high school education (25 & over)
Under_Pov = % population under poverty (2006-2010 ACS Data)
BC_Rate = Incidents of breast cancer per 1000 people
PC_Rate = Incidents of prostate cancer per 1000 people
TC_Rate = Total cancer incidents per 1000 people

For more information on the original benchmark dataset, visit: https://www.satscan.org/datasets/nyscancer/index.html
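BC_Rate, PC_Rate, and TC_Rate are expressed per 1000 people. As a reading aid, the arithmetic behind a per-1000 rate can be sketched as follows, in Python for illustration. The function name `rate_per_1000` and the example numbers are hypothetical; which case counts and which population denominator the authors used is explained in the paper, not on this page.

```python
def rate_per_1000(cases, population):
    """Return the number of cases per 1000 people.

    Assumption: a plain crude rate, cases / population * 1000. The exact
    definition used for BC_Rate, PC_Rate and TC_Rate is given in the paper.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * cases / population


# Made-up example: 12 cases in a block group of 1500 people.
example_rate = rate_per_1000(12, 1500)
print(example_rate)  # 8.0
```

Block groups with very small populations can produce unstable rates, so a guard against zero population, as above, is worth keeping when recomputing rates from the shapefile.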

Created: 2020-07-29