Spatial Clusters of Childhood Cancer: Benchmarking Data. As published in Schündeln et al. 2021 Cancer Epidemiology & Data in Brief
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://data.mendeley.com/datasets/3hrg9tpsx9
Description:
The incidence of newly diagnosed childhood cancer (140/1,000,000 children under 15 years) and of nephroblastoma (7/1,000,000) was simulated. Clusters of defined size (1-50) were randomly assembled at the district level in Germany. Each cluster was simulated at ten different relative risk levels (1 to 100). For each combination, 2000 iterations were run. The simulated data were then analysed with three local clustering tests: the Besag-Newell (BN) method, the spatial scan statistic (SSS), and the Bayesian Besag-York-Mollié model with Integrated Nested Laplace Approximation (BYM).
See references for published manuscripts.
RAW DATA:
The simulated raw data are reported in the Rdata files "AllMalignancies.Rdata" and "NephroblastomaSimulation.Rdata". Each file contains 6 lists, one per cluster size ("Cluster Size X"). Each of these lists holds 2000 simulations of clusters at 10 different risk levels ("RR Y Cluster") and the corresponding simulated cases for each scenario ("RR Y SimCases"). In addition, each file contains the population of children under 15 years for each district ("District Population") and the expected cases per district for the respective entity, all cancers or nephroblastoma ("Expected Cases").
The adjacency matrix for the 402 German districts is provided as a separate Rdata file.
The code and the GADM shape files needed to reproduce the original simulation and the published study are available at: https://github.com/Pediatrics/Childhood-Cancer-Study
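The relationship between "District Population" and "Expected Cases" can be illustrated with a minimal Python sketch. It assumes that expected cases are simply the district population multiplied by the stated incidence rate. The district names and population figures are hypothetical, and the function name `expected_cases` is not taken from the published code.

```python
# Incidence rates per 1,000,000 children under 15 years, as stated in the description.
RATE_ALL_CANCER = 140
RATE_NEPHROBLASTOMA = 7


def expected_cases(population, rate_per_million):
    """Return expected cases per district as population * rate / 1,000,000.

    Assumption: a plain rate-times-population calculation; the published
    study may apply further adjustments.
    """
    return {
        district: count * rate_per_million / 1_000_000
        for district, count in population.items()
    }


# Hypothetical populations of children under 15 years (not from the dataset).
example_population = {"District A": 50_000, "District B": 20_000}

all_cancer = expected_cases(example_population, RATE_ALL_CANCER)
nephroblastoma = expected_cases(example_population, RATE_NEPHROBLASTOMA)

for district in example_population:
    print(district, all_cancer[district], nephroblastoma[district])
```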
ANALYZED DATA:
The operating characteristics of each cluster detection method and scenario in this study are reported according to the quality criteria detailed below ("Analyzed Data.xlsx"):
Minimum Power (MP): Proportion of simulations detecting at least one district of the true cluster
Exact Power (EP): Proportion of simulations detecting the true cluster without false positives
Sensitivity (sens): Proportion of correctly detected districts in the true cluster
Specificity (spec): Percentage of normal risk districts, correctly classified as normal risk districts
Positive predictive value (PPV): Proportion of districts in the detected cluster belonging to the true cluster
Negative predictive value (NPV): Proportion of districts not labeled as a risk cluster that are not part of the true cluster
Correct classification (CC): Percentage of correctly classified districts of all districts
Correct proportion (CP): Correctly labeled districts of all detected potential high-risk (HR) districts
Positive diagnostic likelihood (PDL): The probability of high-risk districts being detected divided by the probability of non-high-risk districts being detected
Negative diagnostic likelihood (NDL): The probability of high-risk districts not being detected divided by the probability of non-high-risk districts not being detected
False positive rate (FPR): Incorrectly labeled high-risk districts of all detected high-risk districts
False negative rate (FNR): Incorrectly labeled normal-risk districts of all detected normal-risk districts
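Several of the criteria above follow directly from the overlap between the true cluster and the detected cluster in a single simulation. The following Python sketch shows one possible way to compute them. The function name, the use of district identifiers as sets, and the reading of "exact" detection as "all true districts and no others" are assumptions, not the authors' implementation.

```python
def district_metrics(true_cluster, detected, all_districts):
    """Compute per-simulation criteria from sets of district identifiers."""
    true_cluster = set(true_cluster)
    detected = set(detected)
    all_districts = set(all_districts)

    tp = len(true_cluster & detected)                   # true cluster, detected
    fp = len(detected - true_cluster)                   # normal risk, detected
    fn = len(true_cluster - detected)                   # true cluster, missed
    tn = len(all_districts - true_cluster - detected)   # normal risk, not detected

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sens": ratio(tp, tp + fn),
        "spec": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "CC": ratio(tp + tn, len(all_districts)),
        # Averaging these two flags over all simulations of a scenario
        # would give Minimum Power and Exact Power, respectively.
        "at_least_one": tp > 0,
        "exact": tp > 0 and fp == 0 and fn == 0,
    }


# Hypothetical example: 10 districts, true cluster {0, 1, 2}, detected {1, 2, 3}.
example = district_metrics({0, 1, 2}, {1, 2, 3}, range(10))
for name, value in example.items():
    print(name, value)
```

In this example two of the three true districts are detected and one normal-risk district is flagged, so the simulation would count towards Minimum Power but not towards Exact Power.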
Created:
2021-01-04



