Counts of Dengue reported in CAMBODIA: 1980-2011
Source: DataCite Commons | Updated: 2024-07-09 | Indexed: 2025-04-16
Download link:
https://zenodo.org/records/11451277
Description:
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically national or international health authorities such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source, and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.

Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. the United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.

Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
- Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents), and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
- Separate cumulative from non-cumulative time interval series: Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
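The two recommended processing steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Project Tycho team's own procedure: only the attribute name "PartOfCumulativeCountSeries" comes from the description above. The column names PeriodStartDate and CountValue, the "1"/"0" encoding of the flag, the weekly interval length, and the sample rows are all assumed for the example and should be checked against the actual file.

```python
from datetime import date, timedelta

def split_series(rows):
    """Split rows into (cumulative, fixed) lists using the flag attribute.

    Assumption: the flag is stored as "1" for cumulative series rows.
    """
    cumulative, fixed = [], []
    for row in rows:
        flag = str(row["PartOfCumulativeCountSeries"]).strip()
        (cumulative if flag == "1" else fixed).append(row)
    return cumulative, fixed

def fill_missing_weeks(fixed_rows):
    """Return a gap-free weekly series of (start_date, count) tuples.

    Weeks with no reported count get None, which keeps them distinct
    from weeks where a count of zero was actually reported.
    Assumption: intervals are weekly and keyed by PeriodStartDate.
    """
    by_start = {date.fromisoformat(r["PeriodStartDate"]): r for r in fixed_rows}
    if not by_start:
        return []
    current, last = min(by_start), max(by_start)
    filled = []
    while current <= last:
        row = by_start.get(current)
        filled.append((current, int(row["CountValue"]) if row else None))
        current += timedelta(days=7)
    return filled

# Hypothetical sample rows, invented for illustration only.
rows = [
    {"PeriodStartDate": "2000-01-02", "PartOfCumulativeCountSeries": "0", "CountValue": "5"},
    {"PeriodStartDate": "2000-01-09", "PartOfCumulativeCountSeries": "0", "CountValue": "0"},
    {"PeriodStartDate": "2000-01-23", "PartOfCumulativeCountSeries": "0", "CountValue": "3"},
    {"PeriodStartDate": "2000-01-01", "PartOfCumulativeCountSeries": "1", "CountValue": "8"},
]

cumulative, fixed = split_series(rows)
weekly = fill_missing_weeks(fixed)
```

In this sketch the week starting 2000-01-16 has no source row, so it appears in `weekly` with a count of None, while the week starting 2000-01-09 keeps its reported zero. A real dataset may hold several series (different sources, locations, or age groups), so the rows would typically be grouped by those attributes before gaps are filled.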
Provider:
University of Pittsburgh
Created:
2017-11-02



