Dataset and Hierarchical Clustering Code to Track the Next COVID-19 Epicenter
Source: Mendeley Data. Updated: 2020-06-29. Indexed: 2026-04-09.
Download link:
https://data.mendeley.com/datasets/7tyw5d3ccm/1
Description:
The Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) maintains an online repository (available at https://github.com/CSSEGISandData/COVID-19) with worldwide information on the absolute number of new confirmed, recovered, and death cases related to COVID-19 (Coronavirus Disease 2019), the disease caused by the SARS-CoV-2 coronavirus. From the whole dataset, we focused our analysis on the daily time series summaries, which contain the accumulated numbers of confirmed, death, and recovered cases for each country. Because some countries (e.g., Australia, Canada, and China) were reported at the province/state level, we aggregated their records into a single time series per country. After this aggregation, the dataset comprised 186 countries plus one extra time series containing the cases registered on the Diamond Princess cruise ship. Next, we removed the Diamond Princess time series and all time series on recovered cases, in order to focus on confirmed and death cases. We also reorganized the daily records: instead of using accumulated cases, we calculated the lagged differences between consecutive days, which gives the number of new cases per day. Besides the dataset, we also share the source code we designed to cluster time series from different countries with similar behavior.
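The preprocessing and clustering steps described above can be illustrated with a minimal sketch. This is not the code shared with the dataset: the description does not state which distance measure or linkage criterion was used, so Euclidean distance and average linkage are assumptions here, as are the function names and the toy numbers (placeholder countries "A" to "D", not real counts). The sketch uses only the Python standard library.

```python
import math
from itertools import combinations


def aggregate_by_country(rows):
    """Sum province/state-level cumulative series into one series per country.

    rows: iterable of (country, cumulative_counts) pairs; a country may
    appear several times, once per province/state.
    """
    totals = {}
    for country, series in rows:
        if country in totals:
            totals[country] = [a + b for a, b in zip(totals[country], series)]
        else:
            totals[country] = list(series)
    return totals


def daily_differences(cumulative):
    """Lag-1 differences between consecutive days (new cases per day)."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def euclidean(x, y):
    # Assumed distance measure; the source does not specify one.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def agglomerative(series_by_name, n_clusters):
    """Bottom-up hierarchical clustering with average linkage (assumed).

    Starts with one cluster per series and repeatedly merges the two
    closest clusters until n_clusters remain. Returns a list of sets.
    """
    clusters = [{name} for name in series_by_name]

    def linkage(c1, c2):
        dists = [euclidean(series_by_name[a], series_by_name[b])
                 for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # combinations yields index pairs with i < j, so deleting j is safe.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Toy cumulative counts (hypothetical); "A" is reported as two provinces.
rows = [
    ("A", [1, 2, 4, 7]),
    ("A", [0, 1, 1, 2]),
    ("B", [1, 3, 5, 9]),
    ("C", [10, 30, 70, 150]),
    ("D", [12, 30, 72, 150]),
]
cumulative = aggregate_by_country(rows)
daily = {name: daily_differences(s) for name, s in cumulative.items()}
clusters = agglomerative(daily, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

With real data, the same steps would be applied separately to the confirmed and death series, and series of very different magnitude would likely need some normalization before the distance is computed; whether and how the shared code does this is not stated in the description.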
Created:
2020-06-29



