
KC-OverdoseModels2025/Dataset_Pairs_2021-2025_Census-Rate

Hugging Face · Updated 2026-03-28 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/KC-OverdoseModels2025/Dataset_Pairs_2021-2025_Census-Rate
Resource description:
---
license: mit
size_categories:
- 1K<n<10K
---

# Fentanyl Overdose Forecasting: Lagged Census-to-Overdose Datasets

**Author: Ansh Gupta**

## Overview

Seven paired census-to-overdose datasets for King County, Washington, designed for temporal forecasting of opioid/fentanyl overdose death rates at the census tract level. Each dataset pairs American Community Survey (ACS) socioeconomic features from one year with overdose rates from a subsequent year, creating a lagged prediction framework.

## Dataset Structure

Each file pairs **input** census features from one year with **target** overdose rates from a later year. Taking the target year as *t*, the inputs come from year *t-1* (1-year lag) or year *t-2* (2-year lag).

### t-1 Lag (1-Year Gap)

| File | Census Input | Overdose Target | Tracts |
|---|---|---|---|
| `lagged_2021census_2022overdose_no_cRATE.csv` | 2021 ACS (2017–2021) | 2022 rates | 308 |
| `lagged_2022census_2023overdose.csv` | 2022 ACS (2018–2022) | 2023 rates | 463 |
| `lagged_2023census_2024overdose.csv` | 2023 ACS (2019–2023) | 2024 rates | 463 |
| `lagged_2024census_TTMoverdose.csv` | 2024 ACS (2019–2023) | 2025 TTM rates | 463 |

### t-2 Lag (2-Year Gap)

Located in the `t-2/` subdirectory.

| File | Census Input | Overdose Target | Tracts |
|---|---|---|---|
| `lagged_2021census_2023overdose_no_cRATE.csv` | 2021 ACS (2017–2021) | 2023 rates | 308 |
| `lagged_2022census_2024overdose.csv` | 2022 ACS (2018–2022) | 2024 rates | 463 |
| `lagged_2023census_TTMoverdose.csv` | 2023 ACS (2019–2023) | 2025 TTM rates | 463 |

> **Note**: The 2021 files contain 308 tracts and 57 columns. They lack the `Rate_Independent` column (and the associated neighbor rate columns) because no prior-year overdose rate data was available for the 2021 input year. All other files contain 463 tracts and 60 columns.
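The two file tables above can be captured as a small manifest, which makes it easier to loop over the files and sanity-check their shapes after loading. This is only a sketch: the `LaggedFile` class and the helper names are hypothetical, and it assumes the t-1 files sit at the repository root while the t-2 files sit under `t-2/`.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class LaggedFile:
    """One row of the file tables (hypothetical helper, not part of the dataset)."""
    name: str
    census_year: int
    target: str   # overdose target label as written in the tables
    lag: int      # 1 for the t-1 files, 2 for the t-2 files
    tracts: int


FILES = [
    LaggedFile("lagged_2021census_2022overdose_no_cRATE.csv", 2021, "2022", 1, 308),
    LaggedFile("lagged_2022census_2023overdose.csv", 2022, "2023", 1, 463),
    LaggedFile("lagged_2023census_2024overdose.csv", 2023, "2024", 1, 463),
    LaggedFile("lagged_2024census_TTMoverdose.csv", 2024, "2025 TTM", 1, 463),
    LaggedFile("lagged_2021census_2023overdose_no_cRATE.csv", 2021, "2023", 2, 308),
    LaggedFile("lagged_2022census_2024overdose.csv", 2022, "2024", 2, 463),
    LaggedFile("lagged_2023census_TTMoverdose.csv", 2023, "2025 TTM", 2, 463),
]


def relative_path(f: LaggedFile) -> PurePosixPath:
    """Assumed layout: t-2 files live in the `t-2/` subdirectory, t-1 files at the root."""
    return PurePosixPath("t-2", f.name) if f.lag == 2 else PurePosixPath(f.name)


def expected_columns(f: LaggedFile) -> int:
    """Per the note above: 2021-input files have 57 columns, all others 60."""
    return 57 if f.census_year == 2021 else 60
```

After reading a file, comparing its row and column counts with `f.tracts` and `expected_columns(f)` is a cheap way to catch a wrong path or a truncated download.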
## Variables

### Identifiers

| Column | Description |
|---|---|
| `GIDTR` | Census tract FIPS code |
| `State` / `State_name` | State FIPS code and name |
| `County` / `County_name` | County FIPS code and name |
| `Tract` | Tract number |
| `Num_BGs_in_Tract` | Number of block groups in tract |

### Socioeconomic Features (ACS)

| Column | Description |
|---|---|
| `Med_HHD_Inc_Thousands_ACS__` | Median household income (thousands $) |
| `pct_Prs_Blw_Pov_Lev_ACS__` | % persons below poverty level |
| `pct_Civ_unemp_16p_ACS__` | % civilian unemployed (age 16+) |
| `pct_Not_HS_Grad_ACS__` | % without high school diploma |
| `pct_College_ACS__` | % with college degree |
| `pct_Renter_Occp_HU_ACS__` | % renter-occupied housing units |
| `pct_Vacant_Units_ACS__` | % vacant housing units |

### Demographics

| Column | Description |
|---|---|
| `Tot_Population_ACS__` | Total population |
| `pct_Males_ACS__` | % male |
| `pct_NH_White_alone_ACS__` | % non-Hispanic White |
| `pct_NH_Blk_alone_ACS__` | % non-Hispanic Black |
| `pct_Pop_25_44_ACS__` | % age 25–44 |
| `pct_Pop_Below25_ACS__` | % age below 25 |
| `pct_Pop_45plus_ACS__` | % age 45+ |
| `Pop_density_calculated_ACS` | Population density (per unit land area) |
| `LAND_AREA` | Land area of tract |

### Healthcare Access

| Column | Description |
|---|---|
| `No_Health_Ins_ACS__` | Count of persons without health insurance |
| `Pct_No_Health_Ins_CALCULATED_ACS__` | % without health insurance |
| `dist_general_km` | Distance to nearest general healthcare facility (km) |
| `dist_sud_km` | Distance to nearest substance use disorder (SUD) facility (km) |
| `general_facility_count` | General healthcare facilities in tract |
| `sud_facility_count` | SUD treatment facilities in tract |
| `general_facility_count_5km` | General facilities within 5 km |
| `sud_facility_count_5km` | SUD facilities within 5 km |
| `general_facilities_per_10k` | General facilities per 10,000 population |
| `sud_facilities_per_10k` | SUD facilities per 10,000 population |
| `general_facilities_per_10k_5km` | General facilities per 10k within 5 km |
| `sud_facilities_per_10k_5km` | SUD facilities per 10k within 5 km |
| `accessibility_general_2sfca_*_per10k` | 2-Step Floating Catchment Area accessibility scores (10/20/30 km) |
| `accessibility_sud_2sfca_*_per10k` | SUD-specific 2SFCA accessibility scores (10/20/30 km) |

### Overdose Outcome Variables

| Column | Description |
|---|---|
| `Rate` | **Target variable**: overdose death rate per 100,000 |
| `Count` | Overdose death count (may be suppressed as "1-9") |
| `Rate_M` | Rate margin/reliability indicator |
| `Rate_M_CI` | Rate confidence interval |
| `Intent` | Intent classification (Drug_OD) |
| `Period` | Time period of overdose data |

### Historical Rate Features

| Column | Description |
|---|---|
| `Rate_Independent` | Prior-year overdose rate (not available in 2021 files) |
| `Rate_Independent_Neighbor_1` | Prior-year rate of nearest neighbor tract |
| `Rate_Independent_Neighbor_Avg` | Average prior-year rate of 3 nearest neighbor tracts |

### Spatial Neighbor Features

| Column | Description |
|---|---|
| `Med_HHD_Inc_Thousands_ACS___Neighbor_1` | Median income of nearest neighbor tract |
| `Med_HHD_Inc_Thousands_ACS___Neighbor_Avg` | Average median income of 3 nearest neighbors |
| `pct_Renter_Occp_HU_ACS___Neighbor_1` | Renter rate of nearest neighbor |
| `pct_Renter_Occp_HU_ACS___Neighbor_Avg` | Average renter rate of 3 nearest neighbors |
| `pct_Prs_Blw_Pov_Lev_ACS___Neighbor_1` | Poverty rate of nearest neighbor |
| `pct_Prs_Blw_Pov_Lev_ACS___Neighbor_Avg` | Average poverty rate of 3 nearest neighbors |
| `pct_Vacant_Units_ACS___Neighbor_1` | Vacancy rate of nearest neighbor |
| `pct_Vacant_Units_ACS___Neighbor_Avg` | Average vacancy rate of 3 nearest neighbors |
| `Pct_No_Health_Ins_CALCULATED_ACS___Neighbor_1` | Uninsured rate of nearest neighbor |
| `Pct_No_Health_Ins_CALCULATED_ACS___Neighbor_Avg` | Average uninsured rate of 3 nearest neighbors |
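The `accessibility_*_2sfca_*_per10k` columns are described as 2-Step Floating Catchment Area scores, but the card does not spell out the exact computation. For orientation, here is a generic sketch of the standard 2SFCA method. It assumes a supply of 1 per facility, a hard distance cutoff, and a precomputed tract-to-facility distance matrix; the function name and parameters are hypothetical and may not match how the columns were actually produced.

```python
def two_step_fca(populations, distances_km, catchment_km, supply=None, per=10_000):
    """Generic 2SFCA accessibility score per tract (illustrative assumptions only).

    populations   -- population of each tract
    distances_km  -- distances_km[i][j] is the distance from tract i to facility j
    catchment_km  -- catchment radius (e.g. 10, 20 or 30)
    supply        -- capacity of each facility; assumed to be 1.0 each if omitted
    per           -- scale factor (10,000 gives a "per 10k" score)
    """
    n_tracts = len(populations)
    n_facilities = len(distances_km[0]) if distances_km else 0
    if supply is None:
        supply = [1.0] * n_facilities

    # Step 1: supply-to-demand ratio of each facility over the tracts in its catchment.
    ratios = []
    for j in range(n_facilities):
        demand = sum(
            populations[i]
            for i in range(n_tracts)
            if distances_km[i][j] <= catchment_km
        )
        ratios.append(supply[j] / demand if demand > 0 else 0.0)

    # Step 2: for each tract, sum the ratios of all facilities within reach.
    return [
        per * sum(
            ratios[j]
            for j in range(n_facilities)
            if distances_km[i][j] <= catchment_km
        )
        for i in range(n_tracts)
    ]
```

With two tracts of 1,000 and 3,000 residents and one facility 2 km and 8 km away, a 10 km catchment gives both tracts the same score, while a 5 km catchment leaves the farther tract with a score of zero.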
## Data Sources

- **Census data**: American Community Survey (ACS) 5-year estimates, U.S. Census Bureau
- **Overdose rates**: CDC WONDER, drug overdose death rates per 100,000 population
- **Healthcare facilities**: SAMHSA treatment locator and general healthcare facility databases
- **Accessibility scores**: Computed using the 2-Step Floating Catchment Area (2SFCA) method

## Region

- **Geography**: King County, Washington (includes Seattle metro area)
- **Unit of observation**: Census tract

## Usage

```python
import pandas as pd

# Load a single dataset (e.g., 2024 census → 2025 overdose prediction)
df = pd.read_csv("lagged_2024census_TTMoverdose.csv")

# Features (X) and target (y); extend feature_cols with any other
# columns listed under Variables
feature_cols = ["Med_HHD_Inc_Thousands_ACS__", "pct_Renter_Occp_HU_ACS__"]
X = df[feature_cols]
y = df["Rate"]
```

## License

MIT
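Since most files carry a `Rate_Independent` column with the prior-year rate, a naive persistence baseline (predict this year's `Rate` as last year's rate) gives a reference point before fitting any model on the features from the Usage snippet. The sketch below is not part of the dataset card. `to_float`, `persistence_mae` and `persistence_mae_from_csv` are hypothetical helpers that use only the standard library and skip rows where either value is missing or non-numeric.

```python
import csv
import math


def to_float(value):
    """Return value as a float, or None if it is blank, non-numeric, or NaN."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def persistence_mae(rows):
    """Mean absolute error of predicting `Rate` with `Rate_Independent`.

    rows is an iterable of dict-like records (for example from csv.DictReader).
    Returns None when no row has both values, as in the 2021 files, which
    lack `Rate_Independent`.
    """
    errors = []
    for row in rows:
        actual = to_float(row.get("Rate"))
        previous = to_float(row.get("Rate_Independent"))
        if actual is not None and previous is not None:
            errors.append(abs(actual - previous))
    return sum(errors) / len(errors) if errors else None


def persistence_mae_from_csv(path):
    """Convenience wrapper that reads one of the CSV files from disk."""
    with open(path, newline="") as handle:
        return persistence_mae(csv.DictReader(handle))
```

A model trained on the census and accessibility features can then be judged by whether its error on the same rows improves on this baseline.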
Provider:
KC-OverdoseModels2025