
PricePortal: an open multi-state corpus of hospital price transparency machine-readable files for California and Indiana

DataCite Commons · Updated 2026-05-07 · Indexed 2026-05-07
Download link:
https://zenodo.org/doi/10.5281/zenodo.19941038
Description:
PricePortal is an open, reproducible corpus that harmonizes federal Hospital Price Transparency (HPT) machine-readable files (MRFs) for 528 Medicare-certified hospitals in California and Indiana with Medicare OPPS/MPFS benchmarks and ZIP-level community-risk variables. This deposit contains 16 analysis-grade parquet files plus a data dictionary and SHA-256 manifest (6.42 GB total) covering:

(1) gross and cash chargemaster prices (305M rows, 478 hospitals);
(2) payer-specific negotiated rates (417M rows, 411 hospitals);
(3) Medicare CPT/HCPCS benchmark (9,709 codes, OPPS preferred + MPFS fallback);
(4) hospital×code price-to-Medicare ratio panel (1,528,609 rows, four price types per row);
(5) state and per-payer summary tables;
(6) Wang 2023 within-hospital correlation replication, with a schema-stratified disambiguation test (kaiser_2023, misc_csv, §180 v1.x/v2.x/v3.x) and a discounter-threshold sensitivity analysis (5%, 15%, 30%);
(7) Chang & Psek 2024 ZIP-level socioeconomic gradient extension (257 hospital ZIPs; pooled cash-ratio β_log_inc = −0.96, p = 0.063);
(8) hospital identity crosswalk (CCN ↔ OSHPD ↔ NPI ↔ EIN, 528 hospitals; 95.6% NPI coverage);
(9) Indiana ZIP-level ACS 2024 5-year demographics and CDC PLACES health outcomes.

Headline findings: The median chargemaster-to-Medicare ratio is 3.92× in California vs 2.19× in Indiana; cash is 1.42× vs 0.90×; minimum-negotiated is 0.95× vs 0.55×, consistent with Indiana HEA 1004's Medicare-repricing benchmark. Within hospital, the gross–cash log-correlation is ≈1.0 across every MRF schema bucket (including pre-§180 Kaiser legacy chargemasters and custom non-§180 formats), weighing decisively against a v3-template encoding-artifact explanation.

See DATA_DICTIONARY.md for full schema, provenance, and caveats per file. File integrity is verifiable via MANIFEST.sha256.

Source code: iupui-soic/mrf-pricing-research. Public portal: pricingapp.streamlit.app.
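The description states that file integrity can be checked against MANIFEST.sha256 but does not show how. The following is a minimal sketch of such a check. It assumes the manifest uses the common `sha256sum` layout (one `<hex digest>  <filename>` pair per line) and that it sits in the same directory as the downloaded files; neither is confirmed by the listing. The function names `sha256_of`, `parse_manifest`, and `verify_deposit` are hypothetical helpers, not part of the deposit.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so multi-GB parquet files are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text):
    """Parse `<hex digest>  <filename>` lines (sha256sum layout, an assumption)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary-mode entries with a leading '*'
        entries[name.lstrip("*")] = digest.lower()
    return entries


def verify_deposit(manifest_path):
    """Return {filename: status}, where status is 'ok', 'mismatch', or 'missing'."""
    manifest_path = Path(manifest_path)
    base = manifest_path.parent  # assumed: files sit next to the manifest
    results = {}
    for name, expected in parse_manifest(manifest_path.read_text()).items():
        target = base / name
        if not target.is_file():
            results[name] = "missing"
        elif sha256_of(target) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify_deposit("MANIFEST.sha256")` from the download directory would return one status per listed file; any entry other than `"ok"` suggests an incomplete or corrupted download. If the manifest does follow the `sha256sum` layout, `sha256sum -c MANIFEST.sha256` on the command line would perform the same check.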

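The Medicare benchmark is described as "OPPS preferred + MPFS fallback", and the headline figures are medians of price-to-Medicare ratios. The sketch below illustrates that arithmetic in plain Python. It is not the authors' pipeline: the row layout, the price-type keys (`gross`, `cash`, `min_negotiated`), and the function names are assumptions made for illustration, and the real column names should be taken from DATA_DICTIONARY.md.

```python
from statistics import median


def medicare_benchmark(opps_rate, mpfs_rate):
    """Prefer the OPPS rate; fall back to MPFS when OPPS is missing or non-positive."""
    if opps_rate is not None and opps_rate > 0:
        return opps_rate
    if mpfs_rate is not None and mpfs_rate > 0:
        return mpfs_rate
    return None  # no usable benchmark for this code


def price_ratios(prices, opps_rate, mpfs_rate):
    """Divide each available price type by the chosen Medicare benchmark."""
    benchmark = medicare_benchmark(opps_rate, mpfs_rate)
    if benchmark is None:
        return {}
    return {
        kind: value / benchmark
        for kind, value in prices.items()
        if value is not None
    }


def median_ratio(rows, kind):
    """Median ratio for one price type across hospital-by-code rows (hypothetical layout)."""
    values = []
    for row in rows:
        ratio = price_ratios(row["prices"], row["opps"], row["mpfs"]).get(kind)
        if ratio is not None:
            values.append(ratio)
    return median(values) if values else None
```

Under these assumptions, a row with a gross charge of 400 and an OPPS rate of 100 yields a gross ratio of 4.0, and a row with no OPPS rate is benchmarked against its MPFS rate instead. Rows with neither rate drop out of the median rather than being imputed.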
Provider:
Zenodo
Created:
2026-05-07