Accelerating Interoperability With Databricks Lakehouse

Databricks, indexed 2024-05-09
Download link:
https://marketplace.databricks.com/details/aa7c7506-f11a-45a8-8b3d-7b1798c6ef8a/Databricks_Accelerating-Interoperability-With-Databricks-Lakehouse
Resource description:
https://www.databricks.com/solutions/accelerators/fhir

In this solution accelerator, we demonstrate how to leverage the lakehouse approach for an in-depth analysis of patient outcomes using EHR data. Consider a scenario in which we have a collection of FHIR bundles and want to explore the effect of different factors on Covid outcomes. However, the FHIR standard is primarily designed for the exchange of information and is not optimized for analytics. To solve this problem, we need to flatten the bundles (stored as nested JSON files) and extract resources such as patients, encounters, and conditions, so that we can create a dataset that is ready for exploratory data analysis.

We can decompose this process into 3 main steps:

* **Data ingestion**
  - Simplify ingestion from all kinds of sources. As an example, we'll use the Databricks Labs dbignite library to ingest FHIR bundles, in one line, as tables ready to be queried in SQL.
  - Query and explore the ingested data
  - Optionally, secure data access
* **Exploratory Analysis/Data Curation**
  - Create cohorts
  - Create a patient-level data structure (a patient dashboard) from the bundles
  - Investigate the rate of hospital admissions among Covid patients and explore correlations among different factors such as SDOH, disease history and hospital admission
* **Data Science / Advanced Analytics**
  - Create patient features
  - Create a training dataset to build a model predicting and analysing our cohort
  - Use SHAP to explain the effect of different features on the outcome under study

Click on the "Get instant access" button in the top right corner to clone the solution accelerator repo into your workspace. Once the repo is cloned into your workspace, please execute the **RUNME** notebook in the repo to create the cluster and job you can use to run the notebooks.
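To make the flattening step concrete, here is a minimal sketch in plain Python of what "extracting resources from a nested bundle" can look like. It is not the dbignite API and not the accelerator's own code. The sample bundle, the function names `flatten_bundle` and `patient_rows`, and the assumption that references look like `Patient/<id>` are all illustrative.

```python
from collections import defaultdict

# Hypothetical, hand-made bundle for illustration only (not real patient data).
SAMPLE_BUNDLE = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "p1",
                      "gender": "female", "birthDate": "1980-04-02"}},
        {"resource": {"resourceType": "Encounter", "id": "e1",
                      "subject": {"reference": "Patient/p1"}}},
        {"resource": {"resourceType": "Condition", "id": "c1",
                      "subject": {"reference": "Patient/p1"},
                      "code": {"text": "COVID-19"}}},
    ],
}


def flatten_bundle(bundle):
    """Group the resources of one FHIR bundle by their resourceType."""
    tables = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        rtype = resource.get("resourceType")
        if rtype:
            tables[rtype].append(resource)
    return dict(tables)


def subject_id(resource):
    """Return the patient id from a 'Patient/<id>' reference (assumed format)."""
    ref = resource.get("subject", {}).get("reference", "")
    return ref.split("/")[-1]


def patient_rows(tables):
    """Build one flat row per patient with encounter counts and conditions."""
    rows = []
    for patient in tables.get("Patient", []):
        pid = patient["id"]
        rows.append({
            "patient_id": pid,
            "gender": patient.get("gender"),
            "n_encounters": sum(
                1 for e in tables.get("Encounter", []) if subject_id(e) == pid
            ),
            "conditions": [
                c.get("code", {}).get("text")
                for c in tables.get("Condition", [])
                if subject_id(c) == pid
            ],
        })
    return rows


tables = flatten_bundle(SAMPLE_BUNDLE)
rows = patient_rows(tables)
```

Each key of `tables` corresponds to one per-resource table (patients, encounters, conditions), and `rows` is a patient-level structure of the kind the curation step describes. Bundles that use other reference styles, such as `urn:uuid:` references, would need a different `subject_id`.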
Provider:
Databricks