Life Sciences ETL using Matillion
Source: Databricks. Indexed 2024-10-12.
Download link:
https://marketplace.databricks.com/details/8dfb0d1b-ec41-4b3d-bd08-6771089931a5/Matillion_Life-Sciences-ETL-using-Matillion
Description:
**Overview**
This Life Sciences solution accelerator is powered by the Matillion Data Productivity Cloud. It demonstrates how to build a multi-tier lakehouse data architecture that extracts, loads, transforms and integrates large volumes of data, making them ready for statistical analysis.
**Use cases**
Use cases include:
- Building a medallion data architecture
- API connectivity and source file handling
- Integrating, summarizing and pivoting data for statistical analysis and for machine learning techniques such as dimensionality reduction and clustering
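
As a rough illustration of the use cases above, the sketch below walks a few records through bronze, silver and gold tiers in plain Python, ending with a pivoted sample-by-gene matrix of the kind that clustering or dimensionality reduction would consume. It is not the Matillion pipeline itself: the record layout, the field names (`sample_id`, `gene`, `expression`), the values, and the helpers `to_silver` and `to_gold` are all hypothetical and chosen only to show the shape of the transformation.

```python
from collections import defaultdict

# Bronze: raw records as they might arrive from an API or a source file.
# All values here are hypothetical and exist only for illustration.
bronze = [
    {"sample_id": "S1", "gene": "TP53", "expression": "2.5"},
    {"sample_id": "S1", "gene": "BRCA1", "expression": "1.0"},
    {"sample_id": "S2", "gene": "TP53", "expression": "3.5"},
    {"sample_id": "S2", "gene": "BRCA1", "expression": None},  # missing measurement
    {"sample_id": "S2", "gene": "TP53", "expression": "4.5"},  # repeat measurement
]


def to_silver(rows):
    """Silver: drop rows with missing values and cast expression to float."""
    return [
        {
            "sample_id": r["sample_id"],
            "gene": r["gene"],
            "expression": float(r["expression"]),
        }
        for r in rows
        if r["expression"] is not None
    ]


def to_gold(rows):
    """Gold: average repeats, then pivot to one row per sample, one column per gene."""
    values = defaultdict(list)
    for r in rows:
        values[(r["sample_id"], r["gene"])].append(r["expression"])

    genes = sorted({gene for _, gene in values})
    samples = sorted({sample for sample, _ in values})

    matrix = {}
    for s in samples:
        row = []
        for g in genes:
            if (s, g) in values:
                cell = values[(s, g)]
                row.append(sum(cell) / len(cell))
            else:
                row.append(None)  # no measurement for this sample/gene pair
        matrix[s] = row
    return genes, matrix


silver = to_silver(bronze)
genes, gold = to_gold(silver)

print(genes)
for sample_id, row in gold.items():
    print(sample_id, row)
```

Keeping each tier as a separate function mirrors the medallion idea that raw data stays untouched while cleaning and aggregation happen in later layers. A real pipeline would also need a strategy for the remaining `None` cells (imputation or filtering) before statistical analysis.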
**Product details**
The solution accelerator includes:
- Downloadable Matillion Data Productivity Cloud files: custom API connector and ETL pipelines
- Step-by-step instructions
**Additional Insights**
For more details, refer to the Matillion Exchange listings:
- Downloadable [Custom API Connector](https://exchange.matillion.com/custom-connector/profiles/tcga-genomic-data-commons-data-portal/)
- Downloadable [ETL pipelines](https://exchange.matillion.com/data-productivity-cloud/pipeline/tcga-genomic-data-commons/)
More information on:
- The [Matillion Data Productivity Cloud](https://www.matillion.com/data-productivity-cloud)
- The [Cancer Genome Atlas Program](https://www.cancer.gov/ccg/research/genome-sequencing/tcga)
Provider:
Matillion
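Summary
Background overview
This dataset is a life sciences ETL solution built on the Matillion Data Productivity Cloud. It covers building a multi-tier data architecture, API connectivity, and data integration and transformation, and it includes a downloadable API connector and ETL pipelines. The data originates from The Cancer Genome Atlas (TCGA) program.
The summary above was collected and generated by 遇见数据集.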



