Treehouse compendium of polyA selected RNA-Seq gene expression data from 932 cell lines
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE268098
Resource description:
We uniformly analyze sequence data to generate a resource for comparative gene expression studies. Specifically, we obtained access to primary RNA sequence data from repositories and clinical partners, consistently processed the data, harmonized metadata, and released the expression values and metadata without access restrictions. The data contain 43 consistently processed gene expression datasets from 1 study.

Gene expression in each sample is uniformly quantified using the dockerized TOIL RNA-Seq pipeline, versions 3.2 to 3.4.1 (Vivian et al., 2017); all of these versions produce bitwise identical RSEM gene expression outputs. The pipeline uses RSEM version 1.2.25 (Li and Dewey, 2011) for quantification after aligning reads with STAR v2.4.2a (Dobin et al., 2013), using indices generated from the human reference genome GRCh38 and the human gene models GENCODE 23, as described at https://github.com/UCSC-Treehouse/pipelines. Quality is assessed with the MEND pipeline, https://github.com/UCSC-Treehouse/mend_qc (Beale et al., 2021).

Data processing steps were as follows:
Adapters are removed with CutAdapt v1.9 (Martin, 2011).
Reads are aligned by STAR v2.4.2a using indices generated from the human reference genome GRCh38 and the human gene models GENCODE 23 (Dobin et al., 2013).
RSEM 1.2.25 is used to quantify gene expression (Li and Dewey, 2011).
Gene level expression in TPM is log transformed: log2(TPM+1).

Genome build: GRCh38
Processed data files format and content: Gene level expression in TPM is log transformed: log2(TPM+1)
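The log2(TPM+1) transform described above can be illustrated with a minimal Python sketch. The function names and the example TPM values are hypothetical and chosen only for illustration; they are not taken from the dataset or from the Treehouse pipeline code. The sketch shows the forward transform and its inverse, which is useful when raw TPM values need to be recovered from the released files.

```python
import math


def tpm_to_log2(tpm: float) -> float:
    """Forward transform: log2(TPM + 1). The +1 keeps TPM = 0 finite (maps to 0)."""
    if tpm < 0:
        raise ValueError("TPM must be non-negative")
    return math.log2(tpm + 1.0)


def log2_to_tpm(value: float) -> float:
    """Inverse transform: recover TPM from a log2(TPM + 1) value."""
    return 2.0 ** value - 1.0


# Hypothetical TPM values for illustration only (not from the dataset).
example_tpm = [0.0, 1.0, 3.0, 1023.0]
transformed = [tpm_to_log2(x) for x in example_tpm]
recovered = [log2_to_tpm(v) for v in transformed]

print(transformed)  # [0.0, 1.0, 2.0, 10.0]
print(recovered)    # [0.0, 1.0, 3.0, 1023.0]
```

Because the released values are already on the log2(TPM+1) scale, applying the forward transform a second time would distort them; only the inverse is needed to return to TPM.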
Created:
2024-05-28



