
Gene expression count data from human post-mortem spinal cord

Indexed by NIAID Data Ecosystem on 2026-03-13
Download link:
https://zenodo.org/record/6385746
Description:
Gene expression data from human post-mortem tissue for three spinal cord sections (cervical, thoracic and lumbar) from amyotrophic lateral sclerosis (ALS) patients and non-neurological disease controls. RNA sequencing was performed as part of the New York Genome Center ALS Consortium.

Analysis workbooks: https://jackhump.github.io/ALS_SpinalCord_QTLs/
Preprint describing results: https://www.medrxiv.org/content/10.1101/2021.08.31.21262682v1

Sample sizes:

Region     Control   ALS
Cervical   35        139
Thoracic   10        42
Lumbar     32        122

Library preparation

RNA was extracted from flash-frozen post-mortem tissue using TRIzol (Thermo Fisher Scientific) and chloroform, followed by column purification (RNeasy Minikit, QIAGEN). RNA integrity number (RIN) was assessed on a Bioanalyzer (Agilent Technologies). RNA-Seq libraries were prepared from 500 ng total RNA using the KAPA Stranded RNA-Seq Kit with RiboErase (KAPA Biosystems) for rRNA depletion and Illumina-compatible indexes (NEXTflex RNA-Seq Barcodes, NOVA-512915, PerkinElmer, and IDT for Illumina TruSeq UD Indexes, 20022370). Pooled libraries (average insert size: 375 bp) passing the quality criteria were sequenced on either an Illumina HiSeq 2500 (125 bp paired-end) or an Illumina NovaSeq (100 bp paired-end). The samples had a median sequencing depth of 42 million read pairs, with a range of 16 to 167 million read pairs.

Data processing

Samples were uniformly processed using RAPiD-nf, an RNA-Seq processing pipeline implemented in the Nextflow framework. Following adapter trimming with Trimmomatic (version 0.36), all samples were aligned to the hg38 build (GRCh38.primary_assembly) of the human reference genome using STAR (2.7.2a), with indexes created from GENCODE version 30. Gene expression was quantified using RSEM (1.3.1) with the GENCODE v30 annotation. Quality control was performed using SAMtools and Picard, and the results were collated using MultiQC.
Various technical metrics for sequencing quality control are provided in the metadata. Estimated read count and normalised transcripts per million (TPM) matrices are provided for each tissue.

Provided data:

gencode.v30.gene_meta.tsv.gz - tab-separated table with columns "genename", the HGNC gene symbol, and "geneid", the Ensembl ID, as set in the GENCODE v30 comprehensive annotation.

For {tissue} in Cervical_Spinal_Cord, Thoracic_Spinal_Cord, Lumbar_Spinal_Cord:

{tissue}_metadata.tsv.gz - metadata describing each sample. Each row describes a sample. The columns are described below.
{tissue}_gene_tpm.tsv.gz - the normalised TPM values from RSEM for all 58,884 genes in GENCODE v30. Each row describes a gene and each column describes a sample.
{tissue}_gene_counts.tsv.gz - the estimated read counts from RSEM for all 58,884 genes in GENCODE v30. Each row describes a gene and each column describes a sample.

Metadata column descriptions:

rna_id - de-identified sample ID for each unique RNA-seq sample
dna_id - de-identified donor ID for each patient enrolled in the study
site_id - de-identified site name for each contributing site
tissue - name of tissue/region
age_rounded - age at death, rounded to the nearest decade
sex - biological sex of donor
subject_group - long-form disease group
disease - short-form disease group
site_of_motor_onset - for ALS donors, the site where symptoms started
disease_duration - for ALS donors, how long the donor lived with the disease
mutations - any known ALS gene mutations
library_prep - type of library preparation method used
seq_platform - sequencing platform used
rin - RNA integrity number, 0-10
c9orf72_repeat_size - estimated C9orf72 repeat expansion size
gPC1 to gPC5 - principal components of genetic ancestry from whole genome sequencing

The remaining metadata columns are from Picard; see http://broadinstitute.github.io/picard/picard-metric-definitions.html#RnaSeqMetrics
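
The file layout described above could be read with pandas along the following lines. This is a minimal sketch, not part of the original record. It assumes that the first column of each expression matrix holds the gene identifier and that the remaining column names match the rna_id values in the metadata; neither is confirmed by the description. The helper names load_tissue and group_sizes are hypothetical.

```python
from pathlib import Path

import pandas as pd

# Tissue prefixes as listed in the dataset description.
TISSUES = ("Cervical_Spinal_Cord", "Thoracic_Spinal_Cord", "Lumbar_Spinal_Cord")


def load_tissue(data_dir, tissue, kind="counts"):
    """Load one tissue's matrix ("counts" or "tpm") and its metadata.

    Returns (expression, metadata) restricted to the samples present in
    both tables, in the column order of the expression matrix.
    """
    data_dir = Path(data_dir)
    # Assumption: the first column is the gene identifier.
    expr = pd.read_csv(
        data_dir / f"{tissue}_gene_{kind}.tsv.gz", sep="\t", index_col=0
    )
    meta = pd.read_csv(data_dir / f"{tissue}_metadata.tsv.gz", sep="\t")
    # Assumption: matrix column names correspond to rna_id.
    meta = meta.set_index("rna_id")
    shared = [sample for sample in expr.columns if sample in meta.index]
    return expr[shared], meta.loc[shared]


def group_sizes(meta):
    """Count samples per short-form disease group."""
    return meta["disease"].value_counts().to_dict()
```

With the files downloaded into a local directory, load_tissue(directory, "Cervical_Spinal_Cord", kind="tpm") would return the TPM matrix and the matching metadata rows, and group_sizes on that metadata can be compared against the sample sizes listed in the description.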
Created:
2022-03-26