Supplementary Table S1. Filtering steps of RNA-seq data.

NIAID Data Ecosystem, indexed 2026-05-10
Download link:
https://figshare.com/articles/dataset/Supplementary_Table_S1_Filtering_steps_of_RNA-seq_data_/29376215
Description:
Of the initial 78 samples, 8 were excluded from the analysis based on quality reports from the sequencing lab, so transcriptomic analysis was performed on the remaining 70 samples. Gene expression was quantified with Salmon (v1.10.3). The following reference files were downloaded from GENCODE (https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M36/) on January 24, 2025: gencode.vM36.pc_transcripts.fa.gz, gencode.vM36.lncRNA_transcripts.fa.gz, gencode.vM36.primary_assembly.annotation.gtf.gz, GRCm39.primary_assembly.genome.fa.gz. To build the index, the protein-coding and lncRNA transcript FASTA files were combined into a single reference file, and decoy sequences were generated from the corresponding genome FASTA. Indexing was performed using salmon index with default settings and the additional options -d decoys.txt --gencode. Transcript quantification was then carried out using salmon quant with the following parameters: --libType A --unmatedReads --useVBOpt --seqBias --gcBias --writeQualities --geneMap "GTF" --writeUnmappedNames. Salmon reported expression levels in transcripts per million (TPM). Following quantification, an additional six samples were excluded due to low mapping rates (< 35%) identified during quality control. To reduce noise in downstream analyses, only genes with TPM > 1 in at least three samples were retained. Transcript-level quantifications were then aggregated to the gene level using the R package ‘tximport’ [26]. ENSEMBL gene IDs were mapped to gene names using the org.Mm.eg.db annotation package. To facilitate downstream processing, genes were sorted by their geometric mean TPM across all samples, from highest to lowest. The resulting gene-level count matrix, comprising 19341 genes across 64 blastocyst samples (26 intact and 38 twin embryos), was used for subsequent analysis.
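
The expression filter and ordering described above can be illustrated with a small sketch. This is not the authors' code: the function names, the toy gene names, and the toy TPM values are all hypothetical. The description also does not say how zero TPM values were handled when computing geometric means, so the sketch uses the exact definition, under which any zero gives a geometric mean of zero.

```python
import math

def keep_gene(tpms, min_tpm=1.0, min_samples=3):
    """Return True if TPM is strictly above min_tpm in at least min_samples samples."""
    return sum(t > min_tpm for t in tpms) >= min_samples

def geometric_mean(values):
    """Exact geometric mean; zero if any value is zero (assumed zero handling)."""
    if any(v == 0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))

def filter_and_sort(tpm_by_gene):
    """Drop low-expression genes, then order gene names by geometric mean TPM, highest first."""
    kept = {gene: tpms for gene, tpms in tpm_by_gene.items() if keep_gene(tpms)}
    return sorted(kept, key=lambda gene: geometric_mean(kept[gene]), reverse=True)

# Sample bookkeeping stated in the description: 78 - 8 - 6 = 64 = 26 + 38.
remaining_samples = 78 - 8 - 6
assert remaining_samples == 26 + 38

# Toy gene-by-sample TPM table (hypothetical names and values).
toy = {
    "GeneA": [5.0, 8.0, 2.0, 0.5],      # 3 samples above 1: kept
    "GeneB": [0.2, 1.5, 0.0, 3.0],      # only 2 samples above 1: dropped
    "GeneC": [50.0, 40.0, 60.0, 45.0],  # all samples above 1: kept
    "GeneD": [1.0, 1.0, 1.0, 1.0],      # TPM equal to 1 is not > 1: dropped
}
ranked = filter_and_sort(toy)
print(ranked)
```

In the toy table, GeneC ranks ahead of GeneA because its geometric mean TPM is higher. The threshold is strict, so a gene sitting at exactly TPM = 1 in every sample is removed.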
Created:
2026-01-10