
iPSCORE Phenotype Metadata: Element Coordinate Bed Files

Source: DataCite Commons. Updated 2024-11-13; indexed 2025-04-15.
Download link:
https://plus.figshare.com/articles/dataset/iPSCORE_Phenotype_Metadata_Element_Coordinate_Bed_Files/27327879
Description:
This directory contains six files giving hg38 genomic coordinates for the genes, ATAC-seq peaks, and H3K27ac ChIP-seq peaks from three tissues in the iPSCORE Collection: induced pluripotent stem cells (iPSCs), iPSC-derived cardiovascular progenitor cells (CVPCs), and iPSC-derived pancreatic progenitor cells (PPCs). Each file has a BED-like format.

The five ATAC-seq and H3K27ac ChIP-seq peak files follow the `tissue_phenotype_peaks.bed` naming convention and have the same columns, including `Chromosome`, `Start`, and `End`, which give the genomic coordinates; `Element_ID`, the identifier for the ATAC-seq or ChIP-seq peak; and `Expressed`, which is TRUE/FALSE depending on whether the peak is considered accessible/acetylated after filtering. To obtain the elements tested for QTLs, filter rows by `Expressed == "TRUE"`.

Since gene coordinates are fixed, the `gene_info.txt.gz` file records, for all three tissues, which genes were considered expressed. The first three columns are the chromosome, start, and end in Gencode hg38 coordinates; the next three report the strand, gene ID, and gene name. The last three columns, `iPSC_Expressed`, `CVPC_Expressed`, and `PPC_Expressed`, indicate (TRUE/FALSE) whether the gene is expressed in the corresponding tissue and tested for QTLs.

Provider:
Figshare+
Created:
2024-11-13