
CAGE peaks (hg38 v3)

Source: Mendeley Data. Updated 2024-06-29; indexed 2024-06-27.
Download link:
https://figshare.com/articles/dataset/Re-processing_of_the_data_generated_by_the_FANTOM5_project_hg38_v3_CAGE_peaks/4880063/4
Resource description:

CAGE peaks
===

This folder contains the CAGE peaks defined in the FANTOM5 project as a robust set, with genomic coordinates originally defined on hg19 lifted over to hg38 using the liftOver tool. The annotations were updated along with the coordinates; see the “CAGE_peaks_annotation” directory in the parent directory.

- Inquiries: fantom-help@gsc.riken.jp
- Update:
  - 2015-09-18 initial release
  - 2016-07-22 ver.2 release
  - 2017-04-14 ver.3 release

Data files
---

- hg38_liftover_CAGE_peaks_phase1and2.bed.gz : all CAGE peaks lifted over with UCSC liftOver
- hg38_fair_CAGE_peaks_phase1and2.bed : fairly remapped CAGE peaks
- hg38_problematic_CAGE_peaks_phase1and2.bed : problematic CAGE peaks that were filtered out by QC
- hg38_new_CAGE_peaks_phase1and2.bed : CAGE peaks newly identified by DPI clustering on hg38
- hg19_droppped_CAGE_peaks_phase1and2.bed : unmapped CAGE peaks (in hg19 coordinates)

Description of the columns in CAGE peaks coordinates files
---

The files are based on the BED9 format, where thickStart and thickEnd give the representative TSS position (https://genome.ucsc.edu/FAQ/FAQformat.html#format1).

- chromosome
- start of CAGE peak region
- end of CAGE peak region
- name (ID) of the CAGE peak
- score
- strand of the CAGE peak
- start of the representative TSS position
- end of the representative TSS position (Note: end is always start+1)
- rgb string for color coding (plus or minus strand only)
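The column layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the original distribution: the names `CagePeak`, `open_bed`, and `parse_peaks` are hypothetical, the files are assumed to be tab-separated as in standard BED, and the sample record is invented purely for illustration.

```python
import gzip
from typing import Iterable, Iterator, NamedTuple


class CagePeak(NamedTuple):
    """One record, following the nine columns listed in the description."""
    chrom: str
    start: int       # start of the CAGE peak region
    end: int         # end of the CAGE peak region
    name: str        # name (ID) of the CAGE peak
    score: str       # kept as text; the description does not state its range
    strand: str
    tss_start: int   # thickStart: representative TSS position
    tss_end: int     # thickEnd: stated to be tss_start + 1
    rgb: str         # color string distinguishing plus and minus strand


def open_bed(path: str):
    """Open a plain or gzip-compressed BED file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def parse_peaks(lines: Iterable[str]) -> Iterator[CagePeak]:
    """Yield CagePeak records from tab-separated BED9 lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and possible header lines (an assumption; the
        # description does not say whether the files contain any).
        if not line or line.startswith(("#", "track", "browser")):
            continue
        f = line.split("\t")
        if len(f) < 9:
            raise ValueError(f"expected 9 columns, got {len(f)}: {line!r}")
        yield CagePeak(f[0], int(f[1]), int(f[2]), f[3], f[4],
                       f[5], int(f[6]), int(f[7]), f[8])


# Invented sample record, for illustration only (not taken from the dataset).
example = ["chr1\t100\t120\tpeak_example\t0\t+\t110\t111\t255,0,0\n"]
peaks = list(parse_peaks(example))
```

To read one of the listed files, pass the open handle instead of the sample list, for example `parse_peaks(open_bed("hg38_liftover_CAGE_peaks_phase1and2.bed.gz"))`. Checking `peak.tss_end == peak.tss_start + 1` on each record is a quick way to confirm that a file follows the stated convention.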

Created:
2023-06-28