
Best practices for eCLIP experiments and analysis [poor quality]

NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE107767
Description:
Enhanced cross-linking immunoprecipitation (eCLIP) featuring a size-matched input control has recently been applied to profile the binding sites of more than one hundred RNA binding proteins (RBPs). However, the computational pipelines and quality control metrics needed to process CLIP data at scale have yet to be well defined. Here, we describe our ENCODE eCLIP processing pipeline (https://github.com/YeoLab/eclip), which enables users to go from raw reads to processed peaks that are enriched above paired input, are reproducible across biological replicates, and can be directly compared against the public ENCODE eCLIP resource. In particular, we discuss processing steps designed to address common artifacts, including properly quantifying unique RNA fragments bound by both unique genomic- and repetitive element-mapped reads. Using manual quality annotation of 350 ENCODE eCLIP experiments, we develop metrics for quality assessment of eCLIP experiments prior to and after sequencing, including library yield, number of unique fragments in the library, total binding relative information, and biological reproducibility. We also quantify the commonly believed linkage between depth of sequencing and peak discovery, and derive methods for estimating required sequencing depth based on pre-sequencing metrics. Finally, we provide recommendations for the common question of integrating RBP binding information with RNA-seq to generate splicing maps representing the positional effect of binding on alternative splicing. These pipelines and QC metrics enable large-scale processing and analysis of eCLIP data, and will help to standardize rigorous analysis of RBP binding data.

eCLIP-seq in HepG2 and K562 cells. Datasets in this subseries failed technical and manual quality assessment and are submitted to GEO only as negative controls. These datasets should never be used to draw biological conclusions.
Created:
2020-08-24