
Evaluation of methods for modeling transcription factor sequence specificity (synthetic construct)

NIAID Data Ecosystem, indexed 2026-03-07
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA183675
Description:
Genomic analyses often involve scanning for potential transcription-factor (TF) binding sites using models of the sequence specificity of DNA-binding proteins. Many approaches have been developed to model and learn a protein’s binding specificity by representing sequence motifs, including the gaps and dependencies between binding-site residues, but these methods have not been systematically compared. Here we applied 26 such approaches to in vitro protein binding microarray data for 66 mouse TFs belonging to various families. For 9 TFs, we also scored the resulting motif models on in vivo data, and found that the best in vitro–derived motifs performed similarly to motifs derived from in vivo data. Our results indicate that simple models based on mononucleotide position weight matrices learned by the best methods perform similarly to more complex models for most TFs examined, but fall short in specific cases. In addition, the best-performing motifs typically have relatively low information content, consistent with widespread degeneracy in eukaryotic TF sequence preferences.

Overall design: Protein binding microarray (PBM) experiments were performed for a set of 86 mouse transcription factors. Briefly, the PBMs involved binding GST-tagged DNA-binding proteins to two double-stranded 44K Agilent microarrays, each containing a different de Bruijn sequence design, in order to determine their sequence preferences. Details of the PBM protocol are described in Berger et al., Nature Biotechnology 2006.
Created:
2012-12-12
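
The description refers to scanning sequences with mononucleotide position weight matrices (PWMs) and to the information content of a motif. The following is a minimal sketch of both ideas, not one of the 26 evaluated methods. The toy count matrix, the uniform background of 0.25 per base, the pseudocount of 1, and all function names are assumptions made for illustration; none of them come from the dataset.

```python
import math

BASES = "ACGT"


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores (uniform background assumed)."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def information_content(counts, pseudocount=1.0, background=0.25):
    """Sum over positions of the relative entropy (in bits) against the background."""
    bits = 0.0
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        for b in BASES:
            p = (column[b] + pseudocount) / total
            bits += p * math.log2(p / background)
    return bits


def scan(sequence, pwm):
    """Score every forward-strand window; expects only A, C, G, T in the sequence."""
    width = len(pwm)
    hits = []
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(pwm[j][window[j]] for j in range(width))
        hits.append((offset, score))
    return hits


# Hypothetical counts for a 4-bp motif that prefers ACGT (illustrative only).
toy_counts = [
    {"A": 18, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 19, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 18, "T": 1},
    {"A": 0, "C": 1, "G": 0, "T": 19},
]

pwm = pwm_from_counts(toy_counts)
hits = scan("TTACGTGG", pwm)
best_offset, best_score = max(hits, key=lambda hit: hit[1])
motif_bits = information_content(toy_counts)
```

A mononucleotide PWM scores each position independently, which is why it cannot capture the gaps and inter-residue dependencies that the more complex models in the comparison represent. A motif with low information content is one whose columns sit close to the background distribution, so `motif_bits` would be small for a degenerate motif.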