
A computational framework for identifying promoter sequences in non-model organisms using RNA-seq datasets

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://www.ncbi.nlm.nih.gov/sra/SRP293740
Description:
We developed a computational framework to discover short DNA sequences that confer strong expression in non-model organisms. The framework relies solely on whole-genome and RNA sequencing data, which are easily accessible to a variety of research groups. It proceeds in three main stages: 1) identification of a group of highly expressed loci that maintain high transcript counts across a broad range of experimental conditions, 2) extraction of the upstream candidate promoter regions of these loci, taking nearby annotations into account and avoiding loci that may reside in operons, and 3) application of the motif-finding algorithm in BioProspector to these upstream regions to predict the location and sequence of the -35 and -10 hexamers that drive the strong expression of these loci. Ultimately, we report sequences of 27-30 bases in length as candidate -35 and -10 signals for each of the top loci and create a consensus motif from these predictions. We apply our framework to 80 RNA-seq datasets collected for the methanotroph Methylotuvimicrobium buryatense 5GB1 and validate our predictions computationally and experimentally. The data deposited here comprise all RNA-seq data that had not been published before this study.

Overall design: 80 RNA-seq datasets measuring Methylotuvimicrobium buryatense expression across 12 experimental growth conditions were compiled from previously published data sets, including their associated quality control runs, and from several previously unpublished experiments. RNA-seq data were generated with a mix of Illumina protocols using a variety of read lengths (36, 50, and 150 bp) and both paired and unpaired libraries. Of these 80 datasets, 24 have been previously published and deposited. The remaining 56 samples are deposited here.
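The first two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the selection rule (a locus must rank in the top fraction of loci in every condition), the `Gene` record, and the parameters `top_fraction`, `window`, and `operon_gap` are all assumptions chosen for illustration, and the description does not state the thresholds actually used. The motif-finding stage with BioProspector is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record; coordinates are 0-based, end-exclusive."""
    locus: str
    start: int
    end: int
    strand: str  # "+" or "-"


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def consistently_high(expr, top_fraction=0.03, min_conditions_fraction=1.0):
    """Stage 1 (assumed rule): keep loci ranked in the top `top_fraction`
    of all loci in at least `min_conditions_fraction` of the conditions.

    expr maps locus -> list of expression values, one per condition.
    """
    loci = list(expr)
    n_cond = len(next(iter(expr.values())))
    n_top = max(1, int(len(loci) * top_fraction))
    hits = {locus: 0 for locus in loci}
    for c in range(n_cond):
        ranked = sorted(loci, key=lambda l: expr[l][c], reverse=True)
        for locus in ranked[:n_top]:
            hits[locus] += 1
    return [l for l in loci if hits[l] >= min_conditions_fraction * n_cond]


def upstream_region(genome, gene, genes, window=300, operon_gap=50):
    """Stage 2 (assumed rule): return the sequence upstream of `gene`,
    read 5'->3' on the gene's own strand, or None when the gene may sit
    inside an operon or has no intergenic sequence upstream.

    The window is truncated at the nearest upstream annotation. A gene is
    treated as a possible operon member when its nearest upstream neighbor
    is on the same strand and closer than `operon_gap` bases.
    """
    others = [g for g in genes if g.locus != gene.locus]
    if gene.strand == "+":
        lo, hi = max(0, gene.start - window), gene.start
        neighbors = [g for g in others if g.start < gene.start]
        if neighbors:
            nearest = max(neighbors, key=lambda g: g.end)
            if nearest.strand == gene.strand and gene.start - nearest.end < operon_gap:
                return None
            lo = max(lo, nearest.end)
    else:
        lo, hi = gene.end, min(len(genome), gene.end + window)
        neighbors = [g for g in others if g.end > gene.end]
        if neighbors:
            nearest = min(neighbors, key=lambda g: g.start)
            if nearest.strand == gene.strand and nearest.start - gene.end < operon_gap:
                return None
            hi = min(hi, nearest.start)
    if lo >= hi:
        return None
    seq = genome[lo:hi]
    return seq if gene.strand == "+" else reverse_complement(seq)
```

In use, the loci returned by `consistently_high` would be looked up in the annotation, passed one by one to `upstream_region`, and the non-`None` sequences written to a FASTA file as input for the motif finder.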
Created:
2021-08-26