Binding specificities of human RNA binding proteins towards structured and linear RNA sequences

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://www.ncbi.nlm.nih.gov/sra/ERP107876
Resource description:
Data generated by High-throughput RNA SELEX (HTR-SELEX), a new method developed for analyzing the binding specificities of RNA-binding proteins. The sequence data consist of reads generated with Illumina HiSeq2000 instruments. Samples are single-read sequencing runs of either synthetic DNA fragments with a fixed-length randomized region, or of samples derived from such an initial library by conversion to RNA, selection with a sequence-specific RNA binding protein, and reverse transcription.

Originally, multiple samples with different "barcode" tag sequences were run on the same Illumina sequencing lane, but the released files have already been de-multiplexed, and the constant regions and "barcodes" of each sequence have been cut out of the sequencing reads to make the data easier to use. Note that the sequence corresponding to the RNA is the reverse strand of the sequenced one. Barcodes and oligonucleotide designs are indicated in the names of individual entries.

Depending on the selection ligand design, the sequences in each of these fastq files are either 26 or 40 bases long and had different flanking regions on both sides of the sequence. In the 26N library designs there is also a fixed-length TTAC sequence (GUAA in the RNA) in the middle of the read.

Each run entry is named in one of the following two ways:

Example 1) "RBFOX1_TGTCTT40NTTC_AAG_2.fastq.gz", where the name is composed of the fields ProteinName_BarcodeDesign_Batch_SelectionCycle. This experiment used barcode ligand TGTCTT40NTTC, where both of the variable flanking constant regions are indicated as they were on the original sequence reads. This ligand has been selected for two rounds of HTR-SELEX using recombinant protein that contained the DNA binding domain of human RNA binding protein RBFOX1. The name also indicates that the experiment was performed in the batch of experiments named "AAG".

Example 2) "ZeroCycle_TGCGAAC26N_0_0.fastq.gz", where the name is composed of (zero)_BarcodeDesign_(zero). These sequences were generated by sequencing the initial non-selected pool. The same initial pools have been used in multiple experiments that were run in different batches. Note that the background sequences for all of the 40N sequencing pools have been published before and can be found in ENA under the following entries: PRJEB20112; PRJEB3291 and ...
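The naming scheme and the strand note above can be illustrated with a short Python sketch. It is not part of the dataset or of any official tooling. The function names `parse_run_name` and `read_to_rna` and the returned dictionary keys are hypothetical. The field order is taken from the two example file names (protein, barcode design, batch, selection cycle), and the sketch assumes that the number before "N" in the design gives the length of the randomized region.

```python
import re

# Complement table that also converts to the RNA alphabet (A pairs with U).
_DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGTN", "UGCAN")


def read_to_rna(read: str) -> str:
    """Return the RNA sequence implied by a sequenced read.

    The description states that the RNA corresponds to the reverse strand
    of the sequenced one, so the read is reverse-complemented and written
    with U instead of T.
    """
    return read.upper().translate(_DNA_TO_RNA_COMPLEMENT)[::-1]


def parse_run_name(filename: str) -> dict:
    """Split a run file name into its fields (hypothetical helper).

    Assumes four underscore-separated fields, as in the two examples
    given in the description.
    """
    stem = filename
    for suffix in (".gz", ".fastq"):
        if stem.endswith(suffix):
            stem = stem[: -len(suffix)]

    # rsplit keeps any underscores inside the first field intact.
    fields = stem.rsplit("_", 3)
    if len(fields) != 4:
        raise ValueError(f"unexpected run name: {filename!r}")
    first, design, third, cycle = fields

    # Assumption: the digits before "N" give the randomized-region length.
    match = re.search(r"(\d+)N", design)
    if match is None:
        raise ValueError(f"no randomized region in design: {design!r}")
    random_length = int(match.group(1))

    if first == "ZeroCycle":
        # Initial non-selected pool: no protein, no batch, cycle zero.
        return {"protein": None, "design": design, "batch": None,
                "cycle": 0, "random_length": random_length}
    return {"protein": first, "design": design, "batch": third,
            "cycle": int(cycle), "random_length": random_length}


if __name__ == "__main__":
    print(parse_run_name("RBFOX1_TGTCTT40NTTC_AAG_2.fastq.gz"))
    print(parse_run_name("ZeroCycle_TGCGAAC26N_0_0.fastq.gz"))
    print(read_to_rna("TTAC"))
```

Applied to the fixed TTAC motif of the 26N designs, `read_to_rna` yields GUAA, which agrees with the description.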
Created:
2021-02-04