ExplorATE test data: a pipeline to explore active transposable elements from RNAseq data without a reference genome

NIAID Data Ecosystem, indexed 2026-03-12

Download link:

https://www.ncbi.nlm.nih.gov/sra/SRP316241

Resource description:

Transposable elements (TEs) are ubiquitous in genomes. Many of these TEs remain active and make up an important fraction of the transcriptome, with potential effects on the host genome. The functional impact of TEs is well known in model organisms; in transcriptome analyses of non-model organisms, however, this information is ignored because TEs are difficult to identify and quantify. Here we develop ExplorATE, a pipeline for identifying and quantifying active TEs in non-model organisms that can be easily implemented within the R environment. Based on simulated data, we show that our pipeline accurately identifies and quantifies TEs, outperforming the tools commonly used in model organisms. We demonstrate ExplorATE on real RNA-seq data from different tissues (liver, ovary, and brain) of Liolaemus parthenos, the only parthenogenetic lizard known to date in the entire clade Iguanidae (Pleurodonta). Our results show that a significant fraction of the transcriptome contains repeats; however, many of these are co-expressed with genes. Applying our pipeline to real data allowed us to identify the most abundant transposon families in each tissue. The ERV2, CR1, and SINE3 families were particularly abundant in the liver. A test data set is provided in the ExplorATE package.

Overall design: The goal is to test the pipeline and evaluate the expression profiles of active transposable elements with real data. To this end, three samples from different tissues (brain, ovary, and liver) of the South American parthenogenetic lizard Liolaemus parthenos are analyzed and compared.

Created:

2021-07-31
