
Data and code associated with the manuscript "Short tandem repeats delineate gene bodies across eukaryotes" (Reinar et al.)

Source: DataCite Commons. Updated 2024-12-22; indexed 2025-01-06.
Download link:
https://figshare.com/articles/dataset/Data_and_code_associated_with_the_manuscript_Short_tandem_repeats_delineate_gene_bodies_across_eukaryotes_Reinar_et_al_/25427812
Description:
The "Rev_Pipeline_and_Figures" Jupyter Notebook contains the code required to reproduce the results and figures in the manuscript, including the repetitiveness function run on simulated data to report the repetitiveness of A and AT motifs. The repository also includes:
- The TimeTree phylogenetic tree for 891 species (.nwk, "SourceData.Figure1d_2")
- Relative positions of ULTRA monomer and dimer STRs, and lengths of intergenic sequences (.hdf5, "SourceData.Figure2a_2")
- Complete repetitiveness analysis with metadata (.tsv, "SourceData.Figure4-5")
- Transcription factor binding site analyses (.tsv, "SourceData.Figure7e_2")
- List of housekeeping functions (.txt, "SourceData.Figure7c_2")
- GO analysis (.tsv, "SourceData.Figure7c_3")
The remaining Source Data can be found with the paper.
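The file listing in the description can be turned into a small lookup for readers who want to open the text-based files outside the notebook. The following is a minimal sketch, not code from the repository: the identifiers and extensions come from the description, but the on-disk file names (identifier plus extension), the column layout of the .tsv files, and the helper names `expected_path`, `read_tsv`, and `read_lines` are assumptions made for illustration. The .nwk and .hdf5 files need third-party readers and are only listed here, not parsed.

```python
import csv
from pathlib import Path

# Identifiers and extensions as given in the dataset description.
# Assumption: each file is stored as "<identifier><extension>".
SOURCE_DATA = {
    "SourceData.Figure1d_2": ".nwk",   # TimeTree phylogenetic tree, 891 species
    "SourceData.Figure2a_2": ".hdf5",  # STR positions and intergenic lengths
    "SourceData.Figure4-5": ".tsv",    # repetitiveness analysis with metadata
    "SourceData.Figure7e_2": ".tsv",   # transcription factor binding sites
    "SourceData.Figure7c_2": ".txt",   # housekeeping functions
    "SourceData.Figure7c_3": ".tsv",   # GO analysis
}


def expected_path(data_dir, identifier):
    """Build the assumed file path for a listed identifier (hypothetical helper)."""
    return Path(data_dir) / (identifier + SOURCE_DATA[identifier])


def read_tsv(path):
    """Read a tab-separated file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def read_lines(path):
    """Read a plain-text list, one non-empty entry per line."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

If the downloaded files are named differently, only the `SOURCE_DATA` mapping needs adjusting; the two readers make no assumption about column names or contents.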
Provider:
figshare
Created:
2024-11-19