Repetitive elements of Erebia and Carex and their genomic proportions

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/record/8199370

Resource description:

Datasets of repetitive elements in the genomes of Erebia and Carex, detected and annotated with RepeatExplorer2 (Novák et al., 2010, 2020) from low-coverage (0.1X) short-read sequencing data.

Data from the article: Cornet, C., Mora, P., Augustijnen, H., Nguyen, P., Escudero, M., & Lucek, K. (2023). Holocentric repeat landscapes: From micro-evolutionary patterns to macro-evolutionary associations with karyotype evolution. Molecular Ecology, 00, 1–19. https://doi.org/10.1111/mec.17100

47 Erebia and 14 Carex species were analysed in genus-level analyses ("Erebia" and "Carex" folders). In addition, individuals from different populations of 4 Erebia species ("Erebia cassioides", "Erebia tyndarus", "Erebia nivalis" and "Erebia pronoe" folders) were analysed in species-level analyses.

The subfolders "Individuals" and "Comparative" correspond to the two modes in which RepeatExplorer2 was run: the individual mode identifies repeats in each sample separately, whereas the comparative mode identifies repeats in all samples simultaneously, allowing comparisons between individuals and species.

Files named "CLUSTER_TABLE..." are raw RepeatExplorer2 output and give the overall number of reads in each repetitive-element cluster, together with the cluster's annotation.
Files named "COMPARATIVE_ANALYSIS_COUNTS..." are raw RepeatExplorer2 output from the comparative mode and give the number of reads in each cluster for each sample included in the analysis.
Files named "Genome_proportion..." give the genomic proportion of each repeat annotation, calculated as the proportion of reads with the same annotation.

Refer to Cornet et al. (2023) in Molecular Ecology for more details on how the data were generated, the downstream analyses and the sample names (see Tables S1, S2 and S3).

References:
Novák, P., Neumann, P., & Macas, J. (2010). Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data. BMC Bioinformatics, 11(1), 378. https://doi.org/10.1186/1471-2105-11-378
Novák, P., Neumann, P., & Macas, J. (2020). Global analysis of repetitive DNA from unassembled sequence reads using RepeatExplorer2. Nature Protocols, 15(11), Article 11. https://doi.org/10.1038/s41596-020-0400-y
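
The genomic proportion described for the "Genome_proportion..." files can be illustrated with a minimal sketch. It is not the authors' script: the function name, the toy annotations and read counts, and the choice of the total number of analysed reads as the denominator are all assumptions made for illustration. See Cornet et al. (2023) for the actual procedure.

```python
from collections import defaultdict


def genome_proportions(clusters, total_reads):
    """Sum read counts per annotation and divide by the total read count.

    clusters    -- iterable of (annotation, read_count) pairs, one per cluster
    total_reads -- number of reads analysed (assumed denominator)
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")

    reads_per_annotation = defaultdict(int)
    for annotation, read_count in clusters:
        reads_per_annotation[annotation] += read_count

    return {
        annotation: count / total_reads
        for annotation, count in reads_per_annotation.items()
    }


# Hypothetical toy values, not taken from the dataset.
toy_clusters = [
    ("LTR/Ty3_gypsy", 1200),
    ("Satellite", 300),
    ("LTR/Ty3_gypsy", 500),
    ("Unclassified", 1000),
]
toy_total_reads = 100_000

proportions = genome_proportions(toy_clusters, toy_total_reads)
for annotation, proportion in sorted(proportions.items()):
    print(f"{annotation}\t{proportion:.4f}")
```

Clusters that share an annotation are pooled before dividing, so each annotation gets a single proportion regardless of how many clusters it spans.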

Created:

2023-08-15