Transposable element annotation of Penicillium roqueforti LCP06136 and LCP06133

Recherche Data Gouv France. Updated 2025-01-01; indexed 2026-04-09.
Download link:
https://entrepot.recherche.data.gouv.fr/citation?persistentId=doi:10.57745/SIP7CH
Description:
Transposable elements (TEs) were annotated on genome assemblies of Penicillium roqueforti LCP06136 (BioSample SAMN46423906) and LCP06133 (accession number GCA_030518555.1) using the REPET package (Amselem et al. 2015). Briefly, the TEdenovo pipeline (Flutre et al. 2011) was used to detect repeated elements in the genome and to provide a consensus sequence for each family. Consensus sequences were then classified using the PASTEC tool (Hoede et al. 2014), based on the Wicker hierarchical TE classification system (Wicker et al. 2007). After manual correction, the resulting library of consensus sequences was used to annotate TE copies in the whole genome with the TEannot pipeline (Quesneville et al. 2005). The TE annotation identified 15 and 23 subfamily elements according to the Wicker classification, covering 3.1% and 4.7% of the genomes of LCP06136 and LCP06133, respectively. For comparative genomics studies, the 38 subfamily elements were pooled as "BULK" and used to annotate the genome assemblies of Penicillium roqueforti LCP06037, LCP06039, LCP06043 and LCP06059 (BioSamples SAMEA103939751, SAMEA103939752, SAMEA103939754 and SAMEA103939755, respectively), as well as those of LCP06136 and LCP06133.
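
As a rough illustration of how a genome coverage figure such as the 3.1% or 4.7% quoted above could be derived from a set of annotated TE copies, the sketch below merges overlapping copy intervals per contig and divides the covered bases by the genome length. This is not the REPET implementation: the function names, the toy coordinates, and the toy genome length are all hypothetical and do not come from the dataset.

```python
from collections import defaultdict


def merged_length(intervals):
    """Bases covered by 1-based inclusive (start, end) intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # A gap separates this interval from the current run: close the run.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def te_coverage_percent(copies, genome_length):
    """Percentage of the genome covered by TE copies given as (contig, start, end)."""
    by_contig = defaultdict(list)
    for contig, start, end in copies:
        # Normalise coordinates so that start <= end regardless of strand.
        by_contig[contig].append((min(start, end), max(start, end)))
    covered = sum(merged_length(ivs) for ivs in by_contig.values())
    return 100.0 * covered / genome_length


# Hypothetical toy annotation, not real data from LCP06136 or LCP06133.
toy_copies = [
    ("contig1", 1, 100),
    ("contig1", 51, 150),   # overlaps the previous copy
    ("contig2", 200, 101),  # reverse-strand style coordinates
]
toy_genome_length = 10_000  # assumed total assembly size in bp

coverage = te_coverage_percent(toy_copies, toy_genome_length)
print(f"TE coverage: {coverage:.1f}%")
```

Merging before summing matters because nested or overlapping copies would otherwise be counted twice and inflate the coverage estimate.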

Created:
2025-01-01