BED files containing Transposable Element (TE) annotations for 33 Drosophila melanogaster genomes
Source: DataCite Commons. Updated: 2024-10-09. Indexed: 2025-04-09.
Download link:
https://digital.csic.es/handle/10261/242513
Resource description:
We annotated TE copies only in the euchromatic regions of the genome, since heterochromatic regions are gene-poor (Smith et al. 2007) and their assembly and annotation usually require specific methods and extensive curation (Chakraborty et al. 2019; Khost et al. 2017).

In this work, we considered as euchromatic those genomic regions determined by the recombination rate calculator (RRC) (Fiston-Lavier et al. 2010), available at http://petrov.stanford.edu/cgi-bin/recombination-rates_updateR5.pl. These coordinates were originally calculated on release 5 of the D. melanogaster genome, so we converted them to release 6 coordinates using the coord_converter.pl script from FlyBase (Gramates et al. 2017), resulting in the following regions: 2L:530,000..18,870,000; 2R:5,982,495..24,972,477; 3L:750,000..19,026,900; 3R:6,754,278..31,614,278; X:1,325,967..21,338,973.

To determine the coordinates of the euchromatic regions in each scaffolded genome, we mapped the scaffolds to the euchromatic region of the ISO1 genome using MUMmer (v3.0) (Kurtz et al. 2004). We then obtained the coordinates in the scaffolded genomes by parsing MUMmer's output and extracting the coordinates that map to the boundaries of the euchromatic region of the ISO1 genome.

After running the TEannot pipeline over the euchromatic regions of each genome, we performed a post-annotation filtering step consisting of the removal of TE copies <100 bp.

NOTE: Coordinates in the BED files refer to the strain's genome, not the reference.

Columns in the BED files:
Columns 1-3: Chr, Start, End (coordinates of the TE annotation)
Column 4: TE Name
Column 5: TE Size
Column 6: Strand
Column 7: TE Family
Column 8: TE Superfamily
Column 9: TE Order
Column 10: TE Class
Column 11: Source of the TE family classification
Column 12: TE Length Ratio (length of the TE copy / length of the TE consensus)
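The 12-column layout described above can be read with a few lines of Python. The following is a minimal sketch, not the authors' code: it assumes the files are tab-delimited, that columns 2, 3 and 5 are integers, and that column 12 is a decimal number. The names `TERecord`, `parse_te_bed` and `filter_min_size` are hypothetical helpers introduced here for illustration. The filter mirrors the post-annotation step of removing TE copies <100 bp, using the TE Size column.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class TERecord:
    """One row of a TE annotation BED file (hypothetical container type)."""
    chrom: str                  # column 1: Chr (strain genome, not the reference)
    start: int                  # column 2: Start
    end: int                    # column 3: End
    name: str                   # column 4: TE Name
    size: int                   # column 5: TE Size
    strand: str                 # column 6: Strand
    family: str                 # column 7: TE Family
    superfamily: str            # column 8: TE Superfamily
    order: str                  # column 9: TE Order
    te_class: str               # column 10: TE Class
    classification_source: str  # column 11: source of the family classification
    length_ratio: float         # column 12: TE copy length / TE consensus length


def parse_te_bed(lines: Iterable[str]) -> Iterator[TERecord]:
    """Yield one TERecord per data line.

    Assumption: fields are separated by tabs. Blank lines and lines
    starting with '#', 'track' or 'browser' are skipped.
    """
    for line_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) != 12:
            raise ValueError(
                f"line {line_number}: expected 12 columns, found {len(fields)}"
            )
        yield TERecord(
            chrom=fields[0],
            start=int(fields[1]),
            end=int(fields[2]),
            name=fields[3],
            size=int(fields[4]),
            strand=fields[5],
            family=fields[6],
            superfamily=fields[7],
            order=fields[8],
            te_class=fields[9],
            classification_source=fields[10],
            length_ratio=float(fields[11]),
        )


def filter_min_size(records: Iterable[TERecord], min_size: int = 100) -> List[TERecord]:
    """Keep TE copies whose TE Size is at least min_size bp (drops copies <100 bp by default)."""
    return [record for record in records if record.size >= min_size]
```

To use it on a downloaded file, open the file in text mode and pass the handle directly, for example `records = filter_min_size(parse_te_bed(handle))`. Because the published files were already filtered, the size filter should normally leave them unchanged; it is included mainly as a sanity check.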
Provider:
DIGITAL.CSIC
Created:
2021-06-04



