Updated gene and transposable element predictions for P. nodorum isolates SN15, SN4, SN79, and SN2000
Mendeley Data. Updated 2024-06-27; indexed 2024-06-27.
Download link:
https://figshare.com/articles/dataset/Updated_gene_and_transposable_element_predictions_for_P_nodorum_isolates_SN15_SN4_SN79_and_SN2000/13340975/1
Description:
Gene, repeat, and transposable element predictions for the Parastagonospora nodorum isolates SN15, SN4, SN2000, and SN79. Because this study did not sequence these isolates and does not control the NCBI entries for them, we have opted to make the predictions publicly available here rather than through the usual NCBI/EBI route.

Transposable elements and repeats were predicted using the [PanTE](https://github.com/darcyabjones/pante) pipeline.

Protein-coding genes were predicted from softmasked genomes using the [panann pipeline](https://github.com/darcyabjones/panann). Protein-coding genes overlapping rRNA genes by more than 50% of their length were excluded. Protein-coding genes with exons overlapping genome gaps (stretches of N >= 100 bp) were split into fragments, which are annotated in the GFF with the attribute `fragmented=true`.

Note that some protein-coding genes looked dubious (for example, many short exons). We attempted to mark these, based on the support they have, in the GFF with the attribute `low_confidence_prediction=true`. This attribute can be removed manually if the gene looks fine to you.

The rRNA genes were predicted using RNAmmer v1.2, with some predictions coming from RepeatMasker. The tRNA genes were predicted using tRNAscan-SE v2.0.3.

The SN15 annotations are an updated version of the ones published in Bertazzoni et al. (Submitted). Newly predicted genes were added as a "C" set (in addition to the existing A and B sets) if they did not overlap an existing annotation on the same strand by more than 20% of the length of the existing annotation. All protein-coding gene annotations have an attribute `confidence_set=`, which indicates whether the gene belongs to the A or B (previous) or C (updated) gene predictions.
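The attributes described above (`fragmented`, `low_confidence_prediction`, and `confidence_set`) can be tallied or used as filters with a few lines of Python. The following is a minimal sketch, not part of the dataset: it assumes the files are standard tab-separated GFF3 with `key=value` pairs separated by `;` in the ninth column. The function names, the sequence name `chr1`, the source label `panann`, and the gene IDs in `EXAMPLE` are hypothetical and only illustrate the expected layout. Which feature type (gene or mRNA) carries each attribute is not stated in the description, so the sketch inspects every record.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_attributes(column9: str) -> Dict[str, str]:
    """Parse a GFF3 attribute column (key=value pairs separated by ';')."""
    attrs: Dict[str, str] = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def iter_records(lines: Iterable[str]) -> Iterator[Tuple[List[str], Dict[str, str]]]:
    """Yield (columns, attributes) for each 9-column GFF3 feature line."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip anything that is not a feature line
        yield cols, parse_attributes(cols[8])


def summarise(lines: Iterable[str]) -> Counter:
    """Count records per confidence set and per flag attribute."""
    counts: Counter = Counter()
    for _cols, attrs in iter_records(lines):
        if "confidence_set" in attrs:
            counts["set_" + attrs["confidence_set"]] += 1
        if attrs.get("low_confidence_prediction") == "true":
            counts["low_confidence"] += 1
        if attrs.get("fragmented") == "true":
            counts["fragmented"] += 1
    return counts


def unflagged_ids(lines: Iterable[str]) -> List[str]:
    """Return IDs of records that are neither low-confidence nor fragmented."""
    keep: List[str] = []
    for _cols, attrs in iter_records(lines):
        flagged = (
            attrs.get("low_confidence_prediction") == "true"
            or attrs.get("fragmented") == "true"
        )
        if "ID" in attrs and not flagged:
            keep.append(attrs["ID"])
    return keep


# Hypothetical records, for illustration only (not taken from the dataset).
EXAMPLE = [
    "##gff-version 3\n",
    "chr1\tpanann\tgene\t1\t900\t.\t+\t.\tID=g1;confidence_set=A\n",
    "chr1\tpanann\tgene\t1200\t1500\t.\t-\t.\tID=g2;confidence_set=C;low_confidence_prediction=true\n",
    "chr1\tpanann\tgene\t2000\t2400\t.\t+\t.\tID=g3;confidence_set=C;fragmented=true\n",
]

if __name__ == "__main__":
    print(dict(summarise(EXAMPLE)))
    print(unflagged_ids(EXAMPLE))
```

To apply this to a downloaded file, pass an open file handle in place of `EXAMPLE`, for instance `summarise(open(path))`, where `path` points at one of the GFF files from the download link. Note that this sketch filters records individually; dropping a flagged gene together with its child features would additionally require following the `Parent` attribute.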
Created:
2023-06-28



