Supporting files for research paper "A chromosome-scale reference genome of Lathyrus sativus"
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10671531
Description:
This is a supporting dataset for the research paper: "A chromosome-scale reference genome of Lathyrus sativus"
Abstract
Grass pea (Lathyrus sativus L.) is an underutilised but promising legume crop with tolerance to a wide range of abiotic and biotic stress factors, and potential for climate-resilient agriculture. Despite a long history and wide geographical distribution of cultivation, only limited breeding resources are available. This paper reports a 5.96 Gbp genome assembly of grass pea genotype LS007, of which 5.03 Gbp is scaffolded into 7 pseudo-chromosomes. The assembly has a BUSCO completeness score of 99.1% and is annotated with 31719 gene models and repeat elements. This represents the most contiguous and accurate assembly of the grass pea genome to date.
Other supporting data files related to this submission:
Raw Pacific Biosciences HiFi long reads of LS007 genomic DNA (FASTQ), available under project PRJEB70892
Scaffolded assembly of LS007 with annotations as an EMBL-format file, accession GCA_963859935, under project PRJEB70892
CENH3 ChIP-seq data, Illumina paired-end reads (FASTQ), under project PRJEB54858 (run accessions ERR12509730-ERR12509733)
Code Availability
Source code for the gene annotation is available on GitHub (https://github.com/gitbackspacer/grasspea_annotation).
Files in this data repository:
README_F.txt
Readme file with MD5sums of submitted files.
lathyrus.emapper.decorated.gff
Orthology-based functional annotation of protein-coding genes
lathyrus_interproscan.annotations.gff3
Annotation of functional domains of protein-coding genes
lathy.emapper.annotations.tab.xlsx.zip
lathy.emapper.annotations.tab.xlsx
Output from eggNOG-mapper for orthology-based functional annotation of LS007 Lathyrus sativus genes
lathyrus_interproscan.zip
lathyrus_interproscan.tab
Output from InterProScan for structure and domains of annotated LS007 Lathyrus sativus genes
repeat annotation_reversed chromosomes.zip
DANTE_LTR_full_output.gff3
Full-length and partial LTR-RT elements annotated by DANTE_LTR
DANTE_filtered_output.gff3
Protein-coding regions of transposable elements annotated by DANTE
RM_LIB_All_concatenated.fasta
Custom library of mobile elements, including LTR-RT, LINE, and Class II elements
Repeat_annotations_merged_31072023_sorted.gff3
Repeat annotation based on a custom library and RepeatMasker search, as well as DANTE protein domains. The GFF3 includes annotation of mobile elements and simple repeats.
all_scaf_run_all_default_annotation.gff3
Tandem repeats with a monomer size of 40-3000 nt annotated by TideCluster, with a minimum array size of 5 kb
all_scaf_run_all_default_tidehunter.gff3
All tandem repeats with a monomer size of 40-3000 nt identified by TideHunter
all_scaf_run_all_short_monom_annotation.gff3
Tandem repeats with a monomer size of 10-39 nt annotated by TideCluster, with a minimum array size of 5 kb
all_scaf_run_all_short_monom_tidehunter.gff3
All tandem repeats with a monomer size of 10-39 nt identified by TideHunter
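Most of the annotation files listed above are GFF3, which is a nine-column, tab-separated text format, so a quick per-feature-type summary can be produced with the Python standard library alone. The following is a minimal sketch and is not part of the published dataset or the authors' pipeline. The names parse_gff3 and summarise are hypothetical, and the feature types actually present in each file should be checked against the file itself rather than assumed.

```python
import sys
from collections import Counter
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


class Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive (GFF3 convention)
    end: int    # 1-based, inclusive
    attributes: Dict[str, str]


def parse_gff3(lines: Iterable[str]) -> Iterator[Feature]:
    """Yield one Feature per GFF3 record, skipping comments and directives."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # anything after this directive is sequence, not features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        # Column 9 holds semicolon-separated key=value pairs.
        attrs = dict(
            pair.split("=", 1) for pair in cols[8].split(";") if "=" in pair
        )
        yield Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]), attrs)


def summarise(features: Iterable[Feature]) -> Tuple[Counter, Counter]:
    """Return (record count per type, summed feature length in bp per type)."""
    counts: Counter = Counter()
    spans: Counter = Counter()
    for feat in features:
        counts[feat.type] += 1
        spans[feat.type] += feat.end - feat.start + 1
    return counts, spans


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python summarise_gff3.py all_scaf_run_all_default_annotation.gff3
    with open(sys.argv[1], encoding="utf-8") as handle:
        counts, spans = summarise(parse_gff3(handle))
    for feature_type, n in counts.most_common():
        print(f"{feature_type}\t{n}\t{spans[feature_type]}")
```

The summed lengths count overlapping features more than once, so they are an upper bound on the genomic span covered by each feature type rather than an exact figure.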
Created:
2024-02-16



