Data from: A Community Resource for Exploring and Utilizing Genetic Diversity in the USDA Pea Single Plant Plus Collection

Source: NIAID Data Ecosystem, indexed 2026-05-10
Download link:
https://figshare.com/articles/dataset/Data_from_A_Community_Resource_for_Exploring_and_Utilizing_Genetic_Diversity_in_the_USDA_Pea_Single_Plant_Plus_Collection/24852186
Description:
Included in this dataset are SNP and FASTA data for the Pea Single Plant Plus Collection (PSPPC) and the PSPPC augmented with 25 P. fulvum accessions. The six datasets fall into two groups. Group 1 consists of three datasets labeled PSPPC, which contain SNP data for the USDA Pea Single Plant Plus Collection. Group 2 consists of three datasets labeled PSPPC + P. fulvum, which contain SNP data for the USDA PSPPC with 25 accessions of Pisum fulvum added. SNPs for each of these groups were called independently; therefore SNP names that are shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus.

For analysis, SNP data are available in two widely used formats: hapmap and VCF. These formats can be successfully loaded into TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel). Explanations of fields (columns) in the VCF files are contained within commented (##) rows at the top of the file. Descriptions of the first 11 columns in the hapmap file are as follows:

rs#- Name of locus (i.e. SNP name)
alleles- Indicates the SNPs for each allele at the locus
chrom- Irrelevant for these datasets, since markers are unordered
pos- Irrelevant for these datasets, since markers are unordered
strand- Irrelevant for these datasets, since markers are unordered
assembly#- Required field for hapmap format. NA for these datasets
center- Required field for hapmap format. NA for these datasets
protLSID- Required field for hapmap format. NA for these datasets
assayLSID- Required field for hapmap format. NA for these datasets
panel- Required field for hapmap format. NA for these datasets
QCcode- Required field for hapmap format. NA for these datasets

The FASTA sequences containing the SNPs are also available for downstream applications such as development of primers for platform-specific markers. For more information about this dataset, contact Clarice Coyne at Clarice.Coyne@usda.gov or coynec@wsu.edu.
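The hapmap layout described above (11 fixed columns followed by one column per accession) can be read with a few lines of Python. This is a minimal sketch, assuming the files are tab-delimited with a single header row; the function name `read_hapmap`, the accession names and the genotype values in the in-memory example are invented for illustration and are not taken from the dataset.

```python
import csv
import io

# Number of fixed metadata columns in a hapmap file:
# rs#, alleles, chrom, pos, strand, assembly#, center,
# protLSID, assayLSID, panel, QCcode
N_META = 11


def read_hapmap(handle):
    """Yield (snp_name, alleles, genotypes_by_sample) for each marker row.

    Assumes a tab-delimited file whose first row is the header.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    if len(header) <= N_META:
        raise ValueError("expected 11 metadata columns followed by sample columns")
    samples = header[N_META:]
    for row in reader:
        if not row:
            continue  # skip blank lines
        snp_name, alleles = row[0], row[1]
        genotypes = dict(zip(samples, row[N_META:]))
        yield snp_name, alleles, genotypes


# Made-up two-marker example; real files would be opened with open(path).
SAMPLE = (
    "rs#\talleles\tchrom\tpos\tstrand\tassembly#\tcenter\tprotLSID\t"
    "assayLSID\tpanel\tQCcode\tacc_A\tacc_B\n"
    "snp_1\tA/G\t0\t0\t+\tNA\tNA\tNA\tNA\tNA\tNA\tA\tG\n"
    "snp_2\tC/T\t0\t0\t+\tNA\tNA\tNA\tNA\tNA\tNA\tC\tN\n"
)

markers = list(read_hapmap(io.StringIO(SAMPLE)))
for snp_name, alleles, genotypes in markers:
    print(snp_name, alleles, genotypes)
```

Only the first two columns are used here, since the description states that chrom, pos and strand are irrelevant for these unordered markers and the remaining fixed columns are NA.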
Resources in this dataset:

Resource Title: PSPPC SNPs in hapmap format.
File Name: PSPPC.hmp.txt
Resource Description: 66591 unanchored SNPs for the PSPPC collection in hapmap format
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC SNP FASTA Sequences.
File Name: PSPPC.fa.txt
Resource Description: FASTA sequences for each allele of the PSPPC SNP dataset

Resource Title: PSPPC + P. fulvum SNPs in hapmap format.
File Name: PSPPC+fulvums.hmp.txt
Resource Description: 67400 SNPs from the PSPPC augmented with 25 P. fulvum accessions in hapmap format. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC + P. fulvum SNP FASTA Sequences.
File Name: PSPPC+fulvums.fa.txt
Resource Description: FASTA sequences for each allele of the PSPPC + P. fulvum SNP dataset. SNP names are independent and unrelated to plain PSPPC SNP files.

Resource Title: PSPPC + P. fulvum SNPs in vcf format.
File Name: PSPPC+fulvums.vcf.txt
Resource Description: 67400 SNPs from the PSPPC augmented with 25 P. fulvum accessions in vcf format. SNP names are independent and unrelated to plain PSPPC SNP files.
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: PSPPC SNPs in vcf format.
File Name: PSPPC.vcf.txt
Resource Description: 66591 SNPs from the PSPPC in vcf format
Resource Software Recommended: TASSEL, url: http://www.maizegenetics.net/tassel

Resource Title: README.
File Name: Data Dictionary.docx
Resource Description: These data are for the Pea Single Plant Plus Collection (PSPPC) and the PSPPC augmented with 25 P. fulvum accessions. The 6 datasets can be divided into two groups. Group 1 consists of 3 datasets labeled “PSPPC”, which contain SNP data for the USDA Pea Single Plant Plus Collection. Group 2 consists of 3 datasets labeled “PSPPC + P. fulvum”, which contain SNP data for the PSPPC with 25 accessions of Pisum fulvum added. SNPs for each of these groups were called independently; therefore any SNP name that is shared between the PSPPC and PSPPC + P. fulvum groups should NOT be assumed to refer to the same locus. For analysis, SNP data are available in two widely used formats: hapmap and vcf. These files were successfully loaded into the standalone version of TASSEL v. 5.2.25 (http://www.maizegenetics.net/tassel). Explanations of fields (columns) in the VCF files are contained within commented (##) rows at the top of the file. The first 11 columns required for the hapmap format are as follows:

rs#- Name of locus (i.e. SNP name)
alleles- Indicates the SNPs for each allele at the locus
chrom- N/A, since markers are unordered
pos- N/A, since markers are unordered
strand- N/A, since markers are unordered
assembly#- N/A
center- N/A
protLSID- N/A
assayLSID- N/A
panel- N/A
QCcode- N/A

The FASTA sequences containing the SNPs are also available here for downstream applications such as development of primers for platform-specific markers.
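Because SNPs were called independently for the two groups, a name that appears in both does not identify the same locus. One way to guard against accidental joins is to prefix every SNP name with the dataset it came from before combining results. The sketch below assumes that approach; the helper `qualify`, the dataset labels and the SNP names are hypothetical and only illustrate the idea.

```python
def qualify(dataset_label, snp_names):
    """Return SNP names prefixed with their dataset label.

    The prefix keeps identically named SNPs from different call sets apart.
    """
    return {f"{dataset_label}:{name}" for name in snp_names}


# Invented SNP names standing in for the rs# column of each hapmap file.
pspcc_names = {"snp_1", "snp_2"}
fulvum_names = {"snp_1", "snp_3"}

# The raw names overlap even though the loci are unrelated.
raw_overlap = pspcc_names & fulvum_names

# After qualification, no name is shared between the two groups.
qualified_overlap = qualify("PSPPC", pspcc_names) & qualify("PSPPC+fulvums", fulvum_names)

print(sorted(raw_overlap))
print(sorted(qualified_overlap))
```

With qualified names, a merge or set intersection across the two groups cannot silently treat unrelated markers as one locus.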
Created:
2017-03-17