
Soybean 200 diversity panel proteome data

NIAID Data Ecosystem, indexed 2026-05-10
Download link:
https://figshare.com/articles/dataset/Soybean_200_diversity_panel_proteome_data/30997609
Description:
To investigate protein-level diversity and regulation, we quantified protein abundance in seedling shoot tissue at the V2 stage. To enable comprehensive protein identification across diverse accessions, we constructed a pan-protein reference database from the Wm82.a2.v1 reference genome and 37 additional high-quality soybean assemblies (files in protein_db). This strategy maximized coverage of amino acid sequence variation and of proteins absent from the reference line. Using this database and Astral-DIA (60spd) proteomics, we quantified 19,700 proteins across the entire panel. The pan-proteome approach substantially expanded protein discovery: 12,790 proteins came directly from the Wm82 v2.1 annotation, 6,051 showed high sequence similarity to the Wm82 v2.1 annotation, and 859 were absent from Wm82.

The custom protein database was constructed from a collection of 38 assemblies (folder: protein_db). Wm82.a2.v1 proteins were given the highest priority. We then added proteins from two T2T assemblies that differ from Wm82.a2.v1, followed by proteins from the remaining assemblies that differ from Wm82.a2.v1.

Soybean.200Samples.Protein.Abundance.ProID.Wm82Assigned.csv: the protein abundance matrix. The first column contains protein IDs from protein_db. The second and third columns give the corresponding protein and gene IDs in Wm82.a2.v1. For proteins that do not originate from the Wm82.a2.v1 reference, a BLASTP search (with a stringent similarity threshold) was used to assign a corresponding Wm82.a2.v1 ortholog ID where possible. Proteins with no significant match in Wm82.a2.v1 (i.e., novel or highly divergent sequences) are assigned "NA".
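The column layout described above can be read with a short script. The following is a minimal sketch, not the authors' code. It assumes the CSV has a single header row, that every column after the third holds per-sample abundances, and that unassigned proteins carry the literal string "NA" in the second column. The function name, column headers, and IDs in the demo data are hypothetical and are not taken from the dataset.

```python
import csv
import io


def summarize_assignments(handle):
    """Split protein IDs by whether they have a Wm82.a2.v1 assignment.

    Assumed layout (from the dataset description):
      column 1: protein ID from protein_db
      column 2: Wm82.a2.v1 protein ID, or "NA"
      column 3: Wm82.a2.v1 gene ID, or "NA"
      columns 4+: per-sample abundance values (assumption)
    """
    reader = csv.reader(handle)
    header = next(reader)          # assumed: exactly one header row
    sample_names = header[3:]
    assigned, unassigned = [], []
    for row in reader:
        if not row:                # skip blank lines
            continue
        protein_id = row[0]
        wm82_protein = row[1].strip()
        if wm82_protein == "NA":
            unassigned.append(protein_id)
        else:
            assigned.append(protein_id)
    return sample_names, assigned, unassigned


# Made-up demo data in the assumed layout; not real values from the dataset.
demo_csv = io.StringIO(
    "ProteinID,Wm82_Protein,Wm82_Gene,Sample1,Sample2\n"
    "protA,Glyma.01G000100.1,Glyma.01G000100,10.5,11.2\n"
    "protB,NA,NA,8.1,7.9\n"
    "protC,Glyma.02G000200.1,Glyma.02G000200,9.0,9.4\n"
)
samples, assigned, unassigned = summarize_assignments(demo_csv)
print(len(samples), len(assigned), len(unassigned))
```

To run it on the real file, pass an open file handle instead of the demo buffer, for example `open("Soybean.200Samples.Protein.Abundance.ProID.Wm82Assigned.csv", newline="")`. If the description's counts hold, the unassigned list should correspond to the 859 proteins absent from Wm82.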
Created:
2026-04-15