
Supplementary Table S9: Results for All Rice-human Structure Hit Pairs after Filtering (3D interaction + Amino Acid Gotoh-Smith-Waterman method)

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://figshare.com/articles/dataset/Supplementary_Table_S9_Results_for_All_Rice-human_Structure_Hit_Pairs_after_Filtering_3D_interaction_Amino_Acid_Gotoh-Smith-Waterman_method_/29889800
Resource description:
Table data of rice-human structure pair hits from Foldseek (3D interaction + Amino Acid Gotoh-Smith-Waterman method), used in Figure 3 and Supplementary Figure S5.

Structural alignment was run with Foldseek (3D interaction + Amino Acid Gotoh-Smith-Waterman method) using rice structure prediction data as the query, and the hits were filtered according to the filtering conditions listed below (for details, see the Materials and Methods section). This resulted in 2,945 / 14,997 pairs in the rice upregulated gene group and 3,708 / 13,467 pairs in the rice downregulated gene group.

In addition to this information, the following details are integrated:
“rice genes included in the enrichment analysis results (column name: enrichment_result),”
“whether the same hits were obtained using the foldseek TM-method (column name: foldseek_tm_method),”
“human HN-scores obtained from previous studies (column name: human HN-score),” and
“whether the genes were classified into the upregulated gene group (500 genes) or the downregulated gene group (500 genes) in previous human studies (column name: is_in_human_top500, is_in_human_bottom500).”

For convenience, we additionally provide filtered subsets listing only the unique LS-HS hit pairs whose human counterparts belong to the previously defined human upregulated (top 500 by HN-score) or downregulated (bottom 500 by HN-score) gene groups. These overlap-only lists are provided as Table S9-7 (from Table S9-3) and Table S9-8 (from Table S9-6) and can be identified using the columns “is_in_human_top500” and “is_in_human_bottom500”.

The human HN-score column draws on the following dataset:
Table S5: HN-score Data for All Human Genes (https://doi.org/10.6084/m9.figshare.23944935)

【Filtering Conditions】
(1) Alignment coverage is greater than 50% for both the rice structure and the human structure.
(2) If structures with different UniProt accession numbers from the same rice gene align to the same human protein structure, the hit with the highest alignment coverage for the rice structure is selected. If the hits are tied, the Local Distance Difference Test (lDDT) score is used to choose between them.
(3) Human protein structure information was matched to the current stable Ensembl gene ID and HUGO Gene Nomenclature Committee (HGNC) gene symbol using TogoID, an identifier (ID) conversion tool (accessed on 9 August 2025), with UniProt accessions as keys. Only hits for which this mapping succeeded were selected.

Table S9-1: foldseek_result_rice_up_domain_panhomology_info_filtercondtion3_250831_add_human_HNscore
All structure pair data (2,945 pairs) after filtering, for the structure alignment that used protein structure prediction data corresponding to rice upregulated genes as the query.

Table S9-2: (HS-HS) foldseek_result_rice_up_domain_panhomology_info_filtercondtion3_filter1_250831_add_human_HNscore
Structure pairs in Table S9-1 that meet the following filtering conditions. High-sequence similarity / High-structural similarity (HS-HS): global sequence alignment similarity > the 75th percentile (Q3) and average lDDT ≥ the median (Q2).

Table S9-3: (unique LS-HS) foldseek_result_rice_up_domain_panhomology_info_filtercondtion3_filter4_250831_add_human_HNscore
Structure pairs in Table S9-1 that meet the following filtering conditions. Low-sequence similarity / High-structural similarity (LS-HS): global sequence alignment similarity ≤ Q3 and average lDDT ≥ Q2. Additionally, the rice genes identified in Table S9-2 were excluded. Whether a hit pair is included in the enrichment analysis results of this study can be confirmed in the “enrichment_result” column (see Supplementary Table S6). Whether a hit pair was also observed using the Foldseek-TM method can be confirmed in the “foldseek_tm_method” column (see Supplementary Table S8).
Table S9-4: foldseek_result_rice_down_domain_panhomology_info_filtercondtion3_250831_add_human_HNscore
All structure pair data (3,708 pairs) after filtering, for the structure alignment that used protein structure prediction data corresponding to rice downregulated genes as the query.

Table S9-5: (HS-HS) foldseek_result_rice_down_domain_panhomology_info_filtercondtion3_filter1_250831_add_human_HNscore
Structure pairs in Table S9-4 that meet the following filtering conditions. High-sequence similarity / High-structural similarity (HS-HS): global sequence alignment similarity > Q3 and average lDDT ≥ Q2.

Table S9-6: (unique LS-HS) foldseek_result_rice_down_domain_panhomology_info_filtercondtion3_filter4_250831_add_human_HNscore
Structure pairs in Table S9-4 that meet the following filtering conditions. Low-sequence similarity / High-structural similarity (LS-HS): global sequence alignment similarity ≤ Q3 and average lDDT ≥ Q2. Additionally, the rice genes identified in Table S9-5 were excluded. Whether a hit pair is included in the enrichment analysis results of this study can be confirmed in the “enrichment_result” column (see Supplementary Table S6). Whether a hit pair was also observed using the Foldseek-TM method can be confirmed in the “foldseek_tm_method” column (see Supplementary Table S8).

Table S9-7: (unique LS-HS overlap with human top/bottom 500 genes) foldseek_result_rice_up_domain_panhomology_info_filtercondtion3_filter4_250831_add_human_HNscore_top500.tsv
Filtered subset of Table S9-3 that includes only hit pairs with "is_in_human_top500" = TRUE and "is_in_human_bottom500" = FALSE.

Table S9-8: (unique LS-HS overlap with human top/bottom 500 genes) foldseek_result_rice_down_domain_panhomology_info_filtercondtion3_filter4_250831_add_human_HNscore_bottom500.tsv
Filtered subset of Table S9-6 that includes only hit pairs with "is_in_human_top500" = FALSE and "is_in_human_bottom500" = TRUE.
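The HS-HS and unique LS-HS definitions above can be illustrated with a small sketch. This is not the authors' code. It assumes that Q3 and Q2 are computed over all filtered pairs of one gene group and that the inclusive quartile definition is used; the description states neither. The keys rice_gene, human_protein, seq_similarity and avg_lddt are hypothetical stand-ins for the real column names, and the five records are made-up toy values.

```python
from statistics import median, quantiles


def classify_pairs(pairs):
    """Split hit pairs into HS-HS and unique LS-HS lists.

    HS-HS:        sequence similarity >  Q3 and average lDDT >= Q2
    unique LS-HS: sequence similarity <= Q3 and average lDDT >= Q2,
                  excluding rice genes that already have an HS-HS hit.
    """
    # Assumption: thresholds come from the whole filtered table.
    q3 = quantiles([p["seq_similarity"] for p in pairs], n=4, method="inclusive")[2]
    q2 = median(p["avg_lddt"] for p in pairs)

    hs_hs = [p for p in pairs
             if p["seq_similarity"] > q3 and p["avg_lddt"] >= q2]
    hs_genes = {p["rice_gene"] for p in hs_hs}

    unique_ls_hs = [p for p in pairs
                    if p["seq_similarity"] <= q3
                    and p["avg_lddt"] >= q2
                    and p["rice_gene"] not in hs_genes]
    return hs_hs, unique_ls_hs


# Toy records (hypothetical values, not taken from Table S9).
pairs = [
    {"rice_gene": "g1", "human_protein": "h1", "seq_similarity": 0.9, "avg_lddt": 0.80},
    {"rice_gene": "g1", "human_protein": "h2", "seq_similarity": 0.3, "avg_lddt": 0.90},
    {"rice_gene": "g2", "human_protein": "h3", "seq_similarity": 0.2, "avg_lddt": 0.70},
    {"rice_gene": "g3", "human_protein": "h4", "seq_similarity": 0.4, "avg_lddt": 0.50},
    {"rice_gene": "g4", "human_protein": "h5", "seq_similarity": 0.5, "avg_lddt": 0.85},
]

hs_hs, unique_ls_hs = classify_pairs(pairs)
print([(p["rice_gene"], p["human_protein"]) for p in hs_hs])
print([(p["rice_gene"], p["human_protein"]) for p in unique_ls_hs])
```

In this toy data the pair g1/h2 satisfies the LS-HS thresholds but is dropped, because rice gene g1 already has an HS-HS hit. This mirrors the exclusion rule stated for Tables S9-3 and S9-6.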
Created:
2025-08-31