
Datasets of sequences, alignments and structural models generated for the structural prediction of complexes mediated by intrinsically disordered regions.

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7838023
Description:
This repository contains the input and output files used and generated when scanning intrinsically disordered regions and predicting their binding sites on receptor proteins with the SCAN_IDR pipeline and AlphaFold2-Multimer. It contains two archives:

    scanidr_data_repository_corr6J08.tar, dedicated to the analysis of a dataset of 42 protein complexes non-redundant with the dataset used for AlphaFold2 training
    923_elm_cases_repository.tar.gz, dedicated to the analysis of 923 complexes from the ELM database

These data can be used to rerun specific sections of the pipeline and scripts provided at https://github.com/i2bc/SCAN_IDR

Dataset of 42 non-redundant complexes

The first archive, scanidr_data_repository_corr6J08.tar, contains 3 compressed directories and a README file detailing their contents:

    the initial raw sequence and alignment data for every chain -> DIRECTORY fasta_msa/
    the input and output data of every AlphaFold run for every complex -> DIRECTORY af2_runs/
    the native reference structures -> DIRECTORY ref_capri_curated/

Each protein-peptide complex case has been assigned a distinct index number, from 1 to 42, that is consistent across the directories of the archive. Their corresponding directories are labelled as _. The models in this archive were generated using AlphaFold2-Multimer v2.2.

Dataset of 923 complexes selected from the ELM database

The second archive, 923_elm_cases_repository.tar.gz, contains the input and output files used and generated for the analysis of 923 Eukaryotic Linear Motif (ELM) database entries. Each ELM entry is indexed with a specific integer id and is composed of a receptor protein and a ligand protein.

The archive contains a table associating ELM indexes with the ELM entry information, 5 directories and a README file detailing their contents:

    the table describing ELM entries -> FILE Table_923ELM_uid_delimitations_info_for_archive.txt
    the initial raw sequence and multiple sequence alignment (MSA) data for every chain -> DIRECTORY fasta_msa/
    the concatenated MSA model for every ELM complex and protocol used -> DIRECTORY af2_elm_coali_inputs/
    the best model of every AF2 protocol for every complex according to the AF2 -> DIRECTORY af2_elm_models/
    the best model, cut in the ligand part to select only the ELM motifs, as used for the evaluation of the models -> DIRECTORY elm_cut_models/
    the reference structures used for the evaluation of the models -> DIRECTORY ref_capri_curated/

The models in this archive were generated using AlphaFold2-Multimer v2.3.
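After downloading an archive, it can be useful to confirm that the directories named in the description are present before rerunning any part of the pipeline. The following is a minimal sketch using only Python's standard `tarfile` module. The helper names (`component_stems`, `missing_entries`) and the local archive path are hypothetical, and the internal layout is an assumption: the description does not say whether the directories sit under a root folder or are stored as nested compressed files, so the check matches a name anywhere in a member path and ignores file extensions.

```python
import tarfile
from pathlib import PurePosixPath

# Directory names taken from the description of each archive.
EXPECTED_42 = {"fasta_msa", "af2_runs", "ref_capri_curated"}
EXPECTED_ELM = {
    "fasta_msa",
    "af2_elm_coali_inputs",
    "af2_elm_models",
    "elm_cut_models",
    "ref_capri_curated",
}


def component_stems(archive_path):
    """Collect every path component in the archive, with extensions removed.

    Assumption: a listed directory may appear either as a plain directory
    (fasta_msa/) or as a nested compressed file (fasta_msa.tar.gz), at any depth.
    """
    stems = set()
    # "r:*" lets tarfile detect the compression (.tar, .tar.gz, ...) itself.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            for part in PurePosixPath(member.name).parts:
                stems.add(part.split(".", 1)[0])
    return stems


def missing_entries(archive_path, expected):
    """Return the sorted list of expected names not found in the archive."""
    return sorted(set(expected) - component_stems(archive_path))
```

Assuming the first archive has been saved in the working directory, `missing_entries("scanidr_data_repository_corr6J08.tar", EXPECTED_42)` returns an empty list when all three directories are found. The README shipped inside each archive remains the authoritative description of the layout.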
Created:
2023-11-07