Whole proteome-level GPS predictions (part 1)
Source: DataCite Commons. Updated 2020-11-12; indexed 2024-07-28.
Download link:
https://figshare.com/articles/dataset/Whole_proteome-level_GPS_predictions_part_1_/13221719
Description:
This is the expanded set of all GPS predictions, run on the entire reference proteome, including sites not known to be phosphorylated. The dataset is used to perform a fast update when new phosphosites are discovered. The uncompressed folder yields a large CSV file with predictions in list format (i.e. one line per kinase-substrate prediction).
Columns, in this order:
substrate_id - unique substrate ID (accession_site)
substrate_acc - UniProt accession of the substrate protein
substrate_name - name of the protein
site - amino acid type and position (e.g. S5 means serine at position 5)
pep - 15-amino-acid sequence centered on the site of phosphorylation
score - prediction algorithm score
Kinase Name - name of the kinase in our controlled ontology (found in this project)
Each entry identifies a protein site (with its surrounding peptide) and gives the predicted score for the given kinase.
To obtain the full raw GPS output, you must combine this archive with the second zip part (GPS_split.z01). Be sure to download both parts into the same directory before unzipping. The final uncompressed file is 24 GB.
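Because the uncompressed CSV is about 24 GB, it is usually more practical to stream it row by row than to load it into memory. The following is a minimal sketch of that approach using only the Python standard library. It assumes the file is comma-delimited and starts with a header row (the description does not confirm either), and it maps fields by the column order listed above. The names `iter_predictions`, `parse_site`, `Prediction`, and `min_score`, as well as the sample rows, accessions, and kinase names, are hypothetical and exist only for illustration.

```python
import csv
import io
import re
from typing import Iterable, Iterator, NamedTuple, Tuple

# Column order as listed in the dataset description.
COLUMNS = ["substrate_id", "substrate_acc", "substrate_name",
           "site", "pep", "score", "Kinase Name"]

# A site is one residue letter followed by a position, e.g. "S5".
SITE_RE = re.compile(r"^([A-Z])(\d+)$")


class Prediction(NamedTuple):
    substrate_acc: str
    residue: str
    position: int
    pep: str
    score: float
    kinase: str


def parse_site(site: str) -> Tuple[str, int]:
    """Split a site such as 'S5' into ('S', 5)."""
    match = SITE_RE.match(site.strip())
    if match is None:
        raise ValueError(f"unrecognized site: {site!r}")
    return match.group(1), int(match.group(2))


def iter_predictions(lines: Iterable[str], kinase: str, min_score: float,
                     has_header: bool = True) -> Iterator[Prediction]:
    """Yield rows for one kinase with score >= min_score, one row at a time."""
    reader = csv.reader(lines)
    if has_header:  # assumption: the first row holds column names
        next(reader, None)
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip malformed rows rather than guess their layout
        rec = dict(zip(COLUMNS, row))
        if rec["Kinase Name"] != kinase:
            continue
        score = float(rec["score"])
        if score < min_score:
            continue
        residue, position = parse_site(rec["site"])
        yield Prediction(rec["substrate_acc"], residue, position,
                         rec["pep"], score, rec["Kinase Name"])


# Made-up rows in the documented column order (not real predictions).
SAMPLE = """substrate_id,substrate_acc,substrate_name,site,pep,score,Kinase Name
X00001_S20,X00001,Example protein,S20,AAAAAAASAAAAAAA,12.5,KINASE_A
X00001_S20,X00001,Example protein,S20,AAAAAAASAAAAAAA,3.0,KINASE_B
X00002_T8,X00002,Other protein,T8,GGGGGGGTGGGGGGG,1.5,KINASE_A
"""

hits = list(iter_predictions(io.StringIO(SAMPLE), "KINASE_A", min_score=5.0))
```

For the real file, the same generator can be fed an open file handle, for example `open(path, newline="")`, where `path` points at the extracted CSV. The `min_score` cutoff is illustrative only; the description does not state the score scale or a recommended threshold.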
Provider:
figshare
Created:
2020-11-12



