Data for "A learned score function improves the power of mass spectrometry database search"
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/records/10823120
Description:
These data files are associated with the following publication:
Varun Ananth, Justin Sanders, Melih Yilmaz, Sewoong Oh and William Stafford Noble. "A learned score function improves the power of mass spectrometry database search". Bioinformatics (Proceedings of the ISMB). 2024.
For the benchmarking data, we used a dataset that is publicly available on ProteomeXchange (PXD028735). The paper that introduced this dataset is:
Van Puyvelde, B., Daled, S., Willems, S., Gabriels, R., Gonzalez de Peredo, A., Chaoui, K., Mouton-Barbosa, E., Bouyssié, D., Boonen, K., Hughes, C. J., Gethings, L. A., Perez-Riverol, Y., Bloomfield, N., Tate, S., Schiltz, O., Martens, L., Deforce, D., & Dhaenens, M. (2022). A comprehensive LFQ benchmark dataset on modern day acquisition strategies in proteomics. In Scientific Data (Vol. 9, Issue 1). Springer Science and Business Media LLC. https://doi.org/10.1038/s41597-022-01216-6
More specifically, the following `.raw` files were downloaded:
LFQ_Orbitrap_DDA_Ecoli_01.raw
LFQ_Orbitrap_DDA_Human_01.raw
LFQ_Orbitrap_DDA_Yeast_01.raw
Those files can be accessed via FTP from the repository entry for PXD028735.
We upload here the annotated .mgf files created from these .raw files, as described in our paper.
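As a quick sanity check on the uploaded .mgf files, the sketch below collects the header fields of each spectrum. It relies only on the generic MGF layout (`BEGIN IONS` / `END IONS` blocks with `KEY=VALUE` header lines followed by peak lines). The function name `read_mgf_headers`, the demo spectrum, and the `SEQ` key for the peptide annotation are assumptions made for illustration; check the actual files for the key they use.

```python
import io

def read_mgf_headers(handle):
    """Yield one dict of KEY=VALUE header fields per BEGIN IONS/END IONS block."""
    headers = None
    for raw in handle:
        line = raw.strip()
        if line == "BEGIN IONS":
            headers = {}
        elif line == "END IONS":
            if headers is not None:
                yield headers
            headers = None
        elif headers is not None and "=" in line and not line[0].isdigit():
            # Header line; peak lines ("m/z intensity") are skipped.
            key, value = line.split("=", 1)
            headers[key] = value

# Made-up spectrum; the SEQ key is an assumed annotation field.
demo = """
BEGIN IONS
TITLE=demo_scan_1
PEPMASS=500.25
CHARGE=2+
SEQ=PEPTIDEK
100.1 10.0
200.2 20.0
END IONS
"""
spectra = list(read_mgf_headers(io.StringIO(demo)))
```

To run it on a real file, pass an open file handle instead of the in-memory string.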
The human, yeast, and E. coli .fasta files used in all database searches were downloaded from UniProt on 11/6/23, 4:30 PM.
Bateman, A., Martin, M.-J., Orchard, S., Magrane, M., Ahmad, S., Alpi, E., Bowler-Barnett, E. H., Britto, R., Bye-A-Jee, H., Cukura, A., Denny, P., Dogan, T., Ebenezer, T., Fan, J., Garmiri, P., da Costa Gonzales, L. J., Hatton-Ellis, E., Hussein, A., … Zhang, J. (2022). UniProt: the Universal Protein Knowledgebase in 2023. In Nucleic Acids Research (Vol. 51, Issue D1, pp. D523–D531). Oxford University Press (OUP). https://doi.org/10.1093/nar/gkac1052
We include these files here, with only minor modifications to replace `U` amino acids with `X` so that all amino acids fall into Casanovo-DB's vocabulary.
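The `U`-to-`X` substitution described above can be sketched as follows. This is a minimal illustration, not the script used to prepare the files: the function name `replace_residue` and the demo record are hypothetical. It assumes that only sequence lines should change and that header lines (starting with `>`) are copied as-is.

```python
import io

def replace_residue(fasta_in, fasta_out, old="U", new="X"):
    """Copy a FASTA stream, replacing `old` with `new` in sequence lines only.

    Returns the number of residues replaced.
    """
    n_replaced = 0
    for line in fasta_in:
        if line.startswith(">"):
            fasta_out.write(line)  # header lines are copied unchanged
        else:
            n_replaced += line.count(old)
            fasta_out.write(line.replace(old, new))
    return n_replaced

# Made-up record: the header contains "U" characters that must survive.
demo_in = io.StringIO(">sp|P00000|DEMO_HUMAN Unreal protein U\nMKTUAAU\nGGU\n")
demo_out = io.StringIO()
count = replace_residue(demo_in, demo_out)
result = demo_out.getvalue()
```

The same function works on real files by passing handles from `open(path)` and `open(out_path, "w")`. Skipping header lines matters because UniProt headers often contain the letter `U` (for example in `_HUMAN`).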
Created:
2024-03-16



