PlasmoFAB
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7433086
Description:
PlasmoFAB is a curated dataset containing amino acid sequences of proteins expressed by Plasmodium falciparum (Pf). Sequences are separated into antigen candidates and intracellular proteins. PlasmoFAB was created to provide a high-quality training set for machine learning models that will be used for Pf antigen exploration.
We provide PlasmoFAB in the form of two separate CSV files. One file, named PlasmoFAB_pos.csv, contains the positive set, i.e., all protein sequences that are antigen candidates. The other file, named PlasmoFAB_neg.csv, contains the negative set, i.e., all protein sequences that are intracellular proteins. Each sequence has a flag in the data field "test" that indicates whether or not the sequence was used in the test set of the machine learning experiments performed in the corresponding manuscript.
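The file layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset itself: it assumes both files have a header row and that the "test" field holds a truthy string such as "True" or "1" for test-set sequences. The function name load_plasmofab, the added "label" key, and the accepted truthy spellings are illustrative assumptions; check the actual column values against PlasmoFAB_datasheet.md before relying on them.

```python
import csv

# Assumption: spellings treated as "used in the test set". The real encoding
# of the "test" field is not stated in the description and may differ.
TRUTHY = {"1", "true", "t", "yes", "y"}


def load_plasmofab(pos_path, neg_path):
    """Read both CSV files and split rows by the "test" flag.

    Returns (train_rows, test_rows). Each row is a dict of the CSV columns
    plus a hypothetical "label" key: 1 for antigen candidates (positive set),
    0 for intracellular proteins (negative set).
    """
    train_rows, test_rows = [], []
    for path, label in ((pos_path, 1), (neg_path, 0)):
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["label"] = label
                flag = str(row.get("test", "")).strip().lower()
                if flag in TRUTHY:
                    test_rows.append(row)
                else:
                    train_rows.append(row)
    return train_rows, test_rows


# Example call, assuming both files sit in the working directory:
# train, test = load_plasmofab("PlasmoFAB_pos.csv", "PlasmoFAB_neg.csv")
```

Keeping the split defined by the "test" flag, rather than re-sampling, lets results be compared against the experiments in the corresponding manuscript.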
Additionally, the file PlasmoFAB_datasheet.md contains a datasheet for PlasmoFAB as introduced in Gebru, Timnit, et al. "Datasheets for datasets." Communications of the ACM 64.12 (2021): 86-92.
Created:
2023-01-20



