PlasmoFAB

Source: NIAID Data Ecosystem, indexed 2026-03-14
Download link:
https://zenodo.org/record/7433086
Description:
PlasmoFAB is a curated dataset of amino acid sequences of proteins expressed by Plasmodium falciparum (Pf). Sequences are separated into antigen candidates and intracellular proteins. PlasmoFAB was created to provide a high-quality training set for machine learning models used in Pf antigen exploration. We provide PlasmoFAB as two separate CSV files. One file, PlasmoFAB_pos.csv, contains the positive set, i.e., all protein sequences that are antigen candidates. The other file, PlasmoFAB_neg.csv, contains the negative set, i.e., all protein sequences that are intracellular proteins. Each sequence has a flag in the data field "test" that indicates whether the sequence was used in the test set of the machine learning experiments performed in the corresponding manuscript. Additionally, the file PlasmoFAB_datasheet.md contains a datasheet for PlasmoFAB, as introduced in Gebru, Timnit, et al. "Datasheets for datasets." Communications of the ACM 64.12 (2021): 86-92.
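
The file layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the file names PlasmoFAB_pos.csv and PlasmoFAB_neg.csv and the "test" field come from the description, while the helper names, the added "label" key, and the set of strings treated as a positive "test" flag are assumptions that should be checked against the actual files.

```python
import csv
from pathlib import Path

# Assumption: string values of the "test" field that mark a test-set sequence.
# The description does not state the encoding, so verify this against the files.
TRUTHY = {"1", "true", "t", "yes"}


def load_labeled(path, label):
    """Read one PlasmoFAB CSV file and attach a class label to every row."""
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    for row in rows:
        row["label"] = label  # hypothetical key: 1 = antigen candidate, 0 = intracellular
    return rows


def split_by_test_flag(rows):
    """Partition rows into (train, test) using the "test" field."""
    train, test = [], []
    for row in rows:
        is_test = row["test"].strip().lower() in TRUTHY
        (test if is_test else train).append(row)
    return train, test


def load_plasmofab(directory="."):
    """Load the positive and negative sets and split them by the "test" flag."""
    base = Path(directory)
    rows = load_labeled(base / "PlasmoFAB_pos.csv", 1)
    rows += load_labeled(base / "PlasmoFAB_neg.csv", 0)
    return split_by_test_flag(rows)
```

Calling `train, test = load_plasmofab("path/to/download")` would return two lists of dictionaries, one per sequence. The sketch deliberately avoids naming the sequence column, since only the "test" field is documented in the description.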
Created:
2023-01-20