PlasmoFAB

Zenodo | Updated 2023-01-19 | Indexed 2026-05-26
Download link:
https://zenodo.org/record/7433086
Description:
PlasmoFAB is a curated dataset of amino acid sequences of proteins expressed by Plasmodium falciparum (Pf). Sequences are separated into antigen candidates and intracellular proteins. PlasmoFAB was created to provide a high-quality training set for machine learning models used in Pf antigen exploration. The dataset is provided as two separate CSV files. One file, PlasmoFAB_pos.csv, contains the positive set, i.e., all protein sequences that are antigen candidates. The other file, PlasmoFAB_neg.csv, contains the negative set, i.e., all protein sequences that are intracellular proteins. Each sequence has a flag in the data field "test" that indicates whether the sequence was used in the test set of the machine learning experiments performed in the corresponding manuscript. Additionally, the file PlasmoFAB_datasheet.md contains a datasheet for PlasmoFAB as introduced in Gebru, Timnit, et al. "Datasheets for datasets." Communications of the ACM 64.12 (2021): 86-92.
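The file layout described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: it assumes both CSV files sit in one local directory, and it guesses how the "test" flag is encoded (a few common truthy spellings), since the description does not say. The function names and the added "label" field are hypothetical; only the file names and the "test" field come from the description.

```python
import csv
from pathlib import Path

# Assumption: the encoding of the "test" flag is not stated in the description,
# so a few common truthy spellings are accepted here.
TRUTHY = {"1", "true", "yes"}


def load_labeled(path, label):
    """Read one PlasmoFAB CSV and attach a class label to every row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [dict(row, label=label) for row in csv.DictReader(handle)]


def split_by_test_flag(rows, flag_column="test"):
    """Partition rows into (train, test) using the per-sequence flag."""
    train, test = [], []
    for row in rows:
        is_test = str(row[flag_column]).strip().lower() in TRUTHY
        (test if is_test else train).append(row)
    return train, test


def load_plasmofab(directory="."):
    """Combine the positive (label 1) and negative (label 0) sets, then split."""
    base = Path(directory)
    rows = load_labeled(base / "PlasmoFAB_pos.csv", 1)
    rows += load_labeled(base / "PlasmoFAB_neg.csv", 0)
    return split_by_test_flag(rows)
```

Calling load_plasmofab("path/to/download") would return two lists of row dictionaries. Before relying on the split, it is worth checking the actual column names and flag values against PlasmoFAB_datasheet.md.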
Provider:
Zenodo
Created:
2023-01-19