GrimSqueaker/ProFET_NP_SP_Cleaved
Hugging Face | updated 2026-04-07 | indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/GrimSqueaker/ProFET_NP_SP_Cleaved
Resource description:
---
language:
- en
license: other
tags:
- biology
- proteins
- sequence-classification
- benchmark
task_categories:
- text-classification
pretty_name: ProFET_NP_SP_Cleaved
---
# ProFET_NP_SP_Cleaved
Binary benchmark for neuropeptide precursor prediction from protein sequences, adapted from the ProteinBERT benchmark collection.
## Source
This dataset is sourced from the ProteinBERT benchmark repository:
https://github.com/nadavbra/protein_bert/tree/master/protein_benchmarks
## Curator Attribution
This Hugging Face dataset was packaged, curated, and published by Dan Ofer.
## Splits and Schema
- Splits follow the benchmark release (train/validation/test when available).
- Each row includes:
  - `seq`: amino-acid sequence
  - `label`: binary target (0 or 1)
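
As a minimal sketch of how the schema above might be used, the snippet below checks rows for a `seq` string and a 0/1 `label`, then tallies labels. The helper names (`load_splits`, `is_valid_row`, `label_counts`), the accepted amino-acid alphabet, and the example rows are illustrative assumptions, not part of the dataset or the benchmark code. `load_splits` relies on the `datasets` library's `load_dataset` and needs network access, so it is defined but not called here.

```python
from collections import Counter

# The 20 standard amino acids plus common ambiguity/rare codes. This alphabet
# is an assumption; the card does not state which symbols the sequences use.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY" "BJOUXZ")


def load_splits(repo_id="GrimSqueaker/ProFET_NP_SP_Cleaved"):
    """Download all available splits (requires the `datasets` package and network access)."""
    from datasets import load_dataset  # lazy import so the helpers below work without it
    return load_dataset(repo_id)


def is_valid_row(row):
    """Check a row against the schema: a non-empty `seq` string and a 0/1 `label`."""
    seq, label = row.get("seq"), row.get("label")
    return (
        isinstance(seq, str)
        and len(seq) > 0
        and set(seq.upper()) <= AMINO_ACIDS
        and label in (0, 1)
    )


def label_counts(rows):
    """Count how many valid rows carry each label."""
    return Counter(row["label"] for row in rows if is_valid_row(row))


# Made-up rows for illustration only; they are not taken from the dataset.
example_rows = [
    {"seq": "MKTLLLTLVVVTIVCLDLGYT", "label": 1},
    {"seq": "MSDNE", "label": 0},
    {"seq": "", "label": 0},     # rejected: empty sequence
    {"seq": "MKT", "label": 2},  # rejected: label outside {0, 1}
]
print(label_counts(example_rows))
```

With the real data, something like `label_counts(load_splits()["train"])` should give the class balance of the training split, assuming a split named `train` is present in the release.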
## Hugging Face Repo
- GrimSqueaker/ProFET_NP_SP_Cleaved
## Citations
```bibtex
@article{10.1093/bioinformatics/btac020,
author = {Brandes, Nadav and Ofer, Dan and Peleg, Yam and Rappoport, Nadav and Linial, Michal},
title = {ProteinBERT: a universal deep-learning model of protein sequence and function},
journal = {Bioinformatics},
volume = {38},
number = {8},
pages = {2102--2110},
year = {2022},
doi = {10.1093/bioinformatics/btac020}
}
@article{OferD2014,
author = {Ofer, Dan and Linial, Michal},
title = {NeuroPID: a predictor for identifying neuropeptide precursors from metazoan proteomes},
journal = {Bioinformatics},
volume = {30},
number = {7},
pages = {931--940},
year = {2014},
doi = {10.1093/bioinformatics/btt725}
}
@article{Karsenty2014,
author = {Karsenty, S. and Rappoport, N. and Ofer, D. and Zair, A. and Linial, M.},
title = {NeuroPID: a classifier of neuropeptide precursors},
journal = {Nucleic Acids Research},
year = {2014},
doi = {10.1093/nar/gku363}
}
@article{Brandes2016,
author = {Brandes, Nadav and Ofer, Dan and Linial, Michal},
title = {ASAP: A machine learning framework for local protein properties},
journal = {Database},
volume = {2016},
year = {2016},
doi = {10.1093/database/baw133}
}
```
Provider:
GrimSqueaker



