peptides_hemolytic

ModelScope community | Updated 2025-12-05 | Indexed 2025-05-31
Download link:
https://modelscope.cn/datasets/jablonkagroup/peptides_hemolytic
Resource description:
## Dataset Details

### Dataset Description

Hemolysis is the disruption of erythrocyte membranes, which shortens the life span of red blood cells and causes the release of hemoglobin. It is critical to identify non-hemolytic antimicrobial peptides as a non-toxic and safe measure against bacterial infections. However, distinguishing between hemolytic and non-hemolytic peptides is a challenge, since they primarily exert their activity at the charged surface of the bacterial plasma membrane.

The data here comes from the Database of Antimicrobial Activity and Structure of Peptides (DBAASP v3). Hemolytic activity is defined by extrapolating each measurement, assuming a dose-response curve, to the concentration at which 50% of red blood cells are lysed. Activities below 100 μg/mL are considered hemolytic.

The data contains only sequences made of L- and canonical amino acids. Each measurement is treated independently, so sequences can appear multiple times. This experimental dataset contains noise, and in some observations (40%) an identical sequence appears in both the negative and the positive class. For example, the sequence "RVKRVWPLVIRTVIAGYNLYRAIKKK" is found to be both hemolytic and non-hemolytic in two different lab experiments (i.e. two different training examples).

- **Curated by:**
- **License:** CC BY 4.0

### Dataset Sources

- [corresponding publication](https://doi.org/10.1021/acs.jcim.2c01317)
- [data source](https://doi.org/10.1093/nar/gkaa991)

## Citation

**BibTeX:**

```bibtex
@article{Martins2012,
  doi = {10.1021/ci300124c},
  url = {https://doi.org/10.1021/ci300124c},
  year = {2012},
  month = jun,
  publisher = {American Chemical Society (ACS)},
  volume = {52},
  number = {6},
  pages = {1686--1697},
  author = {Ines Filipa Martins and Ana L. Teixeira and Luis Pinheiro and Andre O. Falcao},
  title = {A Bayesian Approach to in Silico Blood-Brain Barrier Penetration Modeling},
  journal = {Journal of Chemical Information and Modeling}
}

@article{Wu2018,
  doi = {10.1039/c7sc02664a},
  url = {https://doi.org/10.1039/c7sc02664a},
  year = {2018},
  publisher = {Royal Society of Chemistry (RSC)},
  volume = {9},
  number = {2},
  pages = {513--530},
  author = {Zhenqin Wu and Bharath Ramsundar and Evan~N. Feinberg and Joseph Gomes and Caleb Geniesse and Aneesh S. Pappu and Karl Leswing and Vijay Pande},
  title = {MoleculeNet: a benchmark for molecular machine learning},
  journal = {Chemical Science}
}
```
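
The labeling rule and the label noise mentioned in the dataset description can be illustrated with a minimal Python sketch. It assumes each record is a `(sequence, concentration in μg/mL)` pair. The function names, the record layout, the second peptide string, and all concentration values are hypothetical and chosen only for illustration. They are not taken from DBAASP or from the dataset files, whose actual column names are not given here.

```python
from collections import defaultdict

# Threshold stated in the dataset description: activities below
# 100 ug/mL are considered hemolytic.
HEMOLYTIC_THRESHOLD_UG_ML = 100.0


def label_hemolytic(activity_ug_ml: float) -> int:
    """Return 1 (hemolytic) if the activity is below the threshold, else 0."""
    return 1 if activity_ug_ml < HEMOLYTIC_THRESHOLD_UG_ML else 0


def conflicting_sequences(labeled_records):
    """Return the set of sequences that occur with both labels.

    `labeled_records` is an iterable of (sequence, label) pairs. Each
    measurement is treated independently, so a sequence may repeat.
    """
    labels_by_sequence = defaultdict(set)
    for sequence, label in labeled_records:
        labels_by_sequence[sequence].add(label)
    return {seq for seq, labels in labels_by_sequence.items() if len(labels) > 1}


# Hypothetical measurements: the first sequence is the example quoted in
# the description, but the concentrations are invented for illustration.
measurements = [
    ("RVKRVWPLVIRTVIAGYNLYRAIKKK", 35.0),
    ("RVKRVWPLVIRTVIAGYNLYRAIKKK", 250.0),
    ("KKLLKKLLKK", 400.0),  # made-up placeholder peptide
]

labeled = [(seq, label_hemolytic(value)) for seq, value in measurements]
noisy = conflicting_sequences(labeled)
print(noisy)
```

A check like this can help decide how to handle contradictory examples before training, for instance by keeping them as-is, dropping them, or aggregating repeated measurements. The description does not prescribe any of these options.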

Provider:
maas
Created:
2025-05-28