Mol-Instructions-OOD
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/IDEA-XL/RAPM
Overview:
This dataset evaluates how different methods perform on protein understanding tasks: protein function, functional description, domain/motif, and catalytic activity. It is organized to avoid data leakage, and evaluation uses standard natural language processing (NLP) metrics together with a proposed entity-level BLEU metric. The overall task is protein understanding and generation.



