InstaDeepAI/ChatNT_training_data
收藏Hugging Face2025-07-02 更新2025-07-05 收录
下载链接:
https://hf-mirror.com/datasets/InstaDeepAI/ChatNT_training_data
下载链接
链接失效反馈官方服务:
资源简介:
ChatNT训练数据集是一个经过精心策划的基因组学指令任务集合,旨在训练一个统一的模型,通过自然语言处理广泛的生物序列分析任务。该数据集将27个不同的基因组学任务转化为指导性指令格式,每个实例包含一个生物序列(DNA)和相应的英语问题及其真实答案。这种格式使得可以进行“基因组学指导微调”,使模型能够以对话方式学习执行多样和复杂的生物预测。
The ChatNT training dataset is a curated collection of genomics instruction tasks designed to train a single, unified model to handle a wide variety of biological sequence analysis tasks through natural language. The dataset reframes 27 distinct genomics tasks into an instruction-following format, with each instance consisting of a biological sequence (DNA) paired with a corresponding English question and its ground-truth answer.
提供机构:
InstaDeepAI



