CleverThis/uniprotkb_reviewed_archea_methanobacteriati_3366610_0-v1
Source: Hugging Face. Updated: 2025-12-30. Indexed: 2026-01-03.
Download link:
https://hf-mirror.com/datasets/CleverThis/uniprotkb_reviewed_archea_methanobacteriati_3366610_0-v1
Resource overview:
This dataset is a comprehensive protein knowledgebase with functional annotations. The data originates from the UniProt database, where it is published as RDF triples, and has been converted to the HuggingFace dataset format for use in machine learning pipelines. The dataset contains approximately 90M protein entries and approximately 3.4B triples, with an original size of 0.392 GB (extracted), and is licensed under CC BY 4.0. It is recommended for protein research, molecular biology, and functional genomics. The RDF data is stored in a standard lossless representation that preserves all semantic information from the original knowledge graph, so it can be converted from RDF to the HuggingFace format and back without loss.
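The round-trip property described above can be illustrated with a minimal sketch. It assumes a column layout of `subject`, `predicate`, and `object` holding each RDF term verbatim in N-Triples syntax. That layout, the helper names, and the `example.org` URIs are all hypothetical and chosen for illustration only; the dataset's actual schema is not stated in this listing and may differ.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """One RDF statement; each term is kept as its N-Triples string."""
    subject: str
    predicate: str
    object: str


def triples_to_columns(triples: List[Triple]) -> Dict[str, List[str]]:
    """Convert triples to a column-oriented dict (assumed layout, one list per column)."""
    return {
        "subject": [t.subject for t in triples],
        "predicate": [t.predicate for t in triples],
        "object": [t.object for t in triples],
    }


def columns_to_triples(columns: Dict[str, List[str]]) -> List[Triple]:
    """Rebuild the triples from the column-oriented dict, preserving order."""
    return [
        Triple(s, p, o)
        for s, p, o in zip(columns["subject"], columns["predicate"], columns["object"])
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples as N-Triples text, one statement per line."""
    return "\n".join(f"{t.subject} {t.predicate} {t.object} ." for t in triples)


# Placeholder data: the example.org URIs below are made up, not real UniProt identifiers.
EXAMPLE = [
    Triple(
        "<http://example.org/protein/X1>",
        "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",
        "<http://example.org/vocab/Protein>",
    ),
    Triple(
        "<http://example.org/protein/X1>",
        "<http://example.org/vocab/name>",
        '"Example protein"@en',
    ),
]

columns = triples_to_columns(EXAMPLE)
restored = columns_to_triples(columns)
```

Because every term is stored as an unmodified string, `restored` equals `EXAMPLE` and serializing either list with `to_ntriples` yields identical text. A dict shaped like `columns` is also the kind of input that `datasets.Dataset.from_dict` accepts, should one want to experiment with the HuggingFace side of the conversion.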
Provider:
CleverThis



