
CleverThis/uniprotkb_obsolete_entries_240000000-v1

Source: Hugging Face. Updated: 2026-01-21. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/CleverThis/uniprotkb_obsolete_entries_240000000-v1
Resource description:
---
license: cc-by-4.0
task_categories:
- text-generation
- feature-extraction
language:
- en
tags:
- rdf
- knowledge-graph
- semantic-web
- triples
size_categories:
- 1K<n<10K
---

# uniprotkb_obsolete_entries_240000000

## Dataset Description

Comprehensive protein knowledgebase with functional annotations.

**Original Source:** https://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/uniprotkb_obsolete_entries_240000000.rdf.xz

### Dataset Summary

This dataset contains RDF triples from uniprotkb_obsolete_entries_240000000, converted to the Hugging Face dataset format for easy use in machine learning pipelines.

- **Format:** Originally RDF, converted to a Hugging Face Dataset
- **Size:** 0.392 GB (extracted)
- **Entities:** ~90M protein entries
- **Triples:** ~3.4B
- **Original License:** CC BY 4.0

### Recommended Use

Protein research, molecular biology, functional genomics.

### Notes

High quality, with manual curation for Swiss-Prot entries. Updated every 8 weeks.

## RDF Format

This dataset uses a standard lossless format for representing RDF triples. Each triple is a row with 6 fields:

- `subject`: Subject URI or blank node
- `predicate`: Predicate URI
- `object`: Object value (URI, literal, or blank node)
- `object_type`: Type of object (`uri`, `literal`, or `blank_node`)
- `object_datatype`: XSD datatype URI (for typed literals)
- `object_language`: Language tag (for language-tagged literals)

### Loading the Dataset

```python
from datasets import load_dataset

# Repository ID taken from the dataset page URL (namespace and "-v1" suffix included).
dataset = load_dataset("CleverThis/uniprotkb_obsolete_entries_240000000-v1")

# Print each triple as "subject predicate object".
for row in dataset["train"]:
    print(f"{row['subject']} {row['predicate']} {row['object']}")
```

## Citation

If you use this dataset, please cite the original source:

**Dataset:** uniprotkb_obsolete_entries_240000000
**URL:** https://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/uniprotkb_obsolete_entries_240000000.rdf.xz
**License:** CC BY 4.0

## Conversion Details

- **Converted using:** [RDF to HuggingFace Incremental Converter](https://github.com/CleverThis/cleverernie)
- **Conversion date:** 2026-01-21
- **Format version:** 1.0

---

This dataset is part of the CleverThis knowledge graph collection.
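Because the six-field row layout is described as lossless, a row can in principle be turned back into an N-Triples statement. The following is a minimal sketch of that idea, not the converter's own code. The helper names `escape_literal` and `row_to_ntriple` are hypothetical, the sample row is made up for illustration, and the sketch assumes that blank nodes are stored either with or without a leading `_:` and that unused fields are `None` or empty.

```python
def escape_literal(value: str) -> str:
    """Escape the characters N-Triples requires to be escaped in a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriple(row: dict) -> str:
    """Serialize one dataset row (six fields) as a single N-Triples line."""
    subject = row["subject"]
    # Assumption: a blank-node subject is recognizable by its "_:" prefix.
    s = subject if subject.startswith("_:") else f"<{subject}>"
    p = f"<{row['predicate']}>"

    obj = row["object"]
    kind = row["object_type"]
    if kind == "uri":
        o = f"<{obj}>"
    elif kind == "blank_node":
        # Assumption: the label may be stored with or without the "_:" prefix.
        o = obj if obj.startswith("_:") else f"_:{obj}"
    else:
        o = f'"{escape_literal(obj)}"'
        if row.get("object_language"):
            # A language tag takes precedence over a datatype.
            o += f"@{row['object_language']}"
        elif row.get("object_datatype"):
            o += f"^^<{row['object_datatype']}>"
    return f"{s} {p} {o} ."


# Made-up sample row for illustration only; real rows come from the dataset.
sample = {
    "subject": "http://example.org/protein/X",
    "predicate": "http://example.org/obsolete",
    "object": "true",
    "object_type": "literal",
    "object_datatype": "http://www.w3.org/2001/XMLSchema#boolean",
    "object_language": None,
}
print(row_to_ntriple(sample))
```

Applied to each row of `dataset["train"]`, a function like this would write the triples to a `.nt` file that standard RDF tooling can read, provided the assumptions above hold for the actual data.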
Provider:
CleverThis