opsomerto/mini-protein-dataset
Source: Hugging Face. Updated: 2026-04-05. Indexed: 2026-04-12.
Download link:
https://hf-mirror.com/datasets/opsomerto/mini-protein-dataset
Description:
---
language:
- en
license: cc-by-4.0
tags:
- protein
- biology
- uniprot
- swiss-prot
---
# Mini Protein Dataset
A small subset of reviewed (Swiss-Prot) protein sequences from UniProt,
created for the `minihf` toy project, which demonstrates a full Hugging Face workflow.
## Fields
| Field | Type | Description |
|---|---|---|
| `id` | string | UniProt accession (e.g. `P12345`) |
| `description` | string | Protein name from FASTA header |
| `sequence` | string | Amino-acid sequence (single-letter codes) |
| `length` | int | Sequence length in amino acids |
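
The field table implies a few invariants that can be checked mechanically. The sketch below is one possible sanity check, not part of the dataset tooling: `check_record` is a hypothetical helper, and the accession pattern is the one UniProt documents for its accession numbers. It assumes sequences are stored as uppercase single-letter codes.

```python
import re

# Accession pattern as documented by UniProt (6- or 10-character accessions).
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def check_record(record: dict) -> list[str]:
    """Return a list of problems found in one record; empty means it looks fine.

    Hypothetical helper: the checks mirror the field table, nothing more.
    """
    problems = []
    if not ACCESSION_RE.fullmatch(record["id"]):
        problems.append(f"id {record['id']!r} is not a UniProt accession")
    sequence = record["sequence"]
    # Assumption: sequences use uppercase ASCII single-letter codes only.
    if not (sequence.isascii() and sequence.isalpha() and sequence.isupper()):
        problems.append("sequence contains characters other than A-Z")
    if record["length"] != len(sequence):
        problems.append(
            f"length {record['length']} != len(sequence) {len(sequence)}"
        )
    return problems
```

Calling `check_record(row)` on each row of a split and collecting the non-empty results gives a quick report of inconsistent records.
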
## Usage
```python
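from datasets import load_dataset

# Download the dataset from the Hugging Face Hub (cached after the first call).
ds = load_dataset("opsomerto/mini-protein-dataset")

# Print the amino-acid sequence of the first record in the train split.
print(ds["train"][0]["sequence"])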
```
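
Once a record is loaded, its `sequence` string can be processed with plain Python. As a small illustration, the sketch below computes the residue composition of a sequence. `composition` is a hypothetical helper, not something shipped with the dataset.

```python
from collections import Counter


def composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each residue in the sequence, most common first."""
    total = len(sequence)
    if total == 0:
        return {}
    counts = Counter(sequence)
    return {residue: count / total for residue, count in counts.most_common()}
```

For example, `composition(ds["train"][0]["sequence"])` would summarize the first record from the snippet above.
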
## Source
UniProt Swiss-Prot (reviewed), fetched via the UniProt REST API.
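
The exact build script is not shown here. As a rough sketch of how FASTA text could be mapped to the four fields, the code below parses UniProt-style headers of the form `>db|accession|entry_name description`. `parse_fasta` and `_make_record` are hypothetical names, the network fetch is left out, and keeping the full header text after the identifier as `description` is an assumption.

```python
def _make_record(header: str, chunks: list[str]) -> dict:
    """Build one record from a FASTA header (without '>') and its sequence lines."""
    ident, _, description = header.partition(" ")
    parts = ident.split("|")
    # UniProt identifiers look like 'sp|P12345|ENTRY_NAME'; fall back to the
    # whole identifier if the header does not follow that layout.
    accession = parts[1] if len(parts) >= 3 else ident
    sequence = "".join(chunks)
    return {
        "id": accession,
        "description": description,  # assumption: everything after the identifier
        "sequence": sequence,
        "length": len(sequence),
    }


def parse_fasta(text: str) -> list[dict]:
    """Parse multi-record FASTA text into a list of dataset-style records."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append(_make_record(header, chunks))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        records.append(_make_record(header, chunks))
    return records


# Placeholder input: the entry name, description, and residues are made up.
example = """>sp|P12345|EXAMPLE_ENTRY Example protein OS=Example organism
MKTAY
IAKQR
"""
records = parse_fasta(example)
```

A list of such records could then be turned into a dataset, for instance with `datasets.Dataset.from_list(records)`.
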
Provider:
opsomerto



