genomenet/twin-uniref50-faiss
Source: Hugging Face. Updated 2026-04-27; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/genomenet/twin-uniref50-faiss
Description:
This dataset is a FAISS index of UniRef50 protein embeddings produced by the Twin model. Twin is a two-tower contrastive encoder fine-tuned on Resnik GO similarity; it combines a custom tower with an ESM tower and outputs 1024-dimensional embeddings. The dataset contains the FAISS index files, the UniRef50 cluster ID files, and metadata files, and is intended for computing functional distance between proteins.
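As a rough illustration of how such an index is typically used, the sketch below runs a nearest-neighbour lookup over 1024-dimensional vectors and maps each hit back to a cluster ID. It is a plain-Python stand-in for the search step, not the dataset's own tooling: the embeddings are random, the cluster IDs are hypothetical placeholders, and cosine distance is an assumption, since the listing does not state which metric the index uses. With the real files one would instead load the index with `faiss.read_index` and query it with `index.search`, then use the returned row numbers to look up entries in the cluster ID file.

```python
import math
import random

DIM = 1024  # embedding width stated in the description


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_distance(a, b):
    """Cosine distance between two unit-length vectors (assumed metric)."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def nearest(query, embeddings, cluster_ids, k=3):
    """Brute-force search: return the k closest (distance, cluster_id) pairs."""
    q = normalize(query)
    scored = [
        (cosine_distance(q, emb), cid)
        for emb, cid in zip(embeddings, cluster_ids)
    ]
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


rng = random.Random(0)

# Hypothetical placeholder IDs; the real ones come from the cluster ID file.
cluster_ids = [f"UniRef50_EXAMPLE_{i}" for i in range(5)]

# Random unit vectors standing in for Twin embeddings.
embeddings = [
    normalize([rng.gauss(0.0, 1.0) for _ in range(DIM)])
    for _ in cluster_ids
]

# Query: a slightly perturbed copy of row 2, so row 2 should rank first.
query = [x + 0.001 * rng.gauss(0.0, 1.0) for x in embeddings[2]]

hits = nearest(query, embeddings, cluster_ids, k=3)
for dist, cid in hits:
    print(f"{cid}\t{dist:.4f}")
```

The row order of the embeddings and of the ID list must match, which is why the index files and the cluster ID files are distributed together.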
Provider:
genomenet



