genomenet/esm2-uniref50-faiss
Source: Hugging Face. Updated 2026-04-24; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/genomenet/esm2-uniref50-faiss
Description:
FAISS index over mean-pooled ESM2 (`esm2_t33_650M_UR50D`) embeddings of GO-annotated UniRef50 proteins. It is used by the genomenet/functional-distance Space for nearest-neighbor search. The dataset contains three files: `esm2_uniref50.index` (the FAISS index, built with OPQ + IVF + PQ; it supports inner-product search on L2-normalized vectors, which is equivalent to cosine similarity), `ids.npy` (UniRef50 cluster IDs aligned with FAISS positions, dtype=S24), and `metadata.json` (build parameters such as dim, factory, nprobe, and n_vectors).
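The three files described above could be combined roughly as follows. This is a minimal sketch, not the Space's actual code: it assumes the files have already been downloaded into a local folder (the `local_dir` argument is hypothetical), that `metadata.json` exposes `nprobe` as a top-level key, and that the query is a mean-pooled ESM2 embedding whose length matches the `dim` recorded in the metadata. The helper names `l2_normalize`, `decode_id`, and `search` are illustrative only.

```python
import json
import math
from pathlib import Path


def l2_normalize(vec):
    """Scale a vector to unit L2 norm so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def decode_id(raw):
    """Turn one fixed-width S24 entry (bytes) into a str, dropping NUL padding."""
    return bytes(raw).rstrip(b"\x00").decode("ascii")


def search(local_dir, query, k=5):
    """Return (cluster_id, score) pairs for the k nearest neighbors of `query`.

    The heavy dependencies are imported here so the helpers above stay usable
    without them. `local_dir` is a hypothetical folder holding the three files.
    """
    import faiss  # e.g. installed via the faiss-cpu package
    import numpy as np

    local_dir = Path(local_dir)
    index = faiss.read_index(str(local_dir / "esm2_uniref50.index"))
    ids = np.load(local_dir / "ids.npy")  # dtype S24, aligned with index rows
    meta = json.loads((local_dir / "metadata.json").read_text())

    # Assumption: the build-time nprobe is stored under the key "nprobe".
    # ParameterSpace reaches the IVF layer even behind the OPQ pre-transform.
    if "nprobe" in meta:
        faiss.ParameterSpace().set_index_parameter(
            index, "nprobe", float(meta["nprobe"])
        )

    # FAISS expects a 2-D float32 array of shape (n_queries, dim).
    q = np.asarray([l2_normalize(query)], dtype="float32")
    scores, positions = index.search(q, k)
    return [
        (decode_id(ids[pos]), float(score))
        for pos, score in zip(positions[0], scores[0])
        if pos >= 0  # FAISS pads missing results with -1
    ]
```

Calling `search("path/to/files", embedding, k=10)` would then return up to ten UniRef50 cluster IDs with their inner-product scores. Because PQ compresses the stored vectors, those scores approximate the exact cosine similarity.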
Provider:
genomenet



