
Benchmark dataset for CATH hierarchical clustering tools (GeMMA/FunFHMMer, MARC, FRAN and eMMA)

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/8425747
Description:
Benchmark dataset for CATH superfamily 3.40.50.620 (HUPs). It contains Functional Family (FunFam) alignments and Hidden Markov Models generated by GeMMA/FunFHMMer, MARC, FRAN and CATH-eMMA, together with the Python code used to assess their quality (EC purity, DOPS, Neff) and to carry out intermediate steps of the MARC and FRAN pipelines (pooling, randomisation, renaming). 3.4.50.620_full_superfamily_sequences.fasta contains all HUPs superfamily sequences; the FunFam sequences are a subset of these. all_starting_clusters_sequences.fasta contains the sequences included in the starting clusters used in the analyses. 3.40.50.620_embedded.pt contains embeddings for the HUPs superfamily generated with the ESM2 protein language model.
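The description states that the FunFam sequences are a subset of the full superfamily sequences. A minimal sketch of how that relationship could be checked is shown below, using only the Python standard library. It is not the code shipped with the dataset: the helper names `read_fasta` and `missing_ids` are hypothetical, the toy sequences are made up for illustration, and it assumes that the first whitespace-delimited token of each FASTA header is the sequence identifier, which may not match the header format actually used in the files.

```python
from typing import Dict, Iterable, Set


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {sequence_id: sequence}.

    Assumption: the id is the first whitespace-delimited token of the header.
    """
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequences may be wrapped over several lines; join them.
            records[current_id] += line
    return records


def missing_ids(subset: Dict[str, str], full: Dict[str, str]) -> Set[str]:
    """Return ids present in `subset` but absent from `full`."""
    return set(subset) - set(full)


# Toy in-memory data standing in for the real files (illustrative only).
full_text = """>seqA
MKTAYIAK
QRQISFVK
>seqB
MSLLTEVE
"""
funfam_text = """>seqA
MKTAYIAKQRQISFVK
"""

full = read_fasta(full_text.splitlines())
funfam = read_fasta(funfam_text.splitlines())
print(missing_ids(funfam, full))  # an empty set means every id was found
```

With the downloaded files, the same functions could be applied to open file handles, for example `read_fasta(open("all_starting_clusters_sequences.fasta"))`, since a file object iterates over its lines. The `.pt` extension of the embeddings file suggests a PyTorch-serialised object, which would typically be read with `torch.load`; its internal layout is not described in this listing.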
Created:
2024-06-06