CoCoScore Supplementary Data v1.0

Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/CoCoScore_Supplementary_Data_v1_0/7198280
Description:
Supplementary Data: CoCoScore: Context-aware co-occurrence scoring for text mining applications using distant supervision

# Text mining dictionaries

The entities file (entities.tsv.gz), names file (names.tsv.gz), and groups file (groups.tsv.gz) were used to identify proteins/genes, diseases, and tissues in the PubMed + PMC corpus. Please check the following README for usage of these files: https://bitbucket.org/larsjuhljensen/tagger/src/default/README.md

# Datasets and pre-trained sentence classification models

## H. sapiens disease-gene associations

- Training dataset: dataset_9606_-26_train.tsv.gz
- Test dataset: dataset_9606_-26_test.tsv.gz
- fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_9606_-26.ftz

## H. sapiens tissue-gene associations

- Training dataset: dataset_9606_-25_train.tsv.gz
- Test dataset: dataset_9606_-25_test.tsv.gz
- fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_9606_-25.ftz

## H. sapiens functional protein-protein associations

- Training dataset: dataset_9606_9606_train.tsv.gz
- Test dataset: dataset_9606_9606_test.tsv.gz
- fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_9606_9606.ftz

## D. melanogaster functional protein-protein associations

- Training dataset: dataset_7227_7227_train.tsv.gz
- Test dataset: dataset_7227_7227_test.tsv.gz
- fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_7227_7227.ftz

## S. cerevisiae functional protein-protein associations

- Training dataset: dataset_4932_4932_train.tsv.gz
- Test dataset: dataset_4932_4932_test.tsv.gz
- fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_4932_4932.ftz

## H. sapiens physical protein-protein interactions

- Training dataset: dataset_9606_9606_train_physical.tsv.gz
- Test dataset: dataset_9606_9606_test_physical.tsv.gz
- fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_9606_9606_physical.ftz

## D. melanogaster physical protein-protein interactions

- Training dataset: dataset_7227_7227_train_physical.tsv.gz
- Test dataset: dataset_7227_7227_test_physical.tsv.gz
- fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_7227_7227_physical.ftz

## S. cerevisiae physical protein-protein interactions

- Training dataset: dataset_4932_4932_train_physical.tsv.gz
- Test dataset: dataset_4932_4932_test_physical.tsv.gz
- fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_4932_4932_physical.ftz

# Pre-trained word embeddings

Pre-trained fastText word embeddings can be found in: fasttext_sg_masked_dim_300_epoch_5_lr_0.05_minn_3_maxn_6_ws_5.vec.gz
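
The dataset file names in this description follow a regular pattern: two numeric identifiers, a train/test split, and an optional `_physical` suffix. The sketch below shows one way to decode such a name when working with several of the files at once. It is illustrative only: the helper names (`parse_dataset_name`, `describe`, `DatasetFile`) are hypothetical, and the meaning assigned to each identifier is inferred from the headings in this description rather than taken from an official specification.

```python
import re
from typing import NamedTuple

# Identifier meanings inferred from the headings in this description
# (an assumption, not an official mapping shipped with the data).
ENTITY_TYPES = {
    "9606": "H. sapiens proteins/genes",
    "7227": "D. melanogaster proteins/genes",
    "4932": "S. cerevisiae proteins/genes",
    "-26": "diseases",
    "-25": "tissues",
}


class DatasetFile(NamedTuple):
    first_type: str   # identifier of the first entity type, e.g. "9606"
    second_type: str  # identifier of the second entity type, e.g. "-26"
    split: str        # "train" or "test"
    physical: bool    # True when the name carries the "_physical" suffix


# dataset_<first>_<second>_<train|test>[_physical].tsv.gz
_PATTERN = re.compile(
    r"^dataset_(-?\d+)_(-?\d+)_(train|test)(_physical)?\.tsv\.gz$"
)


def parse_dataset_name(filename: str) -> DatasetFile:
    """Split a dataset file name into its components."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized dataset file name: {filename!r}")
    first, second, split, physical = match.groups()
    return DatasetFile(first, second, split, physical is not None)


def describe(filename: str) -> str:
    """Return a short human-readable summary of a dataset file name."""
    info = parse_dataset_name(filename)
    kind = "physical interactions" if info.physical else "associations"
    # Fall back to the raw identifier when it is not in the inferred table.
    first = ENTITY_TYPES.get(info.first_type, info.first_type)
    second = ENTITY_TYPES.get(info.second_type, info.second_type)
    return f"{info.split} split, {kind}: {first} x {second}"
```

For example, `describe("dataset_9606_-26_train.tsv.gz")` reports the train split of associations between H. sapiens proteins/genes and diseases. Names that do not match the pattern raise `ValueError` instead of being silently guessed at.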
Created: 2018-10-13