CoCoScore Supplementary Data v1.0
Source: DataCite Commons | Updated: 2025-04-01 | Indexed: 2024-07-27
Download link:
https://figshare.com/articles/CoCoScore_Supplementary_Data_v1_0/7198280/1
Description:
Supplementary Data: CoCoScore: Context-aware co-occurrence scoring for text mining applications using distant supervision

# Text mining dictionaries

The entities file (entities.tsv.gz), names file (names.tsv.gz), and groups file (groups.tsv.gz) were used to identify proteins/genes, diseases, and tissues in the PubMed + PMC corpus. Please check the following README for usage of these files: https://bitbucket.org/larsjuhljensen/tagger/src/default/README.md

# Datasets and pre-trained sentence classification models

## H. sapiens disease-gene associations

Training dataset: dataset_9606_-26_train.tsv.gz
Test dataset: dataset_9606_-26_test.tsv.gz
fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_9606_-26.ftz

## H. sapiens tissue-gene associations

Training dataset: dataset_9606_-25_train.tsv.gz
Test dataset: dataset_9606_-25_test.tsv.gz
fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_9606_-25.ftz

## H. sapiens functional protein-protein associations

Training dataset: dataset_9606_9606_train.tsv.gz
Test dataset: dataset_9606_9606_test.tsv.gz
fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_9606_9606.ftz

## D. melanogaster functional protein-protein associations

Training dataset: dataset_7227_7227_train.tsv.gz
Test dataset: dataset_7227_7227_test.tsv.gz
fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_7227_7227.ftz

## S. cerevisiae functional protein-protein associations

Training dataset: dataset_4932_4932_train.tsv.gz
Test dataset: dataset_4932_4932_test.tsv.gz
fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_4932_4932.ftz

## H. sapiens physical protein-protein interactions

Training dataset: dataset_9606_9606_train_physical.tsv.gz
Test dataset: dataset_9606_9606_test_physical.tsv.gz
fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_9606_9606_physical.ftz

## D. melanogaster physical protein-protein interactions

Training dataset: dataset_7227_7227_train_physical.tsv.gz
Test dataset: dataset_7227_7227_test_physical.tsv.gz
fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_7227_7227_physical.ftz

## S. cerevisiae physical protein-protein interactions

Training dataset: dataset_4932_4932_train_physical.tsv.gz
Test dataset: dataset_4932_4932_test_physical.tsv.gz
fastText sentence classification models trained on training dataset: ft_model_CoCoScore_pretrained_4932_4932_physical.ftz

# Pre-trained word embeddings

Pre-trained fastText word embeddings can be found in: fasttext_sg_masked_dim_300_epoch_5_lr_0.05_minn_3_maxn_6_ws_5.vec.gz
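
The word embeddings file can probably be read without the fastText library. The sketch below assumes the file follows the usual fastText text (`.vec`) layout: a header line with vocabulary size and dimension, then one token per line followed by its vector components. That layout is not confirmed by this description, and the names `load_vec_gz` and `limit` are hypothetical helpers introduced only for illustration.

```python
import gzip
from typing import Dict, List, Optional


def load_vec_gz(path: str, limit: Optional[int] = None) -> Dict[str, List[float]]:
    """Read a gzipped fastText-style .vec file into a {token: vector} dict.

    Assumes the standard text layout: a "<vocab size> <dimension>" header,
    then one "<token> <v1> ... <vN>" row per line (hypothetical helper).
    """
    vectors: Dict[str, List[float]] = {}
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        # Header line carries the vocabulary size and the vector dimension.
        _, dim = (int(x) for x in handle.readline().split())
        for line in handle:
            if limit is not None and len(vectors) >= limit:
                break
            # Rows may end with a trailing space, so drop empty fields.
            parts = [p for p in line.rstrip("\n").split(" ") if p]
            if len(parts) != dim + 1:
                continue  # skip rows that do not match the declared dimension
            vectors[parts[0]] = [float(v) for v in parts[1:]]
    return vectors
```

Calling `load_vec_gz("fasttext_sg_masked_dim_300_epoch_5_lr_0.05_minn_3_maxn_6_ws_5.vec.gz", limit=1000)` would load only the first 1000 rows, which is a cheap way to inspect the file before reading the full vocabulary into memory.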
Provider:
figshare
Created:
2018-10-13



