Performance of ENQUIRE’s gene normalization algorithm.
Source: Figshare. Updated: 2025-02-11. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Performance_of_ENQUIRE_s_gene_normalization_algorithm_Precision_recall_and_their_harmonic_mean_F1_are_based_on_479_abstracts_from_the_NLM-Gene_corpus_containing_at_least_one_mention_to_a_i_H_sapiens_i_or_i_M_musculus_i_gene_Different_gene_n/28394502
Description:
Precision, recall, and their harmonic mean (F1) are based on 479 abstracts from the NLM-Gene corpus that contain at least one mention of an H. sapiens or M. musculus gene. Different gene normalization methods were evaluated by adding or removing filters that exclude predicted cell entities (en_ner_jnlpba_md) and ambiguous abbreviation-definition pairs (Schwartz-Hearst). Gene mentions contained in cell entities such as “CD8+ T cell” count as true positives in the NLM-Gene corpus. Text spans tagged as cell entities by the en_ner_jnlpba_md model are removed before they reach the tokenizer module. Maximum RAM usage is measured as resident set size (RSS). The estimated time in seconds per abstract (sec/abstract) also accounts for loading the gene alias lookup table and the machine learning models. The best values for each parameter setting are highlighted in bold.
Created:
2025-02-11
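
The precision, recall, and F1 values mentioned in the description can be illustrated with a minimal sketch. It assumes that predictions and gold annotations are compared as sets of (abstract ID, gene ID) pairs pooled over all abstracts; the dataset description does not state the exact matching or averaging scheme, so this is only one plausible reading. The function name and the identifiers in the example are hypothetical placeholders, not values from the NLM-Gene corpus.

```python
def precision_recall_f1(predicted, gold):
    """Return (precision, recall, F1) for two collections of
    (abstract_id, gene_id) pairs.

    Assumption: scores are pooled over all abstracts (micro-averaged).
    """
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)   # predicted and annotated
    fp = len(predicted - gold)   # predicted but not annotated
    fn = len(gold - predicted)   # annotated but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical placeholder annotations, for illustration only.
gold = {("abs1", "geneA"), ("abs1", "geneB"), ("abs2", "geneC")}
predicted = {("abs1", "geneA"), ("abs2", "geneC"), ("abs2", "geneD")}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Under this reading, a filter that discards cell entities such as “CD8+ T cell” would turn the gene mentions inside them into false negatives, which lowers recall even if it removes some false positives.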



