Words_Selected_by_Information_Gain
DataCite Commons. Updated: 2020-09-20. Indexed: 2025-04-16
Download link:
https://databank.illinois.edu/datasets/IDB-9837167
Resource description:
File Name: WordsSelectedByInformationGain.csv
Data Preparation: Xiaoru Dong, Linh Hoang
Date of Preparation: 2018-12-12
Data Contributions: Jingyi Xie, Xiaoru Dong, Linh Hoang
Data Source: Cochrane systematic reviews published up to January 3, 2018 by 52 different Cochrane groups in 8 Cochrane group networks.
Associated Manuscript authors: Xiaoru Dong, Jingyi Xie, Linh Hoang, and Jodi Schneider.
Associated Manuscript, Working title: Machine classification of inclusion criteria from Cochrane systematic reviews.
Description: The file contains a list of 1655 informative words selected by applying an information gain feature selection strategy.
Information gain is a commonly used feature selection method. It measures how many bits of information the presence of a word contributes to predicting the class, and it can be computed with a specific formula [Jurafsky D, Martin JH. Speech and language processing. London: Pearson; 2014 Dec 30]. We ran Information Gain feature selection in Weka, a machine learning tool.
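As an illustration of the measure described above, the following is a minimal Python sketch that computes information gain for a word as the class entropy minus the expected class entropy after splitting documents by whether the word is present. It is not the Weka implementation used to produce this file; the function names, the toy documents, and the labels are all hypothetical and serve only to show the arithmetic.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, word):
    """Information gain of the presence or absence of `word` for the labels.

    `docs` is a list of sets of words, one set per document (an assumed
    representation chosen for this sketch).
    """
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    # Expected entropy of the class after observing whether the word occurs.
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


# Hypothetical toy data: four documents with two classes.
docs = [
    {"randomized", "trial"},
    {"randomized", "adults"},
    {"cohort", "adults"},
    {"cohort", "children"},
]
labels = ["include", "include", "exclude", "exclude"]

ig_randomized = information_gain(docs, labels, "randomized")
ig_adults = information_gain(docs, labels, "adults")
```

In this toy data, "randomized" separates the two classes perfectly, so it receives the full one bit of class entropy, while "adults" occurs equally often in both classes and receives none. Ranking words by this score and keeping the highest-scoring ones is the general idea behind the selection described above.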
Notes: To reproduce the data in this file, obtain the project code published on GitHub at https://github.com/XiaoruDong/InclusionCriteria and run it following the instructions provided there.
Provider:
University of Illinois at Urbana-Champaign
Created:
2018-12-20



