five

akkp69000

Mendeley Data, updated 2024-03-27, indexed 2024-06-26
Download link:
https://data.mendeley.com/datasets/rjhh6cm55z
Description:

Context

Scientific papers, like other types of documents, can be identified by a set of keywords. Typically, authors are free to choose their keywords; keywords chosen this way are called Authors' Keywords (AK). Sometimes keywords are imposed, restricted, or inferred by algorithms. KeyWordsPlus © (KP) tries to infer keywords from the bibliographic references of an article.

Content

This dataset contains the AK and KP of 69,000 articles. All the articles were retrieved from Web of Science (WoS): https://www.webofknowledge.com

The data is split into three collections:

- raw: The raw data, as it comes from WoS. The documents are distributed over several CSV files. The columns "DE" and "ID" refer to AK and KP, respectively.
- filtered: All articles that do not contain both AK and KP have been removed.
- pre_processed: The keywords have been cleaned to remove special characters, then lowercased and stemmed.

In filtered and pre_processed you will find two text documents, "ak.txt" and "kp.txt". A given line number refers to the same article in both documents. For example, article number 8 has the following keywords:

- AK: Automated knowledge assessment; concept map; linking phrase; semantic analysis
- KP: SCIENCE

After pre-processing, article number 8 has the following keywords:

- AK: automknowledgassess;conceptmap;linkphrase;semant_analysi
- KP: scienc

Acknowledgements

We want to thank Web of Science for giving access to its database.
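Because "ak.txt" and "kp.txt" are aligned line by line, the two keyword sets of each article can be read in parallel. The following is a minimal Python sketch of that idea, not code shipped with the dataset. It assumes the files are UTF-8 encoded, that keywords within a line are separated by semicolons (as in the article 8 example), and that article numbering starts at 1; none of these is confirmed by the description. The function names split_keywords and read_keyword_pairs are hypothetical.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def split_keywords(line: str) -> List[str]:
    """Split one semicolon-separated line into trimmed, non-empty keywords.

    Assumption: ';' is the separator, as in the article 8 example.
    """
    return [kw.strip() for kw in line.split(";") if kw.strip()]


def read_keyword_pairs(folder: str) -> Iterator[Tuple[int, List[str], List[str]]]:
    """Yield (article_number, author_keywords, keywords_plus) per article.

    Assumptions: 'folder' holds ak.txt and kp.txt, both UTF-8 encoded,
    and article numbers start at 1.
    """
    folder_path = Path(folder)
    with open(folder_path / "ak.txt", encoding="utf-8") as ak_file, \
         open(folder_path / "kp.txt", encoding="utf-8") as kp_file:
        # zip pairs line i of ak.txt with line i of kp.txt;
        # it stops at the end of the shorter file.
        for number, (ak_line, kp_line) in enumerate(zip(ak_file, kp_file), start=1):
            yield number, split_keywords(ak_line), split_keywords(kp_line)
```

Calling read_keyword_pairs("filtered") or read_keyword_pairs("pre_processed") from the directory that contains those collections would then iterate over the articles one at a time without loading either file fully into memory.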
Created:
2024-01-23