Code used to produce terms list in the work "NLP-Driven Electron Microscopy Ontology Development"
Source: NIST Chemistry WebBook. Updated 2025-07-09; indexed 2026-03-14.
Download link:
https://data.nist.gov/od/id/mds2-3198
Description:
This is a collection of code written by Maurice Curran that was used to process the Microscopy and Microanalysis conference proceedings corpus into the word products described in the publication "NLP-Driven Electron Microscopy Ontology Development". The scripts are written in Python and are meant to be run in the following order:
1. SettingUpTextFiles.py and CopyingText.py to get the raw text files;
2. SentenceConversion.py;
3. reference_remover.py;
4. testing.py and testingavg.py;
5. SentenceCreator.py;
6. matscholar_model.py to get matscholar tags;
7. training_model_gensim.py to get the gensim model;
8. word2vecscript.py and gensim_visual.py.
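
The run order listed above can be captured in a small driver script. The following is a minimal sketch, not part of the published code: the script names and step order come from the description, while the helper names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical. It also assumes that every script sits in one directory, takes no command-line arguments, and that scripts sharing a step run in the order they are named.

```python
import subprocess
import sys
from pathlib import Path

# Script names and step order are taken from the dataset description.
# Assumption: scripts listed together in one step run left to right.
PIPELINE = [
    ["SettingUpTextFiles.py", "CopyingText.py"],   # 1. raw text files
    ["SentenceConversion.py"],                     # 2.
    ["reference_remover.py"],                      # 3.
    ["testing.py", "testingavg.py"],               # 4.
    ["SentenceCreator.py"],                        # 5.
    ["matscholar_model.py"],                       # 6. matscholar tags
    ["training_model_gensim.py"],                  # 7. gensim model
    ["word2vecscript.py", "gensim_visual.py"],     # 8.
]


def build_commands(script_dir="."):
    """Return one interpreter command per script, in pipeline order."""
    base = Path(script_dir)
    return [
        [sys.executable, str(base / name)]
        for step in PIPELINE
        for name in step
    ]


def run_pipeline(script_dir="."):
    """Run each script in order, stopping at the first failure."""
    for cmd in build_commands(script_dir):
        print("Running", cmd[1])
        # Assumption: the scripts need no command-line arguments.
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("path/to/scripts")` would then execute the eleven scripts one after another. Stopping at the first failure is a deliberate choice here, since each step appears to depend on the output of the previous one.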



