Assessing Word Similarity Metrics for Traceability Link Recovery - Evaluation Dataset
Source: Mendeley Data. Updated: 2024-05-10. Indexed: 2024-06-27.
Download link:
https://zenodo.org/records/6580280
Description:
This dataset includes all data used for the evaluation of my bachelor's thesis, "Assessing Word Similarity Metrics for Traceability Link Recovery".

The files correspond to the following data sets from the evaluation:
- cc-en-300.tar.gz: fastText's cc.en.300.bin embedding
- crawl-300d-2M-subword.tar.gz: fastText's crawl-300d-2M-subword.bin embedding
- wiki-news-300d-1M-subword.tar.gz: fastText's wiki-news-300d-1M-subword.bin embedding
- wordnet.tar.gz: the WordNet 3.1 semantic network
- sewordsim.tar.gz: SEWordSimDB's vector similarity database
- glove_cc_840B_300d.tar.gz: GloVe's CC vector embedding
- glove_wikigiga_300d.tar.gz: GloVe's 300-dimensional WIGI vector embedding
- glove_wikigiga_200d.tar.gz: GloVe's 200-dimensional WIGI vector embedding
- glove_wikigiga_100d.tar.gz: GloVe's 100-dimensional WIGI vector embedding
- glove_wikigiga_50d.tar.gz: GloVe's 50-dimensional WIGI vector embedding
- glove_twitter_200d.tar.gz: GloVe's 200-dimensional TWTR vector embedding
- glove_twitter_100d.tar.gz: GloVe's 100-dimensional TWTR vector embedding
- glove_twitter_50d.tar.gz: GloVe's 50-dimensional TWTR vector embedding
- glove_twitter_25d.tar.gz: GloVe's 25-dimensional TWTR vector embedding
- eval_results.tar.gz: the detailed evaluation results for each configuration of all measures

The license of each data set is included in its respective archive.

Some of these data sets are distributed as .sql files. To reproduce the evaluation, each .sql file must first be imported into a sqlite3 database, because the version of ArDoCo used for the evaluation can only read sqlite3 databases and cannot read .sql files directly.
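The import step for the .sql files can be sketched with Python's standard sqlite3 module. This is a minimal sketch, not the procedure used in the thesis: it assumes each .sql file is a plain-text, UTF-8 script of SQLite-compatible statements, and the helper names and the file names in the usage note are hypothetical.

```python
import sqlite3
from pathlib import Path


def import_sql_dump(sql_path, db_path):
    """Run every statement in an .sql file against a sqlite3 database file.

    Assumption: the file is a UTF-8 text script of SQLite-compatible SQL.
    The database file is created if it does not exist yet.
    """
    script = Path(sql_path).read_text(encoding="utf-8")
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(script)  # executes all statements in the script
        conn.commit()
    finally:
        conn.close()


def list_tables(db_path):
    """Return the names of all tables in the database, sorted alphabetically."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

For example, after extracting an archive, `import_sql_dump("some_dump.sql", "some_dump.sqlite")` would build the database file, and `list_tables("some_dump.sqlite")` gives a quick check that the import created the expected tables. Both file names are placeholders; the actual names inside the archives are not stated here. Very large dumps are read fully into memory by this sketch, so the sqlite3 command-line shell may be a better fit for them.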
Created:
2023-06-28



