
SciDocs-Aspire

DataCite Commons, updated 2022-03-26, indexed 2024-07-29
Download link:
https://figshare.com/articles/dataset/SciDocs-Aspire/19425533
Description:
This is a copy of the SciDocs dataset used in the paper "Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity" by Sheshera Mysore, Arman Cohan, Tom Hope. The dataset was originally released here: https://github.com/allenai/scidocs. The original dataset consists of 7 subtasks, 4 of which are used in our paper. The released data here corresponds to these tasks: Citation prediction, Co-citation prediction, Co-Read prediction, Co-View prediction.

See further details of the paper, how this dataset was compiled, and how it was used: https://github.com/allenai/aspire

The contents of the dataset are as follows:

`abstracts-{scidcite/scidcocite/scidcoread/scidcoview}.jsonl`: `jsonl` file containing the paper-id, abstracts, and titles for the queries and candidates which are part of the dataset.

`{scidcite/scidcocite/scidcoread/scidcoview}-queries-release.csv`: Metadata associated with every query.

`test-pid2anns-{scidcite/scidcocite/scidcoread/scidcoview}.json`: JSON file with the query paper-id, candidate paper-ids for every query paper in the dataset. Use these files in conjunction with `abstracts-{scidcite/scidcocite/scidcoread/scidcoview}.jsonl` to generate files for use in model evaluation.

`{scidcite/scidcocite/scidcoread/scidcoview}-evaluation_splits.json`: Paper-ids for the splits to use in reporting evaluation numbers.

`aspire/src/evaluation/ranking_eval.py`, included in the GitHub repo accompanying this dataset, implements the evaluation protocol and computes evaluation metrics. Please see the paper for descriptions of the experimental protocol we recommend to report evaluation metrics.
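
The description above says the candidate file should be used together with the abstracts file to generate evaluation inputs. A minimal Python sketch of that join follows. The listing does not document the field names, so `paper_id` and `cands` are assumptions exposed as parameters, and the helper names `load_abstracts` and `build_eval_examples` are hypothetical; check the actual files and the aspire repository before relying on them.

```python
import json


def load_abstracts(jsonl_path, id_key="paper_id"):
    """Map paper-id -> record from a JSON Lines file (one JSON object per line).

    `id_key` is an assumed field name; adjust it to match the real file.
    """
    papers = {}
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            papers[str(record[id_key])] = record
    return papers


def build_eval_examples(pid2anns_path, papers, cands_key="cands"):
    """Pair every query paper with the records of its candidate papers.

    Assumes the JSON maps query paper-id -> object holding a list of
    candidate paper-ids under `cands_key` (an assumed key name).
    Ids without a matching abstract record are skipped.
    """
    with open(pid2anns_path, encoding="utf-8") as f:
        pid2anns = json.load(f)

    examples = []
    for query_id, anns in pid2anns.items():
        if query_id not in papers:
            continue
        candidates = [
            papers[str(cand_id)]
            for cand_id in anns[cands_key]
            if str(cand_id) in papers
        ]
        examples.append({"query": papers[query_id], "candidates": candidates})
    return examples
```

For one subtask this might be called as `build_eval_examples("test-pid2anns-scidcite.json", load_abstracts("abstracts-scidcite.jsonl"))`. Scoring the resulting pairs is left to `aspire/src/evaluation/ranking_eval.py`, which the description names as the reference evaluation script.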
Provider:
figshare
Created:
2022-03-26