SciDocs-Aspire

Name: SciDocs-Aspire
Creator: figshare
Published: 2022-03-26 15:00:53
License: No description available

DataCite Commons: updated 2022-03-26, indexed 2024-07-29

Download link:

https://figshare.com/articles/dataset/SciDocs-Aspire/19425533

Resource description:

This is a copy of the SciDocs dataset used in the paper "Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity" by Sheshera Mysore, Arman Cohan, and Tom Hope. The dataset was originally released here: https://github.com/allenai/scidocs. The original dataset consists of 7 subtasks, 4 of which are used in our paper. The data released here corresponds to these tasks: Citation prediction, Co-citation prediction, Co-Read prediction, and Co-View prediction. For further details of the paper, how this dataset was compiled, and how it was used, see https://github.com/allenai/aspire.

The contents of the dataset are as follows:

<code>abstracts-{scidcite/scidcocite/scidcoread/scidcoview}.jsonl</code>: <code>jsonl</code> file containing the paper-id, abstracts, and titles for the queries and candidates which are part of the dataset.

<code>{scidcite/scidcocite/scidcoread/scidcoview}-queries-release.csv</code>: Metadata associated with every query.

<code>test-pid2anns-{scidcite/scidcocite/scidcoread/scidcoview}.json</code>: JSON file with the query paper-id and the candidate paper-ids for every query paper in the dataset. Use these files in conjunction with <code>abstracts-{scidcite/scidcocite/scidcoread/scidcoview}.jsonl</code> to generate files for use in model evaluation.

<code>{scidcite/scidcocite/scidcoread/scidcoview}-evaluation_splits.json</code>: Paper-ids for the splits to use in reporting evaluation numbers.

<code>aspire/src/evaluation/ranking_eval.py</code>, included in the GitHub repo accompanying this dataset, implements the evaluation protocol and computes evaluation metrics. Please see the paper for descriptions of the experimental protocol we recommend for reporting evaluation metrics.
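As a rough illustration of how the abstracts file and the query-to-candidates file might be combined, the sketch below joins the two in Python. It is not the code from the aspire repository. The field names <code>paper_id</code> and <code>cands</code> are assumptions, as is the layout of each annotation entry, so check them against the downloaded files before relying on this.

```python
import json


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def index_by_id(records, id_field="paper_id"):
    """Map each record's id to the full record.

    `id_field` is an assumed field name; pass the real one if it differs.
    """
    return {str(rec[id_field]): rec for rec in records}


def filter_to_split(pid2anns, split_ids):
    """Keep only the queries whose paper-id is listed in one evaluation split."""
    wanted = {str(pid) for pid in split_ids}
    return {qid: anns for qid, anns in pid2anns.items() if str(qid) in wanted}


def build_eval_rows(pid2anns, papers, cands_field="cands"):
    """Pair every query paper with the records of its candidate papers.

    Assumes each value in `pid2anns` is either a list of candidate ids or a
    dict holding that list under `cands_field` (hypothetical key name).
    Candidate ids missing from `papers` are skipped.
    """
    rows = []
    for query_id, anns in pid2anns.items():
        cand_ids = anns[cands_field] if isinstance(anns, dict) else anns
        rows.append({
            "query": papers.get(str(query_id)),
            "candidates": [papers[str(c)] for c in cand_ids if str(c) in papers],
        })
    return rows
```

In use, one would load <code>abstracts-scidcite.jsonl</code> with <code>read_jsonl</code> and <code>index_by_id</code>, load <code>test-pid2anns-scidcite.json</code> with <code>json.load</code>, optionally narrow the queries with <code>filter_to_split</code> using one list of paper-ids from <code>scidcite-evaluation_splits.json</code>, and then call <code>build_eval_rows</code>. Metric computation is left to <code>ranking_eval.py</code> from the repo.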

Provider:

figshare

Created:

2022-03-26
