jimmyzxj/massw
Source: Hugging Face. Updated: 2024-12-08. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/jimmyzxj/massw
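Resource description:
MASSW is a comprehensive text dataset of multi-aspect summaries of scientific workflows. It contains more than 152,000 peer-reviewed publications from 17 leading computer science conferences over the past 50 years. Its key features are structured scientific workflows, large scale, accuracy, and a rich set of benchmark tasks. The core aspects it covers are context, key idea, method, outcome, and projected impact. The conferences span multiple areas of computer science, and the dataset also provides citation information.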
MASSW is a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows. It includes more than 152,000 peer-reviewed publications from 17 leading computer science conferences spanning the past 50 years. The dataset defines five core aspects of a scientific workflow: context, key idea, method, outcome, and projected impact. These aspects align with the typical stages in scientific workflows identified in recent literature. MASSW systematically extracts and structures these five aspects from each publication using Large Language Models (LLMs). The coverage and accuracy of MASSW have been validated through comprehensive inspections and comparisons with human annotations and alternative methods. MASSW supports multiple novel and benchmarkable machine learning tasks, such as idea generation and outcome prediction, serving as a benchmark for evaluating LLM agents' ability to navigate scientific research.
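To make the five-aspect structure concrete, here is a minimal Python sketch of how one summarized publication might be represented and turned into a benchmark example. Everything in it is an assumption for illustration: the class name `WorkflowSummary`, the field names, the helper `make_task_pair`, and the idea that a task predicts one aspect from the aspects that precede it are hypothetical and do not reflect the dataset's actual schema or task definitions.

```python
from dataclasses import dataclass

# Order of the five aspects as listed in the description above.
ASPECT_ORDER = ["context", "key_idea", "method", "outcome", "projected_impact"]


@dataclass
class WorkflowSummary:
    """Hypothetical record for one publication (field names are assumptions)."""
    context: str
    key_idea: str
    method: str
    outcome: str
    projected_impact: str


def make_task_pair(summary: WorkflowSummary, target: str):
    """Build (inputs, target_text) for predicting `target` from earlier aspects.

    Assumed framing: idea generation would use target="key_idea",
    outcome prediction would use target="outcome".
    """
    if target not in ASPECT_ORDER:
        raise ValueError(f"unknown aspect: {target}")
    idx = ASPECT_ORDER.index(target)
    if idx == 0:
        raise ValueError("the first aspect has no earlier aspects to condition on")
    # Inputs are every aspect that comes before the target in the workflow.
    inputs = {name: getattr(summary, name) for name in ASPECT_ORDER[:idx]}
    return inputs, getattr(summary, target)
```

Under these assumptions, `make_task_pair(summary, "outcome")` would return the context, key idea, and method as inputs and the outcome text as the prediction target.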
Provided by:
jimmyzxj



