adambuttrick/arxiv-author-affiliations-latex-extract-inference
Source: Hugging Face | Updated: 2026-04-28 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/adambuttrick/arxiv-author-affiliations-latex-extract-inference
Description:
The dataset contains metadata and model predictions for academic papers. Each record includes an arXiv ID, DOI, filename, title (text and language), and authors with affiliations, both gold-standard and predicted. Prediction-related fields cover the raw prediction, the thinking prediction, parsing status, input/output characteristics, and processing metrics. The dataset also includes configuration related to model runs, such as model parameters, performance metrics, and hardware information. It is organized into four configurations, each with a train split, ranging in size from 5 to 987 examples.
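As a minimal sketch of how the four configurations could be inspected, the helper below counts the rows in each configuration's train split. It assumes the Hugging Face `datasets` package is installed and that the repository is reachable over the network. The listing does not name the configurations, so they are discovered at runtime rather than hard-coded. The function name `train_split_sizes` and its two injectable parameters are hypothetical, added only so the logic can be tried offline with stand-ins.

```python
from typing import Callable, Dict, Optional, Sequence

# Repository ID taken from the listing above.
REPO_ID = "adambuttrick/arxiv-author-affiliations-latex-extract-inference"


def train_split_sizes(
    repo_id: str = REPO_ID,
    list_configs: Optional[Callable[[str], Sequence[str]]] = None,
    load_split: Optional[Callable[[str, str], Sequence]] = None,
) -> Dict[str, int]:
    """Return {configuration name: number of rows in its train split}.

    `list_configs` and `load_split` are hypothetical injection points;
    when omitted, the real `datasets` functions are used.
    """
    if list_configs is None or load_split is None:
        # Imported lazily so the helper can be exercised offline with stubs.
        from datasets import get_dataset_config_names, load_dataset

        if list_configs is None:
            list_configs = get_dataset_config_names
        if load_split is None:
            load_split = lambda repo, name: load_dataset(repo, name, split="train")

    # Configuration names are not given in the listing, so discover them.
    return {name: len(load_split(repo_id, name)) for name in list_configs(repo_id)}
```

Calling `train_split_sizes()` with no arguments downloads each configuration. If the description is accurate, the result should have four entries, with the smallest count being 5 and the largest 987.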
Provider:
adambuttrick



