STEM-NER-60k
Source: DataCite Commons. Updated: 2022-05-24. Indexed: 2024-07-13.
Download link:
https://data.uni-hannover.de/dataset/f1c21fba-f548-4223-816b-e5c3e70dc75e
Resource description:
## A Large-scale Dataset of STEM Science as PROCESS, METHOD, MATERIAL, and DATA Named Entities

### This repository hosts data from a follow-up study to the following publications

D'Souza, J., Hoppe, A., Brack, A., Jaradeh, M., Auer, S., & Ewerth, R. (2020). [The STEM-ECR Dataset: Grounding Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources.](https://aclanthology.org/2020.lrec-1.268/) In Proceedings of The 12th Language Resources and Evaluation Conference (pp. 2192–2203). European Language Resources Association.

Brack, A., D'Souza, J., Hoppe, A., Auer, S., & Ewerth, R. (2020). [Domain-Independent Extraction of Scientific Concepts from Research Articles](https://doi.org/10.1007/978-3-030-45439-5_17). In: Advances in Information Retrieval. ECIR 2020. Lecture Notes in Computer Science, vol 12035. Springer, Cham.

Supporting dataset link: [https://data.uni-hannover.de/dataset/stem-ecr-v1-0](https://data.uni-hannover.de/dataset/stem-ecr-v1-0)

### Description

Roughly 60,000 titles and abstracts of scholarly articles published under the redistributable CC-BY license were downloaded from Elsevier. The articles spanned 10 STEM domains which were the most prolific on Elsevier, viz., *Agriculture*, *Astronomy*, *Biology*, *Chemistry*, *Computer Science*, *Earth Science*, *Engineering*, *Material Science*, and *Mathematics*. The STEM NER system reported in the publication above was applied to these articles, producing an automatically extracted dataset of 4 typed entities, viz., *Process*, *Method*, *Material*, and *Data*.

### What does this repository contain?

Aggregated lists of *Process*, *Method*, *Material*, and *Data* entities, with their respective occurrence counts, extracted from 59,984 scholarly publications and organized per the 10 STEM domains considered. Additionally, the list of Elsevier CC-BY articles used in this study is provided in the `raw-data` directory of the repository.
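The aggregated per-domain entity lists described above lend themselves to simple frequency analysis. The following is a minimal sketch only: the description does not specify the file format, so the tab-separated `entity<TAB>count` line layout, the function names `parse_counts` and `merge_domains`, and the sample values are all assumptions made for illustration, not the repository's documented layout.

```python
from collections import Counter
from typing import Dict, Iterable


def parse_counts(lines: Iterable[str], sep: str = "\t") -> Counter:
    """Parse 'entity<sep>count' lines into a Counter.

    The line layout is an assumption; adjust `sep` to the actual files.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last separator so entities may contain spaces.
        entity, _, count = line.rpartition(sep)
        if not entity or not count.isdigit():
            continue  # skip header rows and malformed lines
        counts[entity] += int(count)
    return counts


def merge_domains(per_domain: Dict[str, Counter]) -> Counter:
    """Sum entity counts across all domains into one Counter."""
    total: Counter = Counter()
    for counts in per_domain.values():
        total.update(counts)
    return total


if __name__ == "__main__":
    # Made-up sample rows, for illustration only (not taken from the dataset).
    sample = {
        "Computer Science": parse_counts(["neural network\t12", "simulation\t3"]),
        "Engineering": parse_counts(["simulation\t5"]),
    }
    print(merge_domains(sample).most_common(2))
```

In practice, each `Counter` would be built from an opened file, for example `parse_counts(open(path, encoding="utf-8"))`, where `path` points to one of the per-domain entity lists.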
### Useful Links

* https://github.com/elsevierlabs/OA-STM-Corpus/
* https://orkg.org/orkg
Provided by:
LUIS
Created:
2022-05-24



