
Synthetic datasets for end-to-end Relation Extraction of relationships between Organisms and Natural-Products

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/8422293
Description:
Synthetic datasets (training/validation) for end-to-end Relation Extraction of relationships between Organisms and Natural-Products. The datasets are provided for reproducibility purposes but can also be used to train new models. As in the corresponding article, 3 subtypes of synthetic datasets are provided:
Diversity-synt: The seed literature references used in the generation process are the top 500 items extracted per biological kingdom with the GME-sampler.
Random-synt: 5 datasets, each of a size equivalent to Diversity-synt, built from randomly sampled seed literature references.
Extended-synt: A merge of Diversity-synt and the 5 Random-synt datasets.
All datasets were produced with Vicuna-13b-v1.3. Like the model, the synthetic data are subject to the license of the model used for generation; see the original LLaMA model card. LLaMA is licensed under the LLaMA License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.
Created:
2023-11-14