INDRA assembly Benchmark Corpus

Indexed from NIAID Data Ecosystem on 2026-03-14

Download link:

https://zenodo.org/record/7275057

Resource description:

This data set accompanies the manuscript "Automated assembly of molecular mechanisms at scale from text mining and curated databases", which describes the assembly methodology implemented in the INDRA system (https://github.com/sorgerlab/indra). The manuscript runs an example assembly pipeline on ~570k publications to create the INDRA Benchmark Corpus.

This dataset provides the INDRA Statements that constitute the INDRA Benchmark Corpus, as well as a set of curations on the corpus:

- indra_benchmark_corpus.pkl: A Python pickle file of INDRA Statement objects. INDRA must be installed in the Python environment to load it.
- indra_benchmark_corpus.json.gz: A gzipped JSON export of INDRA Statements.
- indra_assembly_curations.json: The Curated Corpus of curated mentions for Statements in the Benchmark Corpus, as a JSON file. The file contains a list in which each element corresponds to one curation. Each curation entry contains a `pa_hash` and a `source_hash` attribute. These can be used to find the Statement and the mention, respectively, to which the curation applies in the Benchmark Corpus.
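The curation file's structure described above (a JSON list whose entries each carry a `pa_hash` and a `source_hash`) can be explored with a short script. The following is a minimal sketch using only the Python standard library. The helper names `load_curations` and `index_curations` are hypothetical and not part of INDRA, and the sketch assumes nothing about curation fields other than the two named in the description.

```python
import json
from collections import defaultdict


def load_curations(path):
    """Read the curation list from a JSON file such as
    indra_assembly_curations.json (hypothetical helper, not an INDRA API)."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def index_curations(curations):
    """Group curation entries by their (pa_hash, source_hash) pair.

    pa_hash identifies the Statement and source_hash identifies the
    mention, so each key points at one mention of one Statement.
    """
    index = defaultdict(list)
    for curation in curations:
        key = (curation["pa_hash"], curation["source_hash"])
        index[key].append(curation)
    return dict(index)


if __name__ == "__main__":
    # Assumes the file has been downloaded to the working directory.
    curations = load_curations("indra_assembly_curations.json")
    by_mention = index_curations(curations)
    print(len(curations), "curations covering", len(by_mention), "mentions")
```

Grouping by the pair rather than by `pa_hash` alone keeps multiple curations of the same mention together. Matching these keys against the Statement objects in the pickle or JSON export depends on how INDRA exposes the corresponding hashes, which this sketch does not cover.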

Created:

2023-01-23
