Variantscape datasets
Source: Zenodo. Updated: 2025-04-23. Indexed: 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.15268055
Description:
Variantscape dataset
LLM-based extraction of genetic variants and biomedical entities from the titles and abstracts of biomedical publications. These datasets support the analysis of literature-derived co-associations between genetic variants, cancer types, and treatments, enabling downstream network analysis, hypothesis generation, and discovery in precision oncology.
1. Dataset: Cleaned literature dataset for biomedical entity extraction (2014–2024)
"cleaned_OpenAlex.csv"
A pre-processed, cleaned, and structured dataset of cancer-related biomedical publications (2014–2024) retrieved from OpenAlex, containing titles, abstracts, and metadata curated for downstream NLP and LLM-based biomedical entity extraction.
2. Dataset: Binary entity matrix for co-association and network analysis
"dataset_for_analysis.csv"
Final binary matrix dataset derived from NLP- and LLM-based entity extraction on cancer-related literature. Entities include genetic variants, cancer types, and treatments, enabling co-occurrence and network analysis, and the investigation of literature-derived co-associations.
3. Dataset: LLM-based classification of variant-treatment co-associations
"variant_treatment_relationship_consensus.csv"
Dataset capturing LLM-based classification and consensus on co-associations between genetic variants and treatments.
4. Dataset: Metadata mapping for entity extraction and analysis
"metadata_mapping_transposed.csv"
Transposed, row-indexed metadata mapping file used for identification of each column as a variant, cancer type, treatment, study design element, or publication-derived metadata.
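As a rough illustration of how the binary matrix (dataset 2) and the metadata mapping (dataset 4) might be combined, the sketch below counts variant-treatment co-occurrences across publications. It does not read the real files: the inline CSV, the column names, the category labels, and the `cooccurrence` helper are all hypothetical stand-ins, since the actual column layout is not described here. It assumes one row per publication and one 0/1 column per entity.

```python
import csv
import io
from collections import Counter
from itertools import product

# Toy stand-in for dataset_for_analysis.csv: one row per publication,
# one 0/1 column per entity. All column names are hypothetical.
MATRIX_CSV = """paper_id,BRAF_V600E,EGFR_T790M,melanoma,vemurafenib,osimertinib
p1,1,0,1,1,0
p2,1,0,1,0,0
p3,0,1,0,0,1
p4,1,0,0,1,0
"""

# Toy stand-in for the metadata mapping: column name -> entity category.
# The real file's layout is an assumption; adapt this once it is inspected.
COLUMN_TYPE = {
    "BRAF_V600E": "variant",
    "EGFR_T790M": "variant",
    "melanoma": "cancer_type",
    "vemurafenib": "treatment",
    "osimertinib": "treatment",
}


def cooccurrence(rows, column_type, left="variant", right="treatment"):
    """Count publications in which a `left` entity and a `right` entity both appear."""
    left_cols = [c for c, t in column_type.items() if t == left]
    right_cols = [c for c, t in column_type.items() if t == right]
    counts = Counter()
    for row in rows:
        for a, b in product(left_cols, right_cols):
            # csv.DictReader yields strings, so compare against "1".
            if row.get(a) == "1" and row.get(b) == "1":
                counts[(a, b)] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(MATRIX_CSV)))
pairs = cooccurrence(rows, COLUMN_TYPE)
for (variant, treatment), n in pairs.most_common():
    print(variant, treatment, n)
```

The resulting pair counts can serve as edge weights in a variant-treatment network. Passing `left="variant", right="cancer_type"` would give the variant-cancer pairs under the same assumptions.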
Provider:
Zenodo
Created:
2025-04-23



