
Variantscape datasets

Zenodo | Updated 2025-04-23 | Indexed 2026-05-26
Download link:
https://zenodo.org/doi/10.5281/zenodo.15268055
Description:
Variantscape dataset: LLM-based extraction of genetic variants and biomedical entities from titles and abstracts of biomedical publications. These datasets support the analysis of literature-derived co-associations between genetic variants, cancer types, and treatments, enabling downstream network analysis, hypothesis generation, and discovery in precision oncology.

1. Dataset: Cleaned literature dataset for biomedical entity extraction (2014–2024)
File: "cleaned_OpenAlex.csv"
A pre-processed, cleaned, and structured dataset of cancer-related biomedical publications (2014–2024) retrieved from OpenAlex, containing titles, abstracts, and metadata curated for downstream NLP and LLM-based biomedical entity extraction.

2. Dataset: Binary entity matrix for co-association and network analysis
File: "dataset_for_analysis.csv"
Final binary matrix dataset derived from NLP- and LLM-based entity extraction on cancer-related literature. Entities include genetic variants, cancer types, and treatments, enabling co-occurrence and network analysis, and the investigation of literature-derived co-associations.

3. Dataset: LLM-based classification of variant-treatment co-associations
File: "variant_treatment_relationship_consensus.csv"
Dataset capturing LLM-based classification and consensus on co-associations between genetic variants and treatments.

4. Dataset: Metadata mapping for entity extraction and analysis
File: "metadata_mapping_transposed.csv"
Transposed, row-indexed metadata mapping file used to identify each column as a variant, cancer type, treatment, study design element, or publication-derived metadata.
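As an illustration of the co-occurrence analysis that a binary entity matrix supports, the following is a minimal sketch using only the Python standard library. It runs on a small inline toy table, not on the real files. The column names (`pub_id`, `BRAF_V600E`, `melanoma`, `vemurafenib`), the row layout (one publication per row, one 0/1 column per entity), and the helper `cooccurrence_counts` are all assumptions made for this example. The actual schema of "dataset_for_analysis.csv" should be checked against the Zenodo record and "metadata_mapping_transposed.csv" before adapting it.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Toy stand-in for a binary entity matrix: one row per publication,
# one 0/1 column per entity. All column names are hypothetical and
# are NOT taken from the real dataset.
TOY_CSV = """pub_id,BRAF_V600E,melanoma,vemurafenib
p1,1,1,1
p2,1,1,0
p3,0,1,0
p4,1,0,1
"""


def cooccurrence_counts(csv_text, id_column="pub_id"):
    """Count, for each pair of entity columns, the rows where both are 1."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        # Entities flagged as present in this publication, in a stable order
        present = sorted(
            col for col, val in row.items()
            if col != id_column and val.strip() == "1"
        )
        # Every unordered pair of present entities co-occurs once in this row
        counts.update(combinations(present, 2))
    return counts


counts = cooccurrence_counts(TOY_CSV)
for (a, b), n in sorted(counts.items()):
    print(a, b, n)
```

The resulting pair counts can serve as edge weights in a co-association network, with entities as nodes. For a matrix with many columns, a sparse matrix product would likely scale better than enumerating pairs row by row.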
Provider:
Zenodo
Created:
2025-04-23