ChronoMedKG: A Temporally-Grounded, Evidence-Graded Biomedical Knowledge Graph and Benchmark for Temporal Clinical Reasoning

Source: DataCite Commons. Updated 2026-05-03; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19697542
Description:
ChronoMedKG is a temporally-grounded, evidence-graded biomedical knowledge graph built by running a four-agent disease-autonomous pipeline across 13,431 of PrimeKG's 17,080 diseases (78.6%). The pipeline yields 460,497 validated consensus triples out of 13 million extracted triples; 10,852 diseases produce surviving triples after multi-LLM consensus and Quality Controller filtering. Every edge carries temporal metadata (per-phenotype onset windows, progression stages, clinical milestones), PMID-traceable evidence text, and a six-signal credibility score.

Unlike static biomedical KGs (PrimeKG, iKraph, Hetionet) that treat associations as timeless, ChronoMedKG records WHEN in a disease course each fact applies. The resource adds onset data for 6,250 diseases not present in any reference resource (HPOA, Orphadata, Phenopackets), 1,657 of them Orphanet-coded rare diseases gaining first-time structured onset representation. Validation against Orphadata reaches 92.7%; a three-LLM judge-panel audit on 100 novel-coverage diseases reaches 87.9%.

Construction uses a disease-autonomous four-agent pipeline (Disease Profiler, Evidence Harvester, Knowledge Extractor, Quality Controller) that runs end-to-end from a disease identifier. Multiple frontier LLMs extract triples in parallel; only relations supported by multi-model consensus survive credibility filtering and PrimeKG schema alignment. Total construction cost across 13,431 diseases: ~$2,400 in LLM API spend.

ChronoMedKG ships paired with ChronoTQA, the first temporal biomedical QA benchmark: 3,341 questions across eight reported task types plus a 12-question supplementary HPOA negative-temporal MCQ probe. Frontier LLMs trail their static-question accuracy by ~30 points on temporal items, and selective retrieval against ChronoMedKG rescues 47-65% of failed long-tail queries (vs 17-29% for HPOA-RAG).
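The multi-model consensus step mentioned above can be illustrated with a small sketch. The record does not state the actual voting rule, so everything here is an assumption made for illustration: the function name `consensus_triples`, the `min_votes` threshold, the representation of a triple as a `(head, relation, tail)` tuple, and the placeholder model and entity names. It is not the pipeline's real implementation.

```python
from collections import Counter


def consensus_triples(extractions, min_votes=2):
    """Keep triples proposed by at least `min_votes` distinct models.

    `extractions` maps a (hypothetical) model name to the list of
    (head, relation, tail) tuples that model extracted.
    """
    votes = Counter()
    for triples in extractions.values():
        # Deduplicate per model so one model cannot vote twice for a triple.
        for triple in set(triples):
            votes[triple] += 1
    return {triple for triple, count in votes.items() if count >= min_votes}


# Toy input with placeholder names; not data from the deposit.
extractions = {
    "model_a": [
        ("disease_x", "has_phenotype", "phenotype_1"),
        ("disease_x", "has_phenotype", "phenotype_2"),
    ],
    "model_b": [("disease_x", "has_phenotype", "phenotype_1")],
    "model_c": [
        ("disease_x", "has_phenotype", "phenotype_1"),
        ("disease_x", "has_phenotype", "phenotype_3"),
    ],
}
kept = consensus_triples(extractions)
```

In this toy input only the triple extracted by all three models clears the threshold; the two singletons are dropped. Per the description, the real pipeline additionally applies credibility filtering and PrimeKG schema alignment, which this sketch does not attempt to model.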
This deposit (v0.0.1) contains:
- validated_triples.jsonl (Gold, 527 MB, 460,497 rows): main product, post-QC
- consensus_triples.jsonl.gz (Silver, 30 MB): pre-QC consensus rows
- raw_triples.jsonl.gz (Bronze, 644 MB): full extraction log, 13M rows
- tqa_benchmark.json (3.2 MB): ChronoTQA, 3,341 questions
- pmc_clinical_cases.json (63 KB): 31 diagnostic-odyssey case reports
- novelty_multi_judge_v2.json (168 KB): three-LLM audit verdicts
- croissant.json: Croissant 1.0 ML metadata
- README.md, LICENSE-DATA, NOTICE
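Because the triple files are several hundred megabytes, it may be more practical to stream them than to load them whole. The sketch below is one possible way to do that, assuming each line of a `.jsonl` or `.jsonl.gz` file is a single JSON object. The record does not list field names, so the code only tallies whatever keys it finds; `iter_jsonl` and `summarize_keys` are hypothetical helper names, not part of the deposit.

```python
import gzip
import json
from collections import Counter
from itertools import islice


def iter_jsonl(path):
    """Yield one parsed JSON value per non-empty line, streaming from disk."""
    # Transparently handle the gzip-compressed Silver and Bronze files.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize_keys(path, limit=1000):
    """Count how often each top-level key appears in the first `limit` rows.

    Assumes every row is a JSON object (a dict after parsing).
    """
    key_counts = Counter()
    for row in islice(iter_jsonl(path), limit):
        key_counts.update(row.keys())
    return key_counts


# Example use once the deposit has been downloaded locally:
# print(summarize_keys("validated_triples.jsonl"))
```

Inspecting the keys first avoids guessing at the schema; the README.md and croissant.json shipped with the deposit are the place to confirm what each field means.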
Provider:
Zenodo
Created:
2026-04-22