ekacare/BODHI-M

Source: Hugging Face. Updated 2026-04-13; indexed 2026-05-10.
Download link:
https://hf-mirror.com/datasets/ekacare/BODHI-M
Description:
---
license: cc-by-nc-4.0
language:
- en
tags:
- medical
- knowledge-graph
- clinical
- healthcare
- india
- snomed
- loinc
- graph
- drug
- lab-investigation
- medication
pretty_name: BODHI-M — Clinical Concept-Drug-Lab Investigation Knowledge Graph
size_categories:
- 1K<n<10K
---

# BODHI-M: Concept-Drug-Lab Investigation Knowledge Graph

Part of **BODHI (Bharat Ontology for Disease & Healthcare Informatics)**: open clinical knowledge graphs for grounding healthcare AI in verified medical facts.

→ [Full writeup: motivation, design & use cases](https://info.eka.care/services/bodhi-bharat-ontology-for-disease-healthcare-informatics)
→ [GitHub (all formats: Neo4j, CSV, PyG, RDF)](https://github.com/eka-care/BODHI)

---

## What is BODHI-M?

BODHI-M maps SNOMED-coded clinical concepts (disorders, findings, procedures, lifestyle factors) to their generic drug treatments and LOINC-coded lab investigations, organised in a three-level hierarchy: **System → Group → Granular**. Built and validated by expert clinicians at [Eka Care](https://www.eka.care), it has powered production patient health profiling and longitudinal health views across millions of records in India.

The three-tier concept hierarchy is a core design choice: when a drug or lab result cannot confidently pinpoint a specific granular disease, the graph supports reliable "soft inference" at the broader System or Group level. This also enables **reverse inference**: deducing likely conditions from a patient's medication list alone.
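
The roll-up behind "soft inference" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the node ids, the drug names, the toy edges, and the rule itself (return the most specific concept shared by every concept a drug treats) are hypothetical and are not taken from BODHI-M or from Eka Care's production logic.

```python
# Hypothetical toy graph; ids and edges are illustrative, not BODHI-M data.
CHILD_OF = {
    "granular:type_2_diabetes": "group:diabetes",
    "granular:type_1_diabetes": "group:diabetes",
    "group:diabetes": "system:endocrine",
    "granular:hypothyroidism": "group:thyroid_disorders",
    "group:thyroid_disorders": "system:endocrine",
}
TREATED_BY = {
    "granular:type_2_diabetes": {"drug:metformin", "drug:insulin"},
    "granular:type_1_diabetes": {"drug:insulin"},
    "granular:hypothyroidism": {"drug:levothyroxine"},
}


def ancestors(concept):
    """Return [concept, parent, grandparent, ...] by following CHILD_OF."""
    chain = [concept]
    while chain[-1] in CHILD_OF:
        chain.append(CHILD_OF[chain[-1]])
    return chain


def soft_infer(drug):
    """Return the most specific concept shared by all concepts the drug treats.

    A drug tied to one granular concept resolves to that concept; a drug
    shared by several siblings rolls up to their common Group or System.
    """
    candidates = [c for c, drugs in TREATED_BY.items() if drug in drugs]
    if not candidates:
        return None
    chains = [ancestors(c) for c in candidates]
    # Walk upward from the first candidate until every chain contains the node.
    for node in chains[0]:
        if all(node in chain for chain in chains):
            return node
    return None


print(soft_infer("drug:metformin"))  # granular:type_2_diabetes
print(soft_infer("drug:insulin"))    # group:diabetes
```

In this toy graph a drug that treats a single granular concept resolves to it directly, while a drug shared by two sibling concepts is reported only at the Group level, which is the kind of fallback the three-tier hierarchy is described as supporting.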

## Stats

| Metric | Count |
|---|---|
| Concept nodes | 2,471 |
| Drug nodes | 1,186 |
| LabInvestigation nodes | 812 |
| **Total relationships** | **3,566** |
| Concept → Concept (CHILD_OF) | 1,768 |
| Concept → Drug (TREATED_BY) | 908 |
| LabInvestigation → Concept (IMPACTS) | 808 |
| Concept → LabInvestigation (MONITORED_BY) | 82 |

**Concept hierarchy:** System `14` → Group `250` → Granular `1,942`

**LabInvestigation LOINC coverage:** 812 LOINC-mapped tests across Immunological, Renal, Hematological, Endocrine, and Gastrointestinal domains.

## Files

| File | Description |
|---|---|
| `triples.jsonl` | `(head, relation, tail, properties)` structured triples |
| `nl_facts.jsonl` | Natural-language fact strings, suitable for LLM fine-tuning / RAG |

For Neo4j dump, CSV, PyTorch Geometric, and RDF/Turtle formats, see the [GitHub repository](https://github.com/eka-care/BODHI).

## Schema (triples)

Each line in `triples.jsonl`:

```json
{
  "head": "<node_id>",
  "head_type": "Concept | Drug | LabInvestigation",
  "relation": "CHILD_OF | TREATED_BY | IMPACTS | MONITORED_BY",
  "tail": "<node_id>",
  "tail_type": "Concept | Drug | LabInvestigation",
  "properties": { ... }
}
```

## Standards

- **SNOMED CT**: all concept nodes carry SNOMED IDs
- **LOINC**: all lab investigation nodes carry LOINC IDs

## Use Cases

- **Reverse inference**: deduce likely conditions from a patient's medication history
- **Patient health profiling**: build richer longitudinal views from fragmented health data
- **GraphRAG**: structured grounding for LLMs on treatment and investigation reasoning
- **GNN training**: heterogeneous graph with multi-class nodes and typed edges

## License

[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/): free for non-commercial use with attribution to [Eka Care](https://www.eka.care).
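
## Example: reading triples for reverse inference (sketch)

As a rough illustration of how the `triples.jsonl` schema and the reverse-inference use case fit together, the sketch below parses lines in the documented shape and ranks concepts by how many of a patient's drugs point to them. The sample lines, the node ids (`C1`, `D1`, `L1`), the empty `properties` objects, and the counting heuristic are all assumptions made for illustration; the actual id format and property keys should be checked against the dataset itself.

```python
import json
from collections import defaultdict

# Hypothetical sample lines in the documented shape; ids are illustrative only.
SAMPLE = """\
{"head": "C1", "head_type": "Concept", "relation": "TREATED_BY", "tail": "D1", "tail_type": "Drug", "properties": {}}
{"head": "C2", "head_type": "Concept", "relation": "TREATED_BY", "tail": "D1", "tail_type": "Drug", "properties": {}}
{"head": "C1", "head_type": "Concept", "relation": "TREATED_BY", "tail": "D2", "tail_type": "Drug", "properties": {}}
{"head": "L1", "head_type": "LabInvestigation", "relation": "IMPACTS", "tail": "C1", "tail_type": "Concept", "properties": {}}
"""


def parse_triples(lines):
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def drug_to_concepts(triples):
    """Invert Concept -TREATED_BY-> Drug edges into a drug -> concepts index."""
    index = defaultdict(set)
    for t in triples:
        if t["relation"] == "TREATED_BY":
            index[t["tail"]].add(t["head"])
    return index


def rank_conditions(medications, index):
    """Count how many of the given drugs point at each concept, highest first."""
    scores = defaultdict(int)
    for drug in medications:
        for concept in index.get(drug, ()):
            scores[concept] += 1
    # Sort by descending score, then by id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = drug_to_concepts(parse_triples(SAMPLE.splitlines()))
ranking = rank_conditions(["D1", "D2"], index)
print(ranking)  # [('C1', 2), ('C2', 1)]
```

To run the same functions on the real file, an open file handle can be passed in place of the sample, for example `parse_triples(open("triples.jsonl", encoding="utf-8"))`. A simple drug count is only one possible scoring rule; it ignores the CHILD_OF hierarchy and any information carried in `properties`.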
Provider:
ekacare