ekacare/BODHI-M
Hugging Face, updated 2026-04-13, indexed 2026-05-10
Download link:
https://hf-mirror.com/datasets/ekacare/BODHI-M
Resource description:
---
license: cc-by-nc-4.0
language:
- en
tags:
- medical
- knowledge-graph
- clinical
- healthcare
- india
- snomed
- loinc
- graph
- drug
- lab-investigation
- medication
pretty_name: BODHI-M — Clinical Concept-Drug-Lab Investigation Knowledge Graph
size_categories:
- 1K<n<10K
---
# BODHI-M — Concept-Drug-Lab Investigation Knowledge Graph
Part of **BODHI (Bharat Ontology for Disease & Healthcare Informatics)** — open clinical knowledge graphs for grounding healthcare AI in verified medical facts.
→ [Full writeup: motivation, design & use cases](https://info.eka.care/services/bodhi-bharat-ontology-for-disease-healthcare-informatics)
→ [GitHub (all formats: Neo4j, CSV, PyG, RDF)](https://github.com/eka-care/BODHI)
---
## What is BODHI-M?
BODHI-M maps SNOMED-coded clinical concepts (disorders, findings, procedures, lifestyle factors) to their generic drug treatments and LOINC-coded lab investigations, organised in a three-level hierarchy: **System → Group → Granular**.
Built and validated by expert clinicians at [Eka Care](https://www.eka.care), it has powered production patient health profiling and longitudinal health views across millions of records in India.
The three-tier concept hierarchy is a core design choice: when a drug or lab result cannot confidently pinpoint a specific granular disease, the graph supports reliable "soft inference" at the broader System or Group level. This also enables **reverse inference** — deducing likely conditions from a patient's medication list alone.
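The roll-up described above can be sketched as a lowest-common-ancestor lookup. This is a minimal illustration, not Eka Care's production logic. It assumes that `CHILD_OF` points from child to parent, that each concept has a single parent, and that `TREATED_BY` points from concept to drug. Every node ID in the toy data (`c_t2dm`, `g_diabetes`, `d_insulin`, and so on) is hypothetical.

```python
# Toy triples in the triples.jsonl shape. All node IDs here are hypothetical.
TRIPLES = [
    {"head": "c_t1dm", "relation": "CHILD_OF", "tail": "g_diabetes"},
    {"head": "c_t2dm", "relation": "CHILD_OF", "tail": "g_diabetes"},
    {"head": "g_diabetes", "relation": "CHILD_OF", "tail": "s_endocrine"},
    {"head": "c_t1dm", "relation": "TREATED_BY", "tail": "d_insulin"},
    {"head": "c_t2dm", "relation": "TREATED_BY", "tail": "d_insulin"},
    {"head": "c_t2dm", "relation": "TREATED_BY", "tail": "d_metformin"},
]


def build_indexes(triples):
    """Return (child -> parent, drug -> set of treated concepts)."""
    parent, treated = {}, {}
    for t in triples:
        if t["relation"] == "CHILD_OF":
            parent[t["head"]] = t["tail"]  # assumption: one parent per concept
        elif t["relation"] == "TREATED_BY":
            treated.setdefault(t["tail"], set()).add(t["head"])
    return parent, treated


def ancestors(node, parent):
    """Chain from node up to its root, most specific first."""
    chain = [node]
    # The second condition guards against accidental cycles in the data.
    while chain[-1] in parent and parent[chain[-1]] not in chain:
        chain.append(parent[chain[-1]])
    return chain


def infer_concept(drug, parent, treated):
    """Most specific concept shared by every condition the drug treats."""
    chains = [ancestors(c, parent) for c in treated.get(drug, ())]
    if not chains:
        return None
    shared = set(chains[0]).intersection(*(set(c) for c in chains[1:]))
    return next((node for node in chains[0] if node in shared), None)


parent, treated = build_indexes(TRIPLES)
print(infer_concept("d_metformin", parent, treated))  # one candidate: stays granular
print(infer_concept("d_insulin", parent, treated))    # two candidates: rolls up a level
```

A drug linked to a single granular concept resolves to that concept. A drug linked to several siblings resolves to their shared ancestor, which is the "soft inference" case.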
## Stats
| Metric | Count |
|---|---|
| Concept nodes | 2,471 |
| Drug nodes | 1,186 |
| LabInvestigation nodes | 812 |
| **Total relationships** | **3,566** |
| Concept → Concept (CHILD_OF) | 1,768 |
| Concept → Drug (TREATED_BY) | 908 |
| LabInvestigation → Concept (IMPACTS) | 808 |
| Concept → LabInvestigation (MONITORED_BY) | 82 |
**Concept hierarchy:** System `14` → Group `250` → Granular `1,942`
**LabInvestigation LOINC coverage:** 812 LOINC-mapped tests across Immunological, Renal, Hematological, Endocrine, and Gastrointestinal domains.
## Files
| File | Description |
|---|---|
| `triples.jsonl` | `(head, relation, tail, properties)` structured triples |
| `nl_facts.jsonl` | Natural-language fact strings, suitable for LLM fine-tuning / RAG |
For Neo4j dump, CSV, PyTorch Geometric, and RDF/Turtle formats, see the [GitHub repository](https://github.com/eka-care/BODHI).
## Schema (triples)
Each line in `triples.jsonl`:
```json
{
  "head": "<node_id>",
  "head_type": "Concept | Drug | LabInvestigation",
  "relation": "CHILD_OF | TREATED_BY | IMPACTS | MONITORED_BY",
  "tail": "<node_id>",
  "tail_type": "Concept | Drug | LabInvestigation",
  "properties": { ... }
}
```
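As a rough sketch of how the schema above might be consumed, the reader below parses JSONL lines, checks each record against the listed keys and allowed values, and tallies relations. The function name `read_triples` and the sample records are hypothetical. The contents of `properties` are not specified here, so the sketch only checks that the key is present.

```python
import json
from collections import Counter

NODE_TYPES = {"Concept", "Drug", "LabInvestigation"}
RELATIONS = {"CHILD_OF", "TREATED_BY", "IMPACTS", "MONITORED_BY"}
REQUIRED_KEYS = ("head", "head_type", "relation", "tail", "tail_type", "properties")


def read_triples(lines):
    """Yield validated triple dicts from an iterable of JSONL lines."""
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing keys {missing}")
        if not {record["head_type"], record["tail_type"]} <= NODE_TYPES:
            raise ValueError(f"line {line_no}: unknown node type")
        if record["relation"] not in RELATIONS:
            raise ValueError(f"line {line_no}: unknown relation {record['relation']!r}")
        yield record


# Hypothetical sample lines; the IDs and empty properties are placeholders.
SAMPLE_LINES = [
    json.dumps({"head": "c1", "head_type": "Concept", "relation": "CHILD_OF",
                "tail": "g1", "tail_type": "Concept", "properties": {}}),
    json.dumps({"head": "c1", "head_type": "Concept", "relation": "TREATED_BY",
                "tail": "d1", "tail_type": "Drug", "properties": {}}),
    json.dumps({"head": "l1", "head_type": "LabInvestigation", "relation": "IMPACTS",
                "tail": "c1", "tail_type": "Concept", "properties": {}}),
]

relation_counts = Counter(t["relation"] for t in read_triples(SAMPLE_LINES))
print(relation_counts)
```

Against the real file, the same generator can be fed an open file handle, for example `with open("triples.jsonl", encoding="utf-8") as f: ...`, and the resulting tally compared with the Stats table.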
## Standards
- **SNOMED CT** — all concept nodes carry SNOMED IDs
- **LOINC** — all lab investigation nodes carry LOINC IDs
## Use Cases
- **Reverse inference** — deduce likely conditions from a patient's medication history
- **Patient health profiling** — build richer longitudinal views from fragmented health data
- **GraphRAG** — structured grounding for LLMs on treatment and investigation reasoning
- **GNN training** — heterogeneous graph with multi-class nodes and typed edges
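For the GNN use case, one plausible preprocessing step is to group triples by `(head_type, relation, tail_type)` and map node IDs to per-type integer indices, the layout that heterogeneous graph libraries typically expect. The sketch below uses only the standard library. The helper name `to_hetero_edges` and the toy IDs are hypothetical, and the ready-made PyTorch Geometric export in the GitHub repository is likely the better starting point for real training.

```python
from collections import defaultdict


def to_hetero_edges(triples):
    """Group triples by edge type and map node IDs to per-type integer indices."""
    node_index = defaultdict(dict)         # node_type -> {node_id: index}
    edges = defaultdict(lambda: ([], []))  # edge type -> (source idxs, target idxs)

    def index_of(node_type, node_id):
        table = node_index[node_type]
        # Assign the next free index the first time a node ID is seen.
        return table.setdefault(node_id, len(table))

    for t in triples:
        key = (t["head_type"], t["relation"], t["tail_type"])
        sources, targets = edges[key]
        sources.append(index_of(t["head_type"], t["head"]))
        targets.append(index_of(t["tail_type"], t["tail"]))
    return dict(node_index), dict(edges)


# Hypothetical toy triples covering the four relation types.
TOY_TRIPLES = [
    {"head": "c1", "head_type": "Concept", "relation": "CHILD_OF",
     "tail": "g1", "tail_type": "Concept"},
    {"head": "c1", "head_type": "Concept", "relation": "TREATED_BY",
     "tail": "d1", "tail_type": "Drug"},
    {"head": "l1", "head_type": "LabInvestigation", "relation": "IMPACTS",
     "tail": "c1", "tail_type": "Concept"},
    {"head": "c1", "head_type": "Concept", "relation": "MONITORED_BY",
     "tail": "l1", "tail_type": "LabInvestigation"},
]

node_index, hetero_edges = to_hetero_edges(TOY_TRIPLES)
for edge_type, (src, dst) in hetero_edges.items():
    print(edge_type, src, dst)
```

Keeping a separate index space per node type means each type can carry its own feature matrix, which is the usual reason for treating the graph as heterogeneous rather than flattening it.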
## License
[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) — free for non-commercial use with attribution to [Eka Care](https://www.eka.care).
Provider:
ekacare



