
A Semantic Knowledge Graph Linking Diseases, Patterns, Symptoms, and Herbs for Traditional Chinese Medicine

Source: DataCite Commons. Updated: 2026-05-07. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.18173423
Resource overview:
Version 2.0 Release Notice:
This is Version 2.0 of the knowledge graph. This release introduces further terminology standardization to better capture nuanced clinical distinctions within Traditional Chinese Medicine, including the disambiguation of overlapping psychiatric, pathogenic-factor, and related clinical concepts. It also improves semantic granularity, terminology consistency, and structural alignment of the node table.

Description:
This dataset provides the core topological structure of a Traditional Chinese Medicine (TCM) efficacy knowledge graph. Unlike simple efficacy lists, this dataset constructs a full-semantic network integrating the hierarchical logic of "Etiology-Disease-Pattern-Symptom-Efficacy-Herb". The data are structured as a Property Graph model, containing standardized entities and their semantic relationships, extracted and normalized from authoritative TCM textbooks. It serves as the foundational graph structure for semantic reasoning and efficacy inference.

Encoding Note:
The edge file is UTF-8 compatible. The node file in this version may require GBK/GB18030-compatible decoding due to several special characters in entity names. Users who encounter decoding errors when reading the node file with Python or other tools are advised to specify encoding="gb18030" or encoding="gbk".

Dataset Content:
The dataset consists of two CSV files and one README file.
Node File (node-v2-eng260502.csv): Contains 6,931 entities, including Herbs, Efficacies, Symptoms, Patterns, Diseases, and Etiologies.
Edge File (edge-v2-eng260502.csv): Contains 16,708 semantic relationships, defining the logical connections (e.g., has_effect, treated_by, manifests_as, includes, transforms_to) between entities.
README File (READMEv2.0.txt): Provides dataset documentation, version history, file descriptions, data dictionary, usage notes, limitations, license, and citation information.
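
The Encoding Note above suggests a decoding fallback for the node file. A minimal sketch of that fallback, using only the Python standard library, might look like the following. The function name `read_csv_rows` and the order in which encodings are tried are assumptions made for illustration and are not part of the dataset. No column names are assumed, since the header row is taken from the file itself.

```python
import csv
import io
from pathlib import Path

# Encodings to try, in order. This list is an assumption based on the
# Encoding Note: UTF-8 first (with optional BOM), then GB18030.
CANDIDATE_ENCODINGS = ("utf-8-sig", "gb18030")


def read_csv_rows(path, encodings=CANDIDATE_ENCODINGS):
    """Return (rows, encoding_used), trying each encoding until one decodes."""
    raw = Path(path).read_bytes()
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # this encoding does not fit; try the next one
        # DictReader takes field names from the first row of the file.
        # newline="" lets the csv module handle line endings itself.
        reader = csv.DictReader(io.StringIO(text, newline=""))
        return list(reader), encoding
    raise ValueError(f"could not decode {path} with any of {encodings}")
```

A call such as `rows, used = read_csv_rows("node-v2-eng260502.csv")` would then report which encoding succeeded. UTF-8 is tried first because GB18030 can decode some UTF-8 byte sequences without raising an error, which would silently produce garbled entity names.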

Key Features:
Multi-layer Semantics: Covers the complete clinical reasoning chain from pathology to treatment.
Standardized Terminology: Entities are normalized to ensure semantic consistency and improved clinical distinction.
Graph-Ready: Formatted for direct import into graph databases and visualization tools (e.g., Neo4j, Gephi) or network analysis libraries (e.g., NetworkX).

Version 2.0 Updates:
Systematic Disambiguation: Decoupled historically ambiguous mappings, such as distinguishing "癫" as Depressive psychosis, "狂" as Manic psychosis, and "痫" as Epilepsy.
Pathogenic Factors: Standardized translations for exogenous factors, such as using "pathogen" instead of "-evil".
Structural Normalization: Standardized San Jiao terminology and removed hyphenated compound organs for better NLP parsing.
Clinical Refinement: Refined pathological states and specialized vocabulary in gynecology and urology.
Integrity Fix: Resolved a structural alignment issue in the node table to ensure data consistency.

Contact:
For questions, please contact LI Yuanbai: liyuanbai126@126.com

This work was supported by:
Key Laboratory of TCM Language and Cognitive Artificial Intelligence, IICTM, CACMS.
ZZSYS-1901-CZ: The Study on the Simplification of Medicinal Ingredients in Formulas Based on Efficacy Prediction.
Beijing Natural Science Foundation (J230036): Integrating knowledge graph with the concept of network target to explore and develop innovative Chinese medicine based on aging mechanism in osteoarthritis.
National Key Research and Development Program of China (2023YFC3504005): Development and construction of a real-world information platform for Traditional Chinese Medicine Quality.
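
As an illustration of the "Graph-Ready" feature described above, the edge rows can be indexed into a simple adjacency structure before moving to a full graph library. This is a rough sketch only: the column names `source`, `relation`, and `target` are hypothetical placeholders, and the real headers should be taken from the data dictionary in READMEv2.0.txt.

```python
from collections import defaultdict


def build_adjacency(edge_rows, source_key="source",
                    relation_key="relation", target_key="target"):
    """Index edges as {source: [(relation, target), ...]}.

    The three key names are hypothetical defaults, not the dataset's
    documented column headers; pass the real ones when calling.
    """
    adjacency = defaultdict(list)
    for row in edge_rows:
        adjacency[row[source_key]].append((row[relation_key], row[target_key]))
    return dict(adjacency)


def neighbors_by_relation(adjacency, node, relation):
    """Return targets reachable from `node` over edges labeled `relation`."""
    return [target for rel, target in adjacency.get(node, []) if rel == relation]
```

Here `edge_rows` can be any iterable of dictionaries, such as the rows produced by `csv.DictReader` over the edge file. A query like `neighbors_by_relation(adjacency, some_herb, "has_effect")` would then list the entities linked to a given node by that relation type, assuming the relation labels in the file match those named in the description.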
Provider:
Zenodo
Created:
2026-01-08