TrialPanorama/TrialPanorama-database

Name: TrialPanorama/TrialPanorama-database
Creator: TrialPanorama
Published: 2025-08-05 03:44:15
License: No description provided

Hugging Face · Updated 2025-08-05 · Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/TrialPanorama/TrialPanorama-database

Description:

---
license: apache-2.0
language:
- en
tags:
- medical
size_categories:
- 10M<n<100M
config_names:
- studies
- conditions
- drugs
- disposition
- outcomes
- results
- biomarkers
- endpoints
- relations
- drug_moa
configs:
- config_name: studies
  data_files:
  - split: all
    path: studies*.parquet
- config_name: conditions
  data_files:
  - split: all
    path: conditions*.parquet
- config_name: drugs
  data_files:
  - split: all
    path: drugs*.parquet
- config_name: disposition
  data_files:
  - split: all
    path: disposition*.parquet
- config_name: outcomes
  data_files:
  - split: all
    path: outcomes*.parquet
- config_name: adverse_events
  data_files:
  - split: all
    path: adverse_events*.parquet
- config_name: results
  data_files:
  - split: all
    path: results*.parquet
- config_name: biomarkers
  data_files:
  - split: all
    path: biomarkers*.parquet
- config_name: endpoints
  data_files:
  - split: all
    path: endpoints*.parquet
- config_name: relations
  data_files:
  - split: all
    path: relations.parquet
- config_name: drug_moa
  data_files:
  - split: all
    path: drug_moa.parquet
---

## Quick start

The easiest way to download the dataset to your local machine is with `huggingface-cli`:

```
huggingface-cli download zifeng-ai/TrialPanorama-database --local-dir LOCAL_DIR --repo-type dataset
```

Replace `LOCAL_DIR` with the directory where you want to save the dataset.

# Update history

- **Aug. 4, 2025**: updated tables with the full set of studies

Dataset website: https://ryanwangzf.github.io/projects/trialpanorama

---

# TrialPanorama

**TrialPanorama** is a large-scale, structured database and benchmark designed to support AI-driven tasks in clinical trial workflows, including systematic review and trial design. This repo hosts the database of clinical trials.
To conduct benchmarking experiments on trial design and systematic review tasks, use this dataset instead: https://huggingface.co/datasets/zifeng-ai/TrialPanorama-benchmark

## Dataset Overview

* Aggregates over 1M clinical trial records from ClinicalTrials.gov and PubMed
* Captures standardized elements such as trial setups, interventions, conditions, biomarkers, outcomes, and links to biomedical ontologies (e.g. DrugBank, MedDRA)
* Structured into multiple conceptual clusters and tables (e.g. trial-level attributes, protocol design, results, links)

## Benchmark Suite

The dataset supports a suite of **8 benchmark tasks** across two domains:

* **Systematic Review**:
  * Study search
  * Study screening
  * Evidence summarization
* **Trial Design**:
  * Arm design
  * Eligibility criteria
  * Endpoint selection
  * Sample size estimation
  * Trial completion assessment

## Data Schema (Major Tables)

* **`studies`**: core metadata (e.g. study\_id, intervention type, sponsor type, start year, trial phase, recruitment status)
* **`protocols`**, **`interventions`**, **`conditions`**, **`biomarkers`**, **`outcomes`**, **`results`**: each standardized to a unified schema

## Use Cases

* Developing and evaluating LLMs and ML models for clinical trial-related tasks
* Benchmarking AI capabilities in trial search, screening, design, and summarization
* Conducting meta-analyses and exploring evidence synthesis across disease areas
* Enabling interoperability with biomedical ontologies to support richer clinical trial reasoning

## Getting Started

## Citation

If you use TrialPanorama, we appreciate your citation:

```
@article{wang2025trialpanorama,
  title={TrialPanorama: Database and Benchmark for Systematic Review and Design of Clinical Trials},
  author={Wang, Zifeng and Jin, Qiao and Lin, Jiacheng and Gao, Junyi and Pradeepkumar, Jathurshan and Jiang, Pengcheng and Danek, Benjamin and Lu, Zhiyong and Sun, Jimeng},
  journal={arXiv preprint arXiv:2505.16097},
  year={2025}
}
```
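
## Example: locating table files after download

The `configs` block in the card metadata maps each table name to a Parquet file pattern. The following is a minimal sketch of how those patterns could be used once the files have been downloaded with the Quick start command. It assumes the files sit directly in the local directory and keep the names given in the metadata. The names `CONFIG_PATTERNS`, `files_for_config`, and `read_table` are illustrative and not part of the dataset, and `read_table` assumes `pandas` is installed with a Parquet engine such as `pyarrow`.

```python
from pathlib import Path

# Table name -> file pattern, copied from the `configs` block of the dataset card.
CONFIG_PATTERNS = {
    "studies": "studies*.parquet",
    "conditions": "conditions*.parquet",
    "drugs": "drugs*.parquet",
    "disposition": "disposition*.parquet",
    "outcomes": "outcomes*.parquet",
    "adverse_events": "adverse_events*.parquet",
    "results": "results*.parquet",
    "biomarkers": "biomarkers*.parquet",
    "endpoints": "endpoints*.parquet",
    "relations": "relations.parquet",
    "drug_moa": "drug_moa.parquet",
}


def files_for_config(local_dir, config):
    """Return the sorted Parquet files in local_dir that belong to one table."""
    if config not in CONFIG_PATTERNS:
        raise KeyError(f"unknown config: {config!r}")
    return sorted(Path(local_dir).glob(CONFIG_PATTERNS[config]))


def read_table(local_dir, config):
    """Read all files of one table into a single pandas DataFrame.

    pandas is imported lazily so that files_for_config works without it.
    """
    import pandas as pd

    files = files_for_config(local_dir, config)
    if not files:
        raise FileNotFoundError(f"no files for {config!r} in {local_dir}")
    return pd.concat((pd.read_parquet(f) for f in files), ignore_index=True)
```

For example, `read_table("LOCAL_DIR", "studies")` would return the `studies` table as one DataFrame. Since the card lists a size category of 10M to 100M rows, loading a whole table into memory may be expensive. Iterating over `files_for_config(...)` and reading one file at a time is a reasonable alternative.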

Provider:

TrialPanorama
