British Medical Journal Archive (1857-1930)
Source: Snowflake | Updated: 2026-04-07 | Indexed: 2026-04-09
Download link:
https://app.snowflake.com/marketplace/listing/GZSXZGPW3SIF
Description:
Complete 1857 archive of the British Medical Journal (BMJ), one of the world's oldest and most respected general medical journals. **791,472 rows** of clean, structured medical text capturing Victorian-era medicine at its height.
**What this data does for your model:**
- Your model learns authentic mid‑19th century British medicine from one of the world's most respected general medical journals, founded in 1840.
- Your model retrieves original research from 1857, including clinical case studies of Addison's disease (suprarenal capsules), pericarditis, rheumatic fever, and tuberculosis as they were documented at the time.
- Your model trains on the language of Victorian pharmacology (calomel, opium, morphia, blisters), post‑mortem examinations, and the transition from humoral to pathological medicine.
- Your model understands the reformist voice of the BMJ, including its early advocacy for medical education, public health, and professional standards.
**What's inside:**
- Original research from leading British physicians
- Clinical case studies of tuberculosis, rheumatic fever, pericarditis, and Addison's disease
- References to Dr. Thomas Addison's pioneering work on suprarenal capsules (Addison's disease)
- Post-mortem examinations and pathological findings
- Victorian pharmacology: calomel, opium, blisters, and morphia
- Medical debates and professional correspondence
**Perfect for:**
- LLM fine-tuning on 19th-century British medical text
- Clinical NLP research on historical disease presentation
- History of medicine and digital humanities
- Pharmaceutical AI training on historical therapeutics
**Format:** Snowflake-native JSONL with columns: ISSUE, TITLE, AUTHOR, TYPE, TEXT, INGESTION_DATE. Fully cleaned, bias-audited, and ready for AI training.
Provider:
Devin Media Corp.
Created:
2026-04-07
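Summary of Original Information
British Medical Journal Archive (1857-1930) Dataset Overview
Basic Dataset Information
- Dataset name: British Medical Journal Archive (1857-1930)
- Provider: Devin Media Corp.
- Row count: 791,472 rows
- Data format: Snowflake-native JSONL
- Columns: ISSUE, TITLE, AUTHOR, TYPE, TEXT, INGESTION_DATE
- Time range: 1857 to 1930 (beginning in the Victorian era)
- Content: complete archive of the British Medical Journal (BMJ), consisting of clean, structured medical text.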
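Data Content Details
What Is Covered
- Original research from leading British physicians
- Clinical case studies (tuberculosis, rheumatic fever, pericarditis, Addison's disease, and others)
- References to Dr. Thomas Addison's pioneering work on the suprarenal capsules (Addison's disease)
- Post-mortem examinations and pathological findings
- Victorian pharmacology (e.g., calomel, opium, blisters, morphia)
- Medical debates and professional correspondence
Data Characteristics
- The data is fully cleaned, bias-audited, and ready for AI training.
- The data is delivered as preprocessed JSONL that can be used directly for machine learning.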
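Use Cases
Machine Learning
- Train, fine-tune, and deploy machine learning models on more than 791,000 rows of 19th-century British medical text.
- Suited to domain-specific LLM fine-tuning, tracking the evolution of medical terminology, and clinical NLP model development.
Real-World Data (RWD)
- Use historically documented clinical cases, post-mortem findings, and treatment outcomes as real-world data for research and analysis.
- Capture Victorian-era disease presentation, pathology, and therapeutics.
Life Sciences Commercialization
- Support life sciences research with curated historical medical literature documenting 19th-century British treatments.
- Track the evolution of treatment paradigms and understand the foundations of modern British medicine.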
Data Dictionary
Table name: BMJ_CORPUS
| Column | Data type | Description |
|---|---|---|
| ISSUE | Varchar | Journal issue |
| TITLE | Varchar | Article title |
| AUTHOR | Varchar | Author |
| TYPE | Varchar | Record type (e.g., article, metadata) |
| TEXT | Varchar | Article body text |
| INGESTION_DATE | Timestamp_NTZ | Date the data was ingested |
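As a rough illustration of how rows shaped like BMJ_CORPUS might be handled outside Snowflake, the sketch below parses JSONL lines, checks that each row carries the six columns from the data dictionary, and keeps only rows whose TYPE is `article`. The export layout (one JSON object per line, keyed by column name), the sample rows, and the helper name `load_articles` are assumptions made for illustration; the listing does not specify them.

```python
import json

# Column names taken from the BMJ_CORPUS data dictionary.
EXPECTED_COLUMNS = {"ISSUE", "TITLE", "AUTHOR", "TYPE", "TEXT", "INGESTION_DATE"}


def load_articles(lines):
    """Parse JSONL lines and return the rows whose TYPE is 'article'.

    Assumes one JSON object per line, keyed by the column names above.
    A row missing any expected column raises ValueError.
    """
    articles = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        missing = EXPECTED_COLUMNS - set(row)
        if missing:
            raise ValueError(f"line {line_number}: missing columns {sorted(missing)}")
        if row["TYPE"] == "article":
            articles.append(row)
    return articles


# Hypothetical sample rows, invented purely for illustration.
SAMPLE_JSONL = [
    json.dumps({
        "ISSUE": "sample-issue", "TITLE": "Sample case report",
        "AUTHOR": "A. Author", "TYPE": "article",
        "TEXT": "Sample text.", "INGESTION_DATE": "2026-04-07 00:00:00",
    }),
    json.dumps({
        "ISSUE": "sample-issue", "TITLE": "Sample metadata record",
        "AUTHOR": "", "TYPE": "metadata",
        "TEXT": "Sample description.", "INGESTION_DATE": "2026-04-07 00:00:00",
    }),
]

articles = load_articles(SAMPLE_JSONL)
print([row["TITLE"] for row in articles])
```

Failing loudly on a missing column is a deliberate choice in this sketch: silently skipping malformed rows would make it harder to notice a schema mismatch before training.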
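Usage Examples
View the metadata documents
```sql
-- String literals must be quoted in Snowflake SQL.
SELECT TITLE, TEXT
FROM BMJ_CORPUS
WHERE TYPE = 'metadata'
LIMIT 5;
```
Search by disease (Addison's disease / suprarenal capsules)
```sql
-- Parentheses keep the TYPE filter applied to both text matches.
SELECT TITLE, AUTHOR, ISSUE
FROM BMJ_CORPUS
WHERE TYPE = 'article'
  AND (TEXT ILIKE '%addison%' OR TEXT ILIKE '%suprarenal%')
LIMIT 10;
```
Search for clinical cases
```sql
SELECT ISSUE, TITLE, AUTHOR
FROM BMJ_CORPUS
WHERE TYPE = 'article'
  AND TEXT ILIKE '%case%'
LIMIT 10;
```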
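The disease search combines a TYPE filter with two text matches using AND and OR. Because SQL evaluates AND before OR, the OR branch needs parentheses if the TYPE filter is meant to apply to both matches. The sketch below shows the difference on a tiny in-memory table. It uses Python's standard-library `sqlite3` module rather than Snowflake, substitutes SQLite's `LIKE` (case-insensitive for ASCII by default) for `ILIKE`, and the three rows are hypothetical, invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BMJ_CORPUS (ISSUE TEXT, TITLE TEXT, AUTHOR TEXT, TYPE TEXT, TEXT TEXT)"
)

# Hypothetical rows, invented for illustration only.
conn.executemany(
    "INSERT INTO BMJ_CORPUS VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "Sample article A", "A. Author", "article",
         "Dr. Addison on the suprarenal capsules."),
        ("s1", "Sample index", "", "metadata",
         "Index of suprarenal capsule references."),
        ("s1", "Sample article B", "B. Author", "article",
         "A case of pericarditis."),
    ],
)

# Without parentheses this parses as:
#   (TYPE = 'article' AND TEXT LIKE '%addison%') OR TEXT LIKE '%suprarenal%'
UNGROUPED = (
    "SELECT TITLE FROM BMJ_CORPUS "
    "WHERE TYPE = 'article' AND TEXT LIKE '%addison%' "
    "OR TEXT LIKE '%suprarenal%' ORDER BY TITLE"
)

# With parentheses the TYPE filter applies to both text matches.
GROUPED = (
    "SELECT TITLE FROM BMJ_CORPUS "
    "WHERE TYPE = 'article' "
    "AND (TEXT LIKE '%addison%' OR TEXT LIKE '%suprarenal%') ORDER BY TITLE"
)

ungrouped_titles = [row[0] for row in conn.execute(UNGROUPED)]
grouped_titles = [row[0] for row in conn.execute(GROUPED)]

print("ungrouped:", ungrouped_titles)  # includes the metadata row
print("grouped:  ", grouped_titles)    # article rows only
```

In the ungrouped form the metadata row is returned because it matches the `suprarenal` pattern on its own, regardless of its TYPE.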
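Data Updates and Coverage
- Update frequency: annual
- Geographic coverage: global
- Cloud region availability: multiple AWS regions (e.g., Asia Pacific regions such as Jakarta, Kuala Lumpur, Mumbai, and Osaka)
Provider Information
- Provider name: Devin Media Corp.
- Provider expertise: premium historical data for AI training, spanning medicine, finance, fashion, law, and culture.
- Dataset characteristics: all datasets are pre-1930, verified public-domain (copyright-free) content that has been professionally OCR-processed and rigorously cleaned, with provenance tracking and bias auditing, formatted as JSONL and delivered through a secure API.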



