Johns Hopkins Medical Journal Archive (1890-1926)
Source: Snowflake. Updated 2026-04-02; indexed 2026-04-03.
Download link:
https://app.snowflake.com/marketplace/listing/GZSXZGPW3RA4
Resource description:
This comprehensive archive contains 132,150 rows of fully processed medical text from the Johns Hopkins Medical Journal (1890-1926), one of the most influential medical publications in American history. Every article has been professionally extracted, cleaned, and formatted for AI/ML training applications.
**What this data does for your model:**
- Your model learns authentic turn‑of‑the‑century American medicine from one of the most influential medical institutions in history, founded on German academic principles.
- Your model retrieves original research by William Osler ("The Father of Modern Medicine"), Howard Kelly, William Halsted, and other pioneers who revolutionized surgery, gynecology, and pathology.
- Your model trains on the language of clinical case studies, surgical innovations, and the shift from empirical practice to evidence‑based medicine.
- Your model understands the founding era of American medical research, documenting the rise of Johns Hopkins as the model for modern medical education.
**What's inside:**
- Original articles by William Osler, Howard Kelly, William Halsted, and other Johns Hopkins pioneers
- Clinical case studies from one of America's first research hospitals
- Surgical innovations including early descriptions of appendicitis and the development of aseptic technique
- Foundational research in cardiology, neurology, and infectious disease
- The professionalization of American medicine at the turn of the 20th century
**Perfect for:**
- LLM fine‑tuning on late 19th‑century American medical text
- Clinical NLP and medical terminology evolution tracking
- History of medicine and digital humanities
- Medical education and curriculum development
**Format:** Snowflake-native JSONL with columns: ISSUE, TITLE, AUTHOR, TYPE, TEXT. Fully cleaned, bias‑audited, and ready for AI training.
Provider:
Devin Media Corp.
Created:
2026-03-31
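Summary of original information
Johns Hopkins Medical Journal Archive (1890-1926): dataset overview
Basic dataset information
- Dataset name: Johns Hopkins Medical Journal Archive (1890-1926)
- Provider: Devin Media Corp.
- Data volume: 132,150 rows
- Data source: Johns Hopkins Medical Journal (1890-1926)
- Description: Fully processed medical text from the Johns Hopkins Medical Journal (1890-1926), one of the most influential medical publications in American history. Every article has been professionally extracted, cleaned, and formatted for AI/ML training applications.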
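Business needs and use cases
Machine learning
- Train, fine-tune, and deploy machine learning models on high-quality historical medical text.
- Contains more than 132,000 rows of curated medical literature from the founding era of modern medicine.
- Use cases:
  - Domain-specific LLM fine-tuning
  - Medical terminology extraction
  - Clinical NLP model development
  - Historical biomedical knowledge graphs
  - Named entity recognition in medical text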
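Real-world data
- Use historically documented clinical cases, treatment outcomes, and disease descriptions as real-world data for research and analysis.
- The archive includes:
  - Original case studies by Johns Hopkins physicians
  - Documented surgical procedures and outcomes
  - Clinical observations of diseases including diabetes, appendicitis, and cardiovascular disease
  - Historical patient presentations and treatment regimens
  - Pre-1900 epidemiological data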
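Life sciences commercialization
- Support life sciences research and development with curated historical medical literature that documents the discovery and early application of treatments.
- The dataset supports:
  - Tracking the evolution of treatment paradigms from 1890 to 1901
  - Understanding the foundational research that led to modern therapies
  - Analyzing historical clinical trial methodology
  - Identifying long-term patterns in disease management
  - Training models on primary-source medical literature from the period when medicine was transitioning to evidence-based practice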
Data dictionary
Table name: JOHNS_HOPKINS_CORPUS
Columns:
- ISSUE: Varchar
- TITLE: Varchar
- AUTHOR: Varchar
- TYPE: Varchar
- TEXT: Varchar
- INGESTION_DATE: Timestamp_NTZ
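Before relying on the columns above, it may help to profile the table. The query below is a minimal sketch in Snowflake SQL: it assumes the table is reachable as `JOHNS_HOPKINS.JOHNS_HOPKINS_CORPUS` (the qualified name used in the listing's own queries), and the output aliases are illustrative names, not part of the dataset.

```sql
-- Sketch only: summarize rows per TYPE value.
-- ROW_COUNT, ISSUE_COUNT, and AVG_TEXT_CHARS are illustrative aliases.
SELECT
    TYPE,
    COUNT(*)              AS ROW_COUNT,       -- rows, not necessarily whole articles
    COUNT(DISTINCT ISSUE) AS ISSUE_COUNT,     -- distinct journal issues per TYPE
    AVG(LENGTH(TEXT))     AS AVG_TEXT_CHARS   -- mean characters of text per row
FROM JOHNS_HOPKINS.JOHNS_HOPKINS_CORPUS
GROUP BY TYPE
ORDER BY ROW_COUNT DESC;
```

The listing counts rows rather than articles, so comparing `ROW_COUNT` with `ISSUE_COUNT` gives a rough sense of how finely each issue has been split.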
Data preview example:
- ISSUE: 1925-Vol._37_No._4
- TITLE: the oil-immersion field and the suspension was in most instances given
- AUTHOR: Unknown
- TYPE: article
- TEXT: the oil-immersion field and the suspension was in most instances given intravenously, although a few animals were inoculated intraperitoneally. Sabin (24) found that, in a case of Malta fever, there was a large increase in the number of the circulating monocytes, and an additional increase in their phagocytic activity as indicated by their staining with neutral red. With this concept in mind it occurred to us that perhaps B. abortus, an organism closely related to the bacillus melitensis, might bring about a stimulation of the monocytes in our experimental animals and thus supply us with a mechanism for analyzing the results of infection with tuberculosis in the case of previously stimulated animals. Throughout these experiments we have taken blood counts at as close intervals as was possible, many of the experimental animals having been counted daily for periods of six to seven weeks. This factor of making daily supra-vital differential counts, as well as counts of the total white blood-cells, has rendered the utilization of a larger series of animals technically impossible so that, while we recognize that our series must appear small to those workers studying allergic, serological and immunological reactions, nevertheless, it was as large as it was possible for a small group of workers to carry through. And furthermore, our results have been so striking and so easy to classify into specific groups, with regard to the monocytic reactions, that it has seemed fully justifiable to consider the series quite large enough to make reliable conclusions possible. We have counted the blood of all experimental animals several times before injections and, whenever possible, this period of preliminary counting has been extended to several weeks duration. 
It is a customary opinion that the blood counts in rabbits vary much more widely than in the other animals and we were inclined in our earlier experiments to concur in this opinion, but we found, when care was taken to have such a dilatation of
- INGESTION_DATE: null
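The ISSUE value in the preview row packs year, volume, and issue number into one string. The query below is a hedged sketch of how those parts might be split out in Snowflake SQL. It assumes every ISSUE follows the previewed pattern (`1925-Vol._37_No._4`), which the listing does not guarantee; `PUB_YEAR`, `VOLUME`, and `ISSUE_NUMBER` are illustrative aliases.

```sql
-- Sketch only: assumes ISSUE looks like 'YYYY-Vol._NN_No._N'.
-- Rows that do not match the pattern yield NULL instead of raising an error.
SELECT
    ISSUE,
    TRY_TO_NUMBER(SPLIT_PART(ISSUE, '-', 1))                                AS PUB_YEAR,
    TRY_TO_NUMBER(REGEXP_SUBSTR(ISSUE, 'Vol[.]_([0-9]+)', 1, 1, 'e', 1))    AS VOLUME,
    TRY_TO_NUMBER(REGEXP_SUBSTR(ISSUE, 'No[.]_([0-9]+)', 1, 1, 'e', 1))     AS ISSUE_NUMBER
FROM JOHNS_HOPKINS.JOHNS_HOPKINS_CORPUS
LIMIT 10;
```

In the preview row, TITLE repeats the opening words of TEXT and AUTHOR is `Unknown`, so filters on those two columns may miss relevant rows; filtering on ISSUE or TEXT is likely to be more dependable.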
Usage examples
View metadata documents
```sql
SELECT TITLE, TEXT
FROM JOHNS_HOPKINS.JOHNS_HOPKINS_CORPUS
WHERE TYPE = 'metadata'
LIMIT 5;
```
Search articles by author
```sql
SELECT TITLE, AUTHOR, ISSUE
FROM JOHNS_HOPKINS.JOHNS_HOPKINS_CORPUS
WHERE TYPE = 'article'
  AND AUTHOR LIKE '%Osler%'
LIMIT 10;
```
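Find articles on a medical topic
```sql
SELECT TITLE, AUTHOR, ISSUE
FROM JOHNS_HOPKINS.JOHNS_HOPKINS_CORPUS
WHERE TYPE = 'article'
  -- Parentheses keep the TYPE filter applied to both search terms.
  AND (TEXT ILIKE '%appendicitis%' OR TEXT ILIKE '%surgery%')
LIMIT 10;
```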
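Tracking terminology over time
For the terminology-tracking use case, a topic search can be turned into a per-year count. This is a sketch under one assumption: that ISSUE begins with a four-digit year, as it does in the preview row. The search term and the aliases `PUB_YEAR` and `MATCHING_ROWS` are illustrative.

```sql
-- Sketch only: count article rows mentioning a term, grouped by publication year.
-- Assumes the first four characters of ISSUE are the year (e.g. '1925-Vol._37_No._4').
SELECT
    LEFT(ISSUE, 4) AS PUB_YEAR,
    COUNT(*)       AS MATCHING_ROWS
FROM JOHNS_HOPKINS.JOHNS_HOPKINS_CORPUS
WHERE TYPE = 'article'
  AND TEXT ILIKE '%appendicitis%'
GROUP BY LEFT(ISSUE, 4)
ORDER BY PUB_YEAR;
```

The result counts rows, not whole articles, so a long article split across several rows can be counted more than once.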
Dataset technical and administrative information
- Refresh frequency: Annual
- Geographic coverage: Global
- Cloud region availability (AWS):
  - Asia Pacific (Jakarta)
  - Asia Pacific (Mumbai)
  - Asia Pacific (Osaka)
  - Asia Pacific (Seoul)
  - 34 more
- Legal terms: Standard
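Provider information
- Provider name: Devin Media Corp.
- Provider description: Devin Media Corp specializes in premium historical data for AI training. We provide comprehensive, provenance-traceable, bias-audited pre-1930 publications and archives, professionally cleaned and structured for machine learning applications. Our datasets span medicine, finance, fashion, law, and culture, and include some of society's most prestigious and iconic publications, such as the Journal of the American Medical Association (JAMA), Time magazine, Billboard magazine, Vogue, House & Garden, and the New York Post. Every dataset is:
  - Pre-1930 and verified as public domain / copyright-free
  - Professionally OCR-processed and deep-cleaned
  - Provenance-traceable and bias-audited
  - Formatted as JSONL for AI use
  - Delivered via secure API (no file downloads)
- Contact:
  - Sales: hello@devinmediacorp.com
  - Support: hello@devinmediacorp.com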



