Specialty Medical & Science Archive for Medical AI (pre‑1930)
Source: Snowflake | Updated: 2026-05-08 | Indexed: 2026-05-09
Download link:
https://app.snowflake.com/marketplace/listing/GZSXZGPW4HST
Description:
This dataset contains 143,935 cleaned, bias‑audited rows from seven historical medical and science journals published before 1930:

- Medical Century
- American Practitioner
- Dental Digest
- Journal of Experimental Psychology
- Parasitology
- Therapeutic Gazette
- Cancer Science

All content is 100% public domain, professionally OCR‑cleaned, and structured as JSONL for immediate LLM training.

**Use cases:**
- Train domain‑specific models in dentistry, psychology, oncology, and parasitology
- Extract clinical entities across multiple specialties
- Support longitudinal research in cancer science and infectious disease
- Develop bias‑aware models with built‑in historical context

💡 **Example Cortex AI Agent Prompts:**
To gauge the archive's cross-disciplinary coverage, consumers can issue queries such as the following directly to the agent:

1. "Analyze the intersection of chemical engineering breakthroughs and clinical surgical advancements, specifically focusing on the adoption of synthetic materials in hospital settings prior to 1925."
2. "Trace the historical documentation of public health policy shifts as influenced by advancements in laboratory bacteriology and industrial sanitation technologies during the early 20th century."
3. "Summarize the evolving collaborative relationship between early pharmaceutical research, clinical pathology findings, and the standardization of surgical methodology documented in major scientific periodicals."
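Provider:
Devin Media Corp.
Created:
2026-05-08
Summary of original information
Dataset overview: Specialty Medical & Science Archive for Medical AI (pre‑1930)
A cleaned, bias-audited dataset of historical medical and science journals, designed for training medical AI models.
- Provider: Devin Media Corp.
- Size: 143,935 records.
- Sources: seven historical medical and science journals published before 1930; all content is in the public domain.
- Format: structured data delivered as JSONL, suitable for direct use in LLM training.
- Update frequency: annual.
- Delivery: Secure share.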
Source journals
- Medical Century
- American Practitioner
- Dental Digest
- Journal of Experimental Psychology
- Parasitology
- Therapeutic Gazette
- Cancer Science
Data fields (Data Dictionary)
The table is named SPECIALTY_MEDICAL_SCIENCE_ARCHIVE and contains the following columns:
| Column | Data type | Description |
|---|---|---|
| ISSUE | VARCHAR | Journal issue identifier |
| TITLE | VARCHAR | Article title |
| AUTHOR | VARCHAR | Article author |
| TYPE | VARCHAR | Article type (e.g., article) |
| TEXT | VARCHAR | Full article text |
| INGESTION_DATE | TIMESTAMP_NTZ | Ingestion timestamp |
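
As a rough illustration of how one row might be handled downstream, the sketch below parses a single JSONL line into a typed record. It assumes the JSONL keys mirror the column names in the data dictionary and that INGESTION_DATE is serialized as an ISO 8601 string without a time zone; the listing confirms neither, and `ArchiveRecord` and `parse_record` are hypothetical names introduced only for this example.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ArchiveRecord:
    """One row of SPECIALTY_MEDICAL_SCIENCE_ARCHIVE (hypothetical mapping)."""
    issue: str
    title: str
    author: str
    type: str
    text: str
    ingestion_date: Optional[datetime]


def parse_record(line: str) -> ArchiveRecord:
    """Parse one JSONL line, assuming keys match the documented column names."""
    row = json.loads(line)
    raw_ts = row.get("INGESTION_DATE")
    return ArchiveRecord(
        # Missing or null VARCHAR fields fall back to an empty string.
        issue=row.get("ISSUE") or "",
        title=row.get("TITLE") or "",
        author=row.get("AUTHOR") or "",
        type=row.get("TYPE") or "",
        text=row.get("TEXT") or "",
        # Assumption: the timestamp is an ISO 8601 string with no time zone.
        ingestion_date=datetime.fromisoformat(raw_ts) if raw_ts else None,
    )
```

Keeping the timestamp optional means a row that lacks INGESTION_DATE still parses; if the actual export uses a different timestamp encoding, only the last field needs to change.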
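Primary uses (Business Needs)
- Model development: train specialized LLMs for fields such as dentistry, psychology, oncology, and parasitology on cleaned, bias-audited pre-1930 journals.
- Entity recognition: extract clinical entities across multiple medical specialties, such as dental procedures, psychological conditions, parasitic diseases, and cancer terminology.
- Text summarization: full-text articles are paired with their original titles, providing natural source-summary pairs for training abstractive and extractive summarization models.
- Clinical pathway analysis: analyze how diagnostic criteria, treatments, and specialist knowledge evolved from the 19th century to the early 20th century.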
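
The title-as-summary pairing described in the summarization use case can be sketched as follows. This is a minimal example, assuming each row is a dictionary keyed by the TITLE and TEXT column names; `summarization_pairs`, the `min_text_chars` threshold, and the `source`/`summary` output keys are hypothetical choices, not part of the listing.

```python
from typing import Dict, Iterable, Iterator


def summarization_pairs(
    rows: Iterable[Dict[str, str]],
    min_text_chars: int = 200,
) -> Iterator[Dict[str, str]]:
    """Yield source/summary pairs built from TEXT and TITLE."""
    for row in rows:
        title = (row.get("TITLE") or "").strip()
        text = (row.get("TEXT") or "").strip()
        # Skip rows with no title or with a body too short to summarize.
        if not title or len(text) < min_text_chars:
            continue
        yield {"source": text, "summary": title}
```

The length threshold is an arbitrary filter for very short entries and would need tuning against the actual data.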
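Other information
- Category: AI & ML
- Contact: sales and support both use hello@devinmediacorp.com
- Pricing: available on request (listing status: "Unlock New Insights")
- Related datasets: the provider also offers other historical archive datasets, such as the Canadian Nurse Archive and the Medical and Surgical Reporter Archive.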



