Historical Chemistry & Science Archive AI Ready (pre‑1930)
Source: Snowflake. Updated: 2026-05-20. Indexed: 2026-05-21.
Download link:
https://app.snowflake.com/marketplace/listing/GZSXZGPW4ITL
Resource description:
This is not a collection of scattered articles. This is the written record of modern science taking form, from the discovery of the electron to the professionalization of chemistry, from the founding of the History of Science Society to the rise of science journalism.

17 journals. 58 years of scientific progress (1872‑1930). Cleaned. Structured. Bias‑audited.

**Included in this collection:**

**The Pillars of Chemistry**
- American Chemical Journal (1879‑1930)
- Journal of Physical Chemistry (1896‑1930)
- Chemical Engineering (1902‑1930)
- Biological Chemistry (1905‑1930)

**The History of Science**
- Isis (1913‑1930) – the official journal of the History of Science Society
- Journal of the Franklin Institute (1826‑1930), mechanical and physical science

**General Science & Science Journalism**
- Popular Science Monthly (1872‑1930)
- Scientific Monthly (1915‑1930)
- Science News (1921‑1930)
- Knowledge (1881‑1930)
- Natural Science (1892‑1930)

**Science Education**
- Science Education (1916‑1930)

**Regional & Niche Scientific Periodicals**
- Elisha Mitchell Scientific Society Journal (1884‑1930)
- And 6 additional journals covering the breadth of late 19th‑ and early 20th‑century science

**What you can build with this archive:**

- Train chemistry‑focused LLMs on the papers that defined modern chemistry, from the birth of physical chemistry to the rise of chemical engineering
- Extract chemical entities, species names, analytical techniques, and scientific concepts at scale
- Study the evolution of scientific methodology, the professionalization of disciplines, and the emergence of science as a public good
- Build educational AI models on a century of science pedagogy from Science Education and Popular Science
- Support research in bibliometrics, citation analysis, and the history of science
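As a rough illustration of the entity-extraction use case above, the following is a minimal rule-based sketch that pulls candidate chemical formulas out of plain text. It is a baseline under stated assumptions, not a method documented by the listing: the function name `find_formulas` and the partial `ELEMENTS` table are hypothetical, and a production pipeline would more likely use a trained NER model.

```python
import re
from collections import Counter

# Partial element table, for illustration only (assumption: extend as needed).
ELEMENTS = {
    "H", "C", "N", "O", "S", "P", "K", "I", "Na", "Cl", "Ca", "Fe",
    "Cu", "Zn", "Ag", "Mg", "Al", "Si", "Br", "Hg", "Pb",
}

# A candidate is two or more "Symbol + optional count" tokens, e.g. H2SO4.
CANDIDATE = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def find_formulas(text: str) -> list[str]:
    """Return candidate formulas whose symbols all appear in ELEMENTS."""
    found = []
    for match in CANDIDATE.finditer(text):
        symbols = [symbol for symbol, _count in TOKEN.findall(match.group())]
        if all(symbol in ELEMENTS for symbol in symbols):
            found.append(match.group())
    return found


if __name__ == "__main__":
    sample = "Dilute H2SO4 was added to NaCl, then CuSO4 and more CuSO4."
    print(Counter(find_formulas(sample)).most_common())
```

The element check filters out most capitalized acronyms, but short uppercase words built from valid symbols (for example "IN" or "NO") would still be reported, so the output is best treated as candidates for review.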

Every article has been professionally OCR‑cleaned, reconstructed, and structured as JSONL. Every file includes full provenance tracking and a bias audit with historical context notices.

100% pre‑1930 public domain. Snowflake‑optimized. Ready for LLM pretraining, fine‑tuning, and scientific NLP.

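Since the articles are described as JSONL (one JSON object per line), a consumer would typically stream and validate records line by line. The sketch below shows one way to do that. The field names (`title`, `text`, `journal`, `year`, `provenance`) are assumptions made for illustration, because the listing does not publish the actual schema, and the sample records are placeholders rather than archive content.

```python
import json
from typing import Iterable, Iterator

# Hypothetical field names: the listing does not publish the actual schema.
REQUIRED_FIELDS = ("title", "text", "journal", "year")

# Placeholder records for illustration only, not real archive content.
SAMPLE_LINES = [
    json.dumps({"title": "Placeholder title A", "text": "Placeholder body A.",
                "journal": "Isis", "year": 1913,
                "provenance": {"source": "placeholder"}}),
    "",  # blank lines are tolerated and skipped
    json.dumps({"title": "Placeholder title B", "text": "Placeholder body B.",
                "journal": "Science News", "year": 1925,
                "provenance": {"source": "placeholder"}}),
]


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSONL lines lazily, raising if a record lacks a required field."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing fields {missing}")
        yield record


def filter_by_year(records: Iterable[dict], start: int, end: int) -> list[dict]:
    """Keep records whose (assumed integer) year lies in [start, end]."""
    return [r for r in records if start <= r["year"] <= end]


if __name__ == "__main__":
    records = list(iter_records(SAMPLE_LINES))
    for record in filter_by_year(records, 1920, 1930):
        print(record["journal"], record["year"], record["title"])
```

For a file on disk, the same generator can be fed an open file handle, for example `iter_records(open(path, encoding="utf-8"))`, so that large exports are never loaded into memory at once.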
Provider:
Devin Media Corp.
Created:
2026-05-20
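Summary of Original Information
Dataset Overview
This dataset, Historical Chemistry & Science Archive AI Ready (pre‑1930), is provided by Devin Media Corp. and published on Snowflake Marketplace. It is an archive of historical literature focused on chemistry, biology, and general science. The content has been professionally cleaned and structured, making it suitable for AI training.
Key Information
- Dataset name: Historical Chemistry & Science Archive AI Ready (pre‑1930)
- Provider: Devin Media Corp.
- Coverage: scientific literature from 1872 to 1930, a span of 58 years.
- Sources: 17 scientific journals covering chemistry, the history of science, general science, science education, and regional scientific publications.
- Data status: all content is in the public domain (pre‑1930) and has been professionally OCR‑cleaned, reconstructed, and structured as JSONL. Each file also includes full provenance tracking and a bias audit with historical context notices.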
Data Composition
The dataset brings together several categories of journals:
- Core chemistry journals:
  - American Chemical Journal (1879‑1930)
  - Journal of Physical Chemistry (1896‑1930)
  - Chemical Engineering (1902‑1930)
  - Biological Chemistry (1905‑1930)
- History of science journals:
  - Isis (1913‑1930) – the official journal of the History of Science Society
  - Journal of the Franklin Institute (1826‑1930), mechanical and physical science
- General science and science journalism:
  - Popular Science Monthly (1872‑1930)
  - Scientific Monthly (1915‑1930)
  - Science News (1921‑1930)
  - Knowledge (1881‑1930)
  - Natural Science (1892‑1930)
- Science education journals:
  - Science Education (1916‑1930)
- Regional and niche scientific journals:
  - Elisha Mitchell Scientific Society Journal (1884‑1930)
  - 6 additional journals covering the breadth of late 19th‑ and early 20th‑century science
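Applicable Use Cases
The dataset suits a range of AI and natural language processing tasks:
- Model development: train domain-specific large language models (LLMs) for foundational sciences such as chemistry and biology, with support for both pretraining and fine-tuning.
- Entity recognition: extract chemical compounds, reactions, analytical techniques, species names, and scientific concepts at scale. The cleaned OCR text supports high-accuracy named entity recognition (NER).
- Text summarization: each full article and its original title form a natural summarization pair, which can be used to train models that generate concise summaries of lengthy historical papers.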
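The title-as-summary idea in the last item can be sketched as a small preprocessing step that turns records into source/target pairs. This is only an illustration: the `title` and `text` field names, the `build_pairs` helper, and the `min_chars` threshold are all assumptions, since the listing does not document the record schema.

```python
from typing import Iterable


def build_pairs(records: Iterable[dict], min_chars: int = 200) -> list[dict]:
    """Turn records into {"source": article text, "target": title} pairs.

    Records with an empty title, or with a body shorter than min_chars,
    are skipped because they make poor summarization examples.
    Field names "title" and "text" are hypothetical.
    """
    pairs = []
    for record in records:
        title = (record.get("title") or "").strip()
        text = (record.get("text") or "").strip()
        if not title or len(text) < min_chars:
            continue
        pairs.append({"source": text, "target": title})
    return pairs


if __name__ == "__main__":
    # Placeholder records for illustration only, not real archive content.
    demo = [
        {"title": "Placeholder title", "text": "placeholder body " * 20},
        {"title": "", "text": "placeholder body " * 20},   # no title: skipped
        {"title": "Too short", "text": "brief note"},      # short body: skipped
    ]
    for pair in build_pairs(demo):
        print(pair["target"], len(pair["source"]))
```

Titles are much shorter than typical abstracts, so pairs built this way train headline-style summarization. Whether that suits a given model is a design choice left to the user.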
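Data Delivery and Updates
- Delivery method: Secure share
- Update frequency: Annually
- Pricing model: contact the provider for pricing (listed as "By request")
Provider Information
- Provider name: Devin Media Corp.
- Contact email: hello@devinmediacorp.com
- Provider profile: specializes in high-quality historical data for AI training. Its datasets span medicine, finance, fashion, law, culture, and other fields. All are professionally cleaned, provenance-tracked, and bias-audited, and are delivered in JSONL format through a secure API.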



