Text corpus of Kepler's Astronomia nova
Source: Mendeley Data. Updated: 2024-05-10. Indexed: 2024-06-27.
Download link:
https://zenodo.org/records/5838228
Description:
The JSON file contains preprocessed paragraphs of Kepler’s Astronomia Nova for machine learning. The database is derived from Donahue’s translation: Kepler, Johannes, New Astronomy, rev. edition, tr. by William H. Donahue, Green Lion Press, 2015. The text was digitized using OCR and automated text processing, with the aim of producing “pure” text consisting of machine-readable sentences in UTF-8. Special characters, reference marks, and other markings were removed. OCR artefacts and errors may remain. For the authoritative text, see Donahue’s edition. For a digital Latin version, cf. Kepler’s Gesammelte Werke.
Created:
2023-06-28



