large text collection about world-wide Flora taxonomic and morphological descriptions
Source: DataCite Commons. Updated 2026-05-04; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20027402
Resource description:
The resource is an RO-Crate directory containing a text dataset.
The dataset contains 345,061 files, each corresponding to a distinct plant species, and is compressed into a single archive (Flore_PdfFileSpecies.zip). This archive is organized into directories by geographic area. Two complementary archives are also provided, containing OCR-processed text from botanical monographs and data collected from online botanical resources.
In total, three data packages are available:
Flore_PdfFileSpecies.zip – species-specific files generated after OCR segmentation and extraction;
Flore_PdfFileOCR.zip – OCR-processed text extracted from digitized botanical monographs;
Flore_WebFileSpecies.zip – raw data collected from various botanical websites.
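Because the species archive is described as organized into directories by geographic area, a quick way to inspect it is to count files per top-level directory without extracting anything. The following is a minimal sketch using Python's standard zipfile module; the function name count_by_top_level_dir is hypothetical, and it assumes the geographic areas are the first path component inside the archive, which the description does not confirm (there may be a wrapper directory).

```python
import zipfile
from collections import Counter


def count_by_top_level_dir(zip_source):
    """Count files per top-level directory in a zip archive.

    zip_source may be a path or a binary file-like object.
    Assumption: the first path component is the grouping of interest.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            parts = info.filename.split("/")
            # Files stored at the archive root are grouped under "".
            top = parts[0] if len(parts) > 1 else ""
            counts[top] += 1
    return counts
```

Calling count_by_top_level_dir("Flore_PdfFileSpecies.zip") on the downloaded archive would return a Counter keyed by directory name; the total of its values can then be compared against the file count stated in the description.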
ro-crate-metadata.json is a JSON file describing the research object, compliant with the RO-Crate specification. It documents the organization of the files (number of files, subsets, total size, etc.).
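An RO-Crate metadata file is a JSON-LD document whose entities are listed under the "@graph" key, each with an "@id" and an "@type". As a rough sketch of how the crate's organization could be inspected, the code below counts entities by type. The function name summarize_crate is hypothetical, and which entity types this particular crate actually uses is not stated in the description.

```python
import json
from collections import Counter


def summarize_crate(metadata_text):
    """Count RO-Crate entities by their @type value."""
    doc = json.loads(metadata_text)
    type_counts = Counter()
    for entity in doc.get("@graph", []):
        types = entity.get("@type", [])
        # @type may be a single string or a list of strings.
        if isinstance(types, str):
            types = [types]
        type_counts.update(types)
    return type_counts
```

To apply it to the real file, read ro-crate-metadata.json as UTF-8 text and pass the contents to summarize_crate.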
ConfidenceLabel.tsv is a tab-delimited metadata file that tags each file with a confidence label (High, Medium, or Low).
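The confidence file can be tallied with Python's csv module. The sketch below is only illustrative: the description does not give the column layout, so it assumes a header row and a label column named "confidence", both of which are hypothetical. Pass the real column name through label_column once the header has been checked.

```python
import csv
from collections import Counter


def count_labels(lines, label_column="confidence"):
    """Tally confidence labels from tab-delimited lines with a header row.

    label_column is an assumed name; adjust it to the actual header.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return Counter(row[label_column] for row in reader)
```

With the real file, an open handle can be passed directly, for example open("ConfidenceLabel.tsv", newline="", encoding="utf-8"), and the resulting Counter shows how many files fall under each of High, Medium, and Low.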
Provider:
Zenodo
Created:
2026-05-04



