five

large text collection about world-wide Flora taxonomic and morphological descriptions

Source: DataCite Commons. Updated: 2026-05-04. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20027402
Resource description:
The record is an RO-Crate directory offering a text dataset. The core of the dataset is 345,061 files, each corresponding to a distinct plant species, compressed into a single archive (Flore_PdfFileSpecies.zip) that is organized into directories by geographic area. Two complementary archives contain OCR-processed text from botanical monographs and data collected from online botanical resources.

In total, three data packages are available. Flore_PdfFileSpecies.zip: species-specific files generated after OCR segmentation and extraction. Flore_PdfFileOCR.zip: OCR-processed text extracted from digitized botanical monographs. Flore_WebFileSpecies.zip: raw data collected from various botanical websites.

Two metadata files accompany the archives. ro-crate-metadata.json is a JSON file describing the research object, compliant with the RO-Crate specification; it documents the organization of the files (number of files, subsets, total size, etc.). ConfidenceLabel.tsv is a tab-delimited metadata file that tags each file with a confidence label (High, Medium, or Low).
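As a rough illustration of how the layout described above might be used, the following Python sketch reads confidence labels from tab-separated text and counts archive members per top-level directory. It rests on assumptions the description does not confirm: that ConfidenceLabel.tsv has a header row with the file name in the first column and the label in the second, and that the top-level directories of Flore_PdfFileSpecies.zip are the geographic areas. The function names `load_confidence_labels` and `count_files_by_area` are hypothetical helpers, not part of the dataset.

```python
import csv
import io
import zipfile
from collections import Counter


def load_confidence_labels(tsv_text, name_col=0, label_col=1, has_header=True):
    """Map each file name to its confidence label from tab-separated text.

    The column positions and the header row are assumptions; adjust them
    after inspecting the real ConfidenceLabel.tsv.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    if has_header:
        next(reader, None)  # skip the assumed header row
    labels = {}
    for row in reader:
        if len(row) <= max(name_col, label_col):
            continue  # skip blank or short rows
        labels[row[name_col]] = row[label_col].strip()
    return labels


def count_files_by_area(archive):
    """Count archive members per top-level directory.

    `archive` is a path or a binary file object. Treating the first path
    component as the geographic area is an assumption about the layout.
    """
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries are not species files
            parts = info.filename.split("/")
            area = parts[0] if len(parts) > 1 else "(root)"
            counts[area] += 1
    return counts
```

A typical use would be to keep only the files labelled High, for example `[name for name, label in labels.items() if label == "High"]`, and then read those members from the archive. If the archive turns out to wrap everything in a single outer folder, the area would be the second path component rather than the first.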
Provider:
Zenodo
Created:
2026-05-04