Enriched Time-Series Dataset of Polish Forest Composition (2009-2019)

Name: Enriched Time-Series Dataset of Polish Forest Composition (2009-2019)
Creator: Gdańsk University of Technology
Published: 2026-02-24 11:40:01
License: No description available

DataCite Commons (updated 2026-02-24, indexed 2026-05-04)

Download link:

https://mostwiedzy.pl/en/open-research-data/enriched-time-series-dataset-of-polish-forest-composition-2009-2019,202602241234140876831-0

Description:

VERSION 2.0

Data files generated using DOI https://doi.org/10.34808/6qky-6063

Visualizations (7 PNG files, 300 DPI) regenerated from 2009-2019 PDF-extracted data.

Version 2.0 adds a PDF data extraction module (pdf_data_extractor.py, ~800 lines) that reads annual inventory reports (Aktualizacja_YYYY.pdf) using PyMuPDF (fitz) and produces a CSV compatible with the existing pipeline. Input processing now auto-detects the source file format. Four new output tables have been added alongside the existing eight. The main table no longer contains the Area_pct and Avg_Age columns, which are not available in the PDF source.

New output tables:
1. forest_data_organized_carbon_sequestration.csv - Carbon stock (tC) and CO₂ equivalents using IPCC Tier 1 defaults (BEF × Wood_Density × Carbon_Fraction, with Carbon_Fraction = 0.47; CO₂-eq = C × 44/12); tagged ESTIMATED
2. forest_data_organized_normal_forest_deviation.csv - Deviation from theoretical normal forest: mean|Actual_Pct_i - 25%|/4, Aging_Index = (Mature+Old)/(Young+Middle), Young_Forest_Deficit = max(0, 25% - Young_Pct)
3. forest_data_organized_volume_class_analysis.csv - Area, volume and stocking density (m³/ha) by stocking class: I-II (Age 1-40), III-IV (Age 41-80), V+ (Age 81+)
4. forest_data_organized_official_pdf_benchmarks.csv - RAW reference values from BULiGL 2017 inventory report for cross-validation

Data provenance tracking: Added ESTIMATED category for model-based carbon stock values.

Input Requirements:
- PDF files: Aktualizacja_YYYY.pdf placed in pdfs/ subfolder, or
- Excel file (.xlsx) / CSV with columns: Species, Year, Tree_Type, Category, Area_ha, Volume_m3

Limitations:
- PDF extraction depends on consistent text layout across annual reports
- Carbon stock values are ESTIMATED (IPCC Tier 1, above-ground biomass only, ±30-50% uncertainty)
- Volume class analysis uses age-based approximation, not volume-based stocking class definitions
- Columns Area_pct and Avg_Age removed (not recoverable from PDF source)

VERSION 1.1

Visualisation files generated using DOI https://doi.org/10.34808/p0nz-dv57

Visualizations (7 PNG files, 2.67 MB, 300 DPI):
1. temporal_trends.png (212 KB) - Area/volume trends 2009-2019
2. diversity_indices.png (429 KB) - 4-panel biodiversity metrics
3. age_structure.png (241 KB) - Age distribution
4. composition_trends.png (177 KB) - Coniferous vs deciduous
5. growth_productivity.png (390 KB) - MAI, CAI, volume/ha
6. species_comparison.png (168 KB) - Species distribution (2019)
7. comprehensive_dashboard.png (654 KB) - 9-panel overview

This dataset contains a Python tool for processing forest inventory data and calculating standard forestry and ecological metrics. The tool reads Excel spreadsheets containing forest inventory data and outputs organized datasets with derived variables. All formulas are based on published literature; see the code documentation for specific citations. The related dataset https://doi.org/10.34808/jks7-0274 provides the processing tool for users with their own inventory data or custom analysis requirements, whereas this dataset provides processed data for immediate use.

Input processing:
- Reads XLSX format inventory file
- Standardizes species names to forestry codes
- Parses age class categories
- Classifies tree types and age categories

Metric calculation:
- Biodiversity indices: Shannon Index (H' = -Σ(pi × ln(pi))), Simpson Index (D = 1 - Σ(pi²)), Pielou's Evenness (J' = H'/ln(S)), Species Richness
- Age structure: Weighted mean age, age class distribution percentages, coefficient of variation
- Composition: Coniferous/deciduous ratios by area and volume, dominant species identification
- Growth: Mean Annual Increment (Volume/(Area × Age)), Current Annual Increment (ΔVolume/Area)
- Temporal: Year-over-year changes in area and volume

Data provenance tracking:
- Tags each derived value with source type
- Categories: RAW, RAW_SUMMED, CALCULATED, COUNTED, DETERMINED, ESTIMATED
- Generates metadata file explaining categories

Implementation Details:
- Language: Python 3.7+
- Dependencies: pandas (≥1.3.0), numpy (≥1.21.0), scipy (≥1.7.0), openpyxl (≥3.0.0)
- Lines of code: ~680
- Processing time: <10 seconds for 1,260 records

Input Requirements:
Excel file (.xlsx) with columns:
- Species (text)
- Year (integer)
- Tree_Type (text)
- Category (text, e.g., "Age_41-60")
- Volume_m3 (float)
- Area_ha (float)

Documentation:
- README: Installation, usage, API reference
- Usage examples: 16 practical scenarios
- Validation report: Detailed test results

Limitations:
- Designed for hierarchical age-class inventory data
- Assumes standard forestry category naming conventions
- Growth metrics require age class information
- MAI calculation depends on age class midpoint estimates
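
The biodiversity formulas listed under Metric calculation can be illustrated with a minimal sketch. It is not the tool's own code: the function name diversity_indices is hypothetical, and it assumes the proportions pi are computed from per-species area (the description does not say whether area or volume is used).

```python
import math


def diversity_indices(areas):
    """Compute Shannon H', Simpson D, Pielou J' and richness S.

    `areas` is a list of per-species areas (assumption: proportions by area).
    Species with zero or negative area are ignored.
    """
    positive = [a for a in areas if a > 0]
    total = sum(positive)
    props = [a / total for a in positive]  # p_i for each species present

    richness = len(props)                            # S
    shannon = -sum(p * math.log(p) for p in props)   # H' = -sum(p_i * ln(p_i))
    simpson = 1.0 - sum(p * p for p in props)        # D = 1 - sum(p_i^2)
    # J' = H'/ln(S) is undefined when S <= 1 because ln(1) = 0.
    pielou = shannon / math.log(richness) if richness > 1 else float("nan")

    return {
        "richness": richness,
        "shannon": shannon,
        "simpson": simpson,
        "pielou": pielou,
    }
```

With four species of equal area, every pi is 0.25, so H' equals ln(4), D equals 0.75, and J' equals 1 (perfect evenness).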

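
Two of the Version 2.0 derived quantities described above, the normal-forest deviation metrics and the CO₂-equivalent conversion, can be sketched as follows. This is an illustration under stated assumptions, not the dataset's implementation: the function names are hypothetical, the inputs are assumed to be area percentages for four age groups (Young, Middle, Mature, Old), and "mean|Actual_Pct_i - 25%|/4" is read as the sum of absolute deviations from 25% divided by the four groups.

```python
def normal_forest_metrics(young, middle, mature, old, target=25.0):
    """Deviation from a theoretical normal forest with equal 25% age groups.

    Inputs are area shares in percent (assumed to sum to roughly 100).
    """
    shares = [young, middle, mature, old]
    # Assumed reading of the formula: mean absolute deviation from the target.
    deviation = sum(abs(s - target) for s in shares) / len(shares)

    younger = young + middle
    # Aging_Index = (Mature + Old) / (Young + Middle); infinite if no young stands.
    aging_index = (mature + old) / younger if younger > 0 else float("inf")

    # Young_Forest_Deficit = max(0, 25% - Young_Pct)
    young_deficit = max(0.0, target - young)

    return {
        "deviation": deviation,
        "aging_index": aging_index,
        "young_forest_deficit": young_deficit,
    }


def co2_equivalent(carbon_t):
    """Convert carbon stock (tC) to CO2 equivalents: CO2-eq = C * 44/12."""
    return carbon_t * 44.0 / 12.0
```

For shares of 10%, 30%, 40% and 20%, the absolute deviations are 15, 5, 15 and 5, giving a deviation of 10 percentage points, an aging index of 60/40 = 1.5, and a young forest deficit of 15 percentage points. The carbon stock itself is not computed here because the description does not spell out the full biomass formula.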

Provider:

Gdańsk University of Technology

Created:

2026-02-24
