Data mining models to predict timber production across Colombian departments. - Dataset
Source: Mendeley Data. Indexed: 2026-05-21.
Download link:
https://data.mendeley.com/datasets/7gjg9s77yp
Description:
This repository contains the data and code supporting the research article "Data mining models to predict timber production across Colombian departments", developed as a master's thesis at Universidad Cooperativa de Colombia. The study applies machine learning and time series techniques to forecast quarterly timber mobilization volumes at the departmental level in Colombia, following the CRISP-DM methodology and using R 4.4.1 in RStudio.
File 1 – Base_de_datos_relacionada_con_madera_movilizada_proveniente_de_Plantaciones_Forestales_Comerciales.xlsx: Raw open-access database published by the Colombian Agricultural Institute (ICA), retrieved from the Colombian Open Data Portal (datos.gov.co). Contains 53,856 records across 9 variables: year, semester, quarter, department, municipality, timber species, product type, data source, and mobilized volume (m³). Covers 28 departments, 699 municipalities, and 144 timber species between 2012 and 2022. Provided without modifications, preserving the original structure as downloaded from the official source.
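As a rough illustration of how record-level rows like those in File 1 could be rolled up into the quarterly departmental series the study forecasts, here is a minimal Python sketch. It is not taken from tesis.R (which is written in R). The field names `year`, `quarter`, `department`, and `volume_m3`, the function `quarterly_totals`, and the sample rows are all hypothetical; the actual column headers in the spreadsheet may differ.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are assumptions for
# illustration only, not actual records or headers from the ICA database.
sample_rows = [
    {"year": 2012, "quarter": 1, "department": "Antioquia", "volume_m3": 120.5},
    {"year": 2012, "quarter": 1, "department": "Antioquia", "volume_m3": 30.0},
    {"year": 2012, "quarter": 2, "department": "Antioquia", "volume_m3": 75.0},
    {"year": 2012, "quarter": 1, "department": "Caldas", "volume_m3": 10.0},
]


def quarterly_totals(rows):
    """Sum mobilized volume per (department, year, quarter)."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["department"], row["year"], row["quarter"])
        totals[key] += row["volume_m3"]
    return dict(totals)


totals = quarterly_totals(sample_rows)
for key in sorted(totals):
    print(key, totals[key])
```

Keying on (department, year, quarter) collapses municipality, species, and product type, so each department ends up with one value per quarter.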
File 2 – tesis.R: Fully commented R script (795 lines) implementing the complete analytical pipeline: data cleaning and preprocessing, spatio-temporal analysis with annual choropleth maps (GADM cartography), missing value imputation using KSSA with automatic best-fit selection (departments with >30% missing values excluded), and predictive modeling fitting five algorithms per department (ARIMA, Prophet, GLMNET, Random Forest, Prophet Boost) with a 90/10 temporal split. Models are evaluated using RMSE, MAE, MAPE, SMAPE, and MASE, and four-quarter-ahead forecasts are generated for each department. Main packages: tidymodels, modeltime, kssa, ggplot2, sf, ranger.
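The 90/10 temporal split and the five evaluation metrics named above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code in tesis.R: the function names are hypothetical, the split simply cuts the series at 90% of its length in chronological order, sMAPE uses the form with the mean of |actual| and |forecast| in the denominator, and MASE is scaled by the in-sample one-step naive error (one common convention; a seasonal variant would use a lag of four quarters).

```python
import math


def temporal_split(series, train_fraction=0.9):
    """Split a chronologically ordered series without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (undefined if an actual is 0)."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual, forecast):
    """Symmetric MAPE, in percent (assumed variant: 2|a-f| / (|a|+|f|))."""
    terms = [2 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)]
    return 100 * sum(terms) / len(terms)


def mase(actual, forecast, train):
    """MAE scaled by the in-sample MAE of a one-step naive forecast."""
    naive_errors = [abs(train[i] - train[i - 1]) for i in range(1, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(actual, forecast) / scale


# Toy usage with made-up numbers (not results from the study).
train, test = temporal_split(list(range(10)))
print(len(train), len(test))
print(mase([100, 200], [110, 180], train=[90, 100, 95, 105]))
```

A MASE below 1 means the model beats the naive forecast on average, which makes it useful for comparing departments whose volumes differ by orders of magnitude.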
Keywords: Timber production; Colombia; Data mining; Machine learning; ARIMA; Random Forest; CRISP-DM; Time series; Open data; R.
Created:
2026-05-09



