
Data mining models to predict timber production across Colombian departments. - Dataset

Mendeley Data, indexed 2026-05-21
Download link:
https://data.mendeley.com/datasets/7gjg9s77yp
Description:
This repository contains the data and code supporting the research article "Data mining models to predict timber production across Colombian departments", developed as a master's thesis at Universidad Cooperativa de Colombia. The study applies machine learning and time series techniques to forecast quarterly timber mobilization volumes at the departmental level in Colombia, following the CRISP-DM methodology in R/RStudio 4.4.1.

File 1 – Base_de_datos_relacionada_con_madera_movilizada_proveniente_de_Plantaciones_Forestales_Comerciales.xlsx: Raw open-access database published by the Colombian Agricultural Institute (ICA) and retrieved from the Colombian Open Data Portal (datos.gov.co). It contains 53,856 records across 9 variables: year, semester, quarter, department, municipality, timber species, product type, data source, and mobilized volume (m³). It covers 28 departments, 699 municipalities, and 144 timber species between 2012 and 2022. The file is provided without modifications and preserves the original structure as downloaded from the official source.

File 2 – tesis.R: Fully commented R script (795 lines) implementing the complete analytical pipeline. The script covers data cleaning and preprocessing; spatio-temporal analysis with annual choropleth maps (GADM cartography); missing value imputation using KSSA with automatic best-fit selection, excluding departments with more than 30% missing values; and predictive modeling that fits five algorithms per department (ARIMA, Prophet, GLMNET, Random Forest, Prophet Boost) with a 90/10 temporal split. Models are evaluated using RMSE, MAE, MAPE, SMAPE, and MASE, and four-quarter-ahead forecasts are generated for each department. Main packages: tidymodels, modeltime, kssa, ggplot2, sf, ranger.

Keywords: Timber production; Colombia; Data mining; Machine learning; ARIMA; Random Forest; CRISP-DM; Time series; Open data; R.
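The description of tesis.R mentions excluding departments with more than 30% missing values, a 90/10 temporal split, and five accuracy metrics. The following is a minimal sketch of those three ideas, written in Python rather than the R used by the script, and it is not taken from tesis.R. The department names, the synthetic series, the naive baseline forecast, and all function names are assumptions made for illustration. The MASE here is scaled by the in-sample error of a one-step naive forecast, which may differ from what a given R package computes.

```python
import math

MAX_MISSING_SHARE = 0.30  # exclusion threshold stated in the description
TRAIN_FRACTION = 0.90     # 90/10 temporal split stated in the description


def missing_share(series):
    """Return the fraction of observations that are missing (None)."""
    return sum(value is None for value in series) / len(series)


def temporal_split(series, train_fraction=TRAIN_FRACTION):
    """Split a series chronologically, without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def accuracy(actual, predicted, train):
    """Compute RMSE, MAE, MAPE, SMAPE and MASE for one forecast."""
    n = len(actual)
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    mape = 100 * sum(e / abs(a) for e, a in zip(abs_err, actual)) / n
    smape = 100 * sum(
        2 * e / (abs(a) + abs(p)) for e, a, p in zip(abs_err, actual, predicted)
    ) / n
    # MASE scales the MAE by the in-sample MAE of a one-step naive forecast.
    naive_mae = sum(
        abs(train[i] - train[i - 1]) for i in range(1, len(train))
    ) / (len(train) - 1)
    return {"rmse": rmse, "mae": mae, "mape": mape,
            "smape": smape, "mase": mae / naive_mae}


# Synthetic quarterly volumes (44 quarters, 2012-2022); NOT values from the dataset.
departments = {
    "dept_a": [100.0 + 2.0 * q for q in range(44)],  # hypothetical, complete series
    "dept_b": [None] * 14 + [50.0] * 30,             # hypothetical, about 32% missing
}

# Keep only departments at or below the missing-value threshold.
kept = [name for name, s in departments.items()
        if missing_share(s) <= MAX_MISSING_SHARE]

train, test = temporal_split(departments["dept_a"])
forecast = [train[-1]] * len(test)  # naive baseline, a stand-in for a fitted model
metrics = accuracy(test, forecast, train)

print(kept)
print(len(train), len(test))
print({name: round(value, 3) for name, value in metrics.items()})
```

The split is taken by position so that the test window always follows the training window in time, which is what distinguishes a temporal split from a random one.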
Created:
2026-05-09