
Supplement 1. R code and the data set necessary to conduct the Random Forest analysis.

NIAID Data Ecosystem, indexed 2026-03-09
Download link:
https://figshare.com/articles/dataset/Supplement_1_R_code_and_the_data_set_necessary_to_conduct_the_Random_Forest_analysis_/3520547
Resource description:

File List

dreissena_in_lakes_of_belarus.csv (MD5: 3dc2d2f89af3064223358983c785771d)
r_script_random_forest.R (MD5: af1295890d60bc832955e940889e4575)

Description

This supplementary material contains the two files needed to fully reproduce the results obtained with the Random Forest classifier.

The first file, dreissena_in_lakes_of_belarus.csv, is a plain text table of 553 records, each described by the following variables:

1. Lake_Code: numeric code uniquely identifying each lake (for reference only, not used explicitly in the analysis)
2. ZMpresence: indicator of whether a lake is infested with zebra mussel (0 = non-infested, 1 = infested)
3. LAREA: lake area
4. LVOL: lake volume
5. MAXD: maximal depth
6. AVED: average depth
7. SPECWATSHED: specific watershed (i.e., drainage area)
8. TRANSP: Secchi depth
9. COLOR: water color
10. pH: water pH
11. HCO3: HCO3 content
12. SO4: SO4 content
13. Cl: Cl content
14. Ca: Ca content
15. Mg: Mg content
16. TDS: total dissolved solids
17. Fe: Fe content
18. Si: Si content
19. NH4: NH4 content
20. NO2: NO2 content
21. PO4: PO4 content
22. PermOx: permanganate oxidizability
23. N: latitude (decimal degrees)
24. E: longitude (decimal degrees)

Missing values in the data set are denoted as NA.

The second file, r_script_random_forest.R, loads the data into R (assuming that dreissena_in_lakes_of_belarus.csv is stored in the current R working directory), fits the Random Forest model, and plots the results. The analysis relies on four add-on packages: caret, geosphere, randomForest, and ggplot2. All of these packages are assumed to be already installed on the user's computer. If they are not, they can be downloaded freely from the Comprehensive R Archive Network (cran.r-project.org) or installed directly from within R with the following command:

install.packages(c("caret", "geosphere", "randomForest", "ggplot2"))
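
Before running the supplied script, it may help to confirm that the downloaded files are intact and that the table loads as described. The following is a minimal sketch, not the authors' code: the file names, MD5 values, package names, and the ZMpresence column come from the description above, while the helper names md5_matches and load_lakes, the factor labels, and the assumption that the table is comma-separated are illustrative only.

```r
# Sketch only: helper names, factor labels, and the comma separator are
# assumptions; file names, checksums, and package names are from the description.
expected_md5 <- c(
  "dreissena_in_lakes_of_belarus.csv" = "3dc2d2f89af3064223358983c785771d",
  "r_script_random_forest.R"          = "af1295890d60bc832955e940889e4575"
)

# Return TRUE when the file exists and its MD5 checksum equals the expected one.
md5_matches <- function(path, expected) {
  actual <- unname(tools::md5sum(path))  # NA if the file cannot be read
  !is.na(actual) && identical(tolower(actual), tolower(expected))
}

# Read the lake table; the string "NA" marks missing values.
load_lakes <- function(path = "dreissena_in_lakes_of_belarus.csv") {
  lakes <- read.csv(path, na.strings = "NA")
  # Store the 0/1 infestation indicator as a two-level factor (assumed labels).
  lakes$ZMpresence <- factor(lakes$ZMpresence, levels = c(0, 1),
                             labels = c("non_infested", "infested"))
  lakes
}

# Report which of the four add-on packages are not yet installed.
required_pkgs <- c("caret", "geosphere", "randomForest", "ggplot2")
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}

# Load the table only if it is present and its checksum matches.
csv_file <- names(expected_md5)[1]
if (file.exists(csv_file) && md5_matches(csv_file, expected_md5[[csv_file]])) {
  lakes <- load_lakes(csv_file)
  str(lakes)  # inspect the column types before running the analysis script
}
```

If the checksum check fails, re-downloading the file is a reasonable first step. The model fitting and plotting themselves are left to r_script_random_forest.R.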
Created:
2016-08-04