
k-Prototypes Clustering Algorithm

Source: DataCite Commons. Updated: 2025-04-01. Indexed: 2025-04-16.
Download link:
https://data.mendeley.com/datasets/63nyn9tjcd
Description:
The functions used to carry out this work are provided in the files "k-Prototypes Clustering" and "clustMixType modified functions". They obtain and manipulate the data matrix, compute descriptive statistics, determine the best number of clusters, cluster the data with the k-prototypes method, and statistically validate the resulting clusters with MANOVA. An example is included that uses the Iris data set, which ships with R and is widely used to illustrate and validate algorithms written in R.

The functions modified for this work are in the file "clustMixType modified functions". The script "k-Prototypes Clustering.R" calls them on line 41. The kproto.modif(), clprofiles.modif() and summary.kproto.modif() functions were modified from the kproto(), clprofiles() and summary.kproto() functions, respectively, of the clustMixType package, developed by SZEPANNEK (2018). The dist.binary() function of the ade4 package, developed by DRAY & DUFOUR (2017), was also used in kproto.modif(), which can therefore use a variety of similarity functions. The distance between observations is computed with the squared Euclidean distance for numerical variables; for nominal variables, it can be obtained from a variety of similarity coefficients. The fviz_cluster.modif() function was modified from the fviz_cluster() function of the factoextra package, developed by KASSAMBARA & MUNDT (2017).

REFERENCES:
- DRAY, S.; DUFOUR, A.-B. The ade4 Package: Implementing the Duality Diagram for Ecologists. Journal of Statistical Software, v.22, n.4, p.1-20, Sep. 2017. R Package version 1.7-13. Available at: https://CRAN.R-project.org/package=ade4. https://www.doi.org/10.18637/jss.v022.i04.
- KASSAMBARA, A.; MUNDT, F. factoextra: Extract and Visualize the Results of Multivariate Data Analyses. 2017. R Package version 1.0.5. Available at: https://CRAN.R-project.org/package=factoextra.
- SZEPANNEK, G. clustMixType: User-Friendly Clustering of Mixed-Type Data in R. The R Journal, v.10, n.2, p.200-208, 2018. R Package version 0.2-1. Available at: https://CRAN.R-project.org/package=clustMixType. https://www.doi.org/10.32614/RJ-2018-048.
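
The mixed-type distance mentioned in the description can be illustrated with a small sketch. This is not the code from the data set, which is written in R and is not reproduced on this page. It is a minimal Python illustration of the usual k-prototypes idea: squared Euclidean distance on the numerical part plus a weighted dissimilarity on the nominal part. The names `mixed_distance`, `nearest_prototype`, `mismatch_count`, the weight `lam`, and all sample values are assumptions made for illustration. The nominal dissimilarity is passed in as a parameter to mirror the idea that different coefficients can be plugged in; a plain mismatch count stands in for them here.

```python
from typing import Callable, List, Sequence, Tuple

Prototype = Tuple[Sequence[float], Sequence[str]]


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two numerical vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mismatch_count(a: Sequence[str], b: Sequence[str]) -> float:
    """Number of nominal variables whose categories differ (stand-in coefficient)."""
    return float(sum(x != y for x, y in zip(a, b)))


def mixed_distance(
    num_x: Sequence[float],
    cat_x: Sequence[str],
    num_p: Sequence[float],
    cat_p: Sequence[str],
    lam: float = 1.0,
    cat_dissim: Callable[[Sequence[str], Sequence[str]], float] = mismatch_count,
) -> float:
    """Numerical part plus lam times the nominal part (lam is a hypothetical weight)."""
    return squared_euclidean(num_x, num_p) + lam * cat_dissim(cat_x, cat_p)


def nearest_prototype(
    num_x: Sequence[float],
    cat_x: Sequence[str],
    prototypes: List[Prototype],
    lam: float = 1.0,
    cat_dissim: Callable[[Sequence[str], Sequence[str]], float] = mismatch_count,
) -> int:
    """Index of the prototype with the smallest mixed distance to the observation."""
    distances = [
        mixed_distance(num_x, cat_x, num_p, cat_p, lam, cat_dissim)
        for num_p, cat_p in prototypes
    ]
    return distances.index(min(distances))


# Made-up prototypes and observation: two numerical variables, one nominal variable.
prototypes: List[Prototype] = [
    ((5.0, 3.4), ("short",)),
    ((6.5, 3.0), ("long",)),
]
observation_num = (5.1, 3.5)
observation_cat = ("short",)

d0 = mixed_distance(observation_num, observation_cat, *prototypes[0], lam=0.5)
d1 = mixed_distance(observation_num, observation_cat, *prototypes[1], lam=0.5)
assigned = nearest_prototype(observation_num, observation_cat, prototypes, lam=0.5)
print(d0, d1, assigned)
```

Passing a different function as `cat_dissim` changes only the nominal term, so the numerical term and the assignment step stay the same whichever coefficient is chosen.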
Provider:
Mendeley
Created:
2020-01-24