
Gaussian Mixture high-dimensional datasets

Indexed from doi.org on 2025-03-26
Download link:
http://doi.org/10.17632/xj9vyybgzn.1
Description:
Seven Gaussian-shaped clustered datasets. For each dataset, n points with p dimensions were generated from a mixture of c Gaussian distributions (clusters). The cluster means were drawn at random from values between 0 and 10 and arranged in a c x p matrix. The spread of each cluster is described by a p x p covariance matrix generated from normally distributed random numbers. Finally, given the c x p matrix of means and the c covariance matrices of size p x p, the mvrnorm function of the MASS library in R was used to draw the samples that make up each Gaussian mixture dataset.

Properties of the seven Gaussian mixture datasets:

Dataset | p | n | c
--- | --- | --- | ---
Gaussian.k8 | 657 | 81 | 8
Gaussian.k2 | 4,232 | 181 | 2
Gaussian.k7 | 4,514 | 128 | 7
Gaussian.k9 | 5,041 | 108 | 9
Gaussian.k6 | 5,176 | 143 | 6
Gaussian.k5 | 6,203 | 130 | 5
Gaussian.k4 | 6,615 | 168 | 4

The cluster label of each object is in the last column of the dataset. Columns are separated by commas, and the files contain no column names or row names.
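The generation procedure and file layout described above can be sketched in Python. This is an illustrative analogue, not the authors' R code: NumPy's `multivariate_normal` stands in for `mvrnorm`, and the function names are hypothetical. Three details are assumptions, because the description does not specify them: each covariance matrix is built as A A^T from a normal random matrix A so that it is a valid covariance, points are assigned to clusters uniformly at random, and labels are encoded as integers 1..c. A small p is used here to keep the example fast.

```python
import io
import numpy as np


def make_gaussian_mixture(n, p, c, seed=0):
    """Draw n points in p dimensions from a mixture of c Gaussians."""
    rng = np.random.default_rng(seed)
    # Cluster means: a c x p matrix of values between 0 and 10.
    means = rng.uniform(0.0, 10.0, size=(c, p))
    # Assumption: uniform random cluster assignment, labels 1..c.
    labels = rng.integers(1, c + 1, size=n)
    X = np.empty((n, p))
    for k in range(1, c + 1):
        # Assumption: A @ A.T turns a normal random matrix into a
        # symmetric positive semi-definite covariance matrix.
        A = rng.standard_normal((p, p))
        cov = A @ A.T
        idx = np.flatnonzero(labels == k)
        if idx.size:
            X[idx] = rng.multivariate_normal(means[k - 1], cov, size=idx.size)
    return X, labels


def to_csv_text(X, labels):
    """Serialize in the described layout: comma-separated, no header,
    no row names, cluster label in the last column."""
    buf = io.StringIO()
    np.savetxt(buf, np.column_stack([X, labels]), delimiter=",", fmt="%.6g")
    return buf.getvalue()


def load_dataset(source):
    """Read a file in the described layout; return (features, labels)."""
    data = np.loadtxt(source, delimiter=",", ndmin=2)
    return data[:, :-1], data[:, -1].astype(int)


# Round trip on a small synthetic example (n and c follow Gaussian.k8, p reduced).
X, y = make_gaussian_mixture(n=81, p=20, c=8)
X2, y2 = load_dataset(io.StringIO(to_csv_text(X, y)))
print(X2.shape, y2.shape)
```

`load_dataset` also accepts a file path, so it should work on a downloaded file as long as that file follows the layout described above.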

Provider:
doi.org