Dataset60: High-dimensional Datasets for Feature Selection

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/10471604
Description:
Dataset60 comprises 60 high-dimensional datasets sourced from open repositories, systematically curated and formatted to establish a new benchmark for evaluating feature selection algorithms. A summary of each dataset, including its name, number of samples (n_sample), features (n_feature), classes (n_class), and quantity/proportion for each label (label_distribution), is available in the "summary_dataset60.csv" file.

Datasets are obtained from various sources:
- https://jundongl.github.io/scikit-feature/datasets.html
- https://zenodo.org/records/2709491
- https://archive.ics.uci.edu/datasets
- https://data.mendeley.com/datasets/fhx5zgx2zj/1
- https://ckzixf.github.io/dataset.html

The preprocessing steps include:
- Removing an index/id column if present.
- Encoding labels numerically, starting from 0 (0/1/2/…).
- Renaming headers to f{feature_index} and label (f1, f2, …, label).
- Addressing missing values: following instructions in the README/Dataset Info if available; otherwise, filling missing values with 0.

Important note: The data has NOT undergone standardization.
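
The preprocessing steps and summary fields described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the curators' actual script: the toy CSV, the raw column names (id, gene_a, gene_b, class), the helper names (preprocess, summarize, standardize), and the rule of assigning label codes in sorted order are all assumptions. Because the published data is not standardized, the sketch also shows one common option, a per-feature z-score, that a user might apply before running a feature selection algorithm.

```python
import csv
import io
from collections import Counter


def preprocess(csv_text, id_column=None, label_column="class"):
    """Apply the four listed steps to raw CSV text.

    id_column and label_column are hypothetical raw-file names,
    not names taken from Dataset60.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Step 1: drop the index/id column (if any) from the feature set.
    feature_cols = [c for c in rows[0] if c not in (id_column, label_column)]

    # Step 2: encode labels as 0, 1, 2, ...; sorted order is an assumption.
    classes = sorted({r[label_column] for r in rows})
    label_code = {name: i for i, name in enumerate(classes)}

    # Step 3: rename headers to f1, f2, ..., label.
    header = [f"f{i}" for i in range(1, len(feature_cols) + 1)] + ["label"]

    data = []
    for r in rows:
        # Step 4: empty cells are treated as missing and filled with 0.
        features = [float(r[c]) if r[c] != "" else 0.0 for c in feature_cols]
        data.append(features + [label_code[r[label_column]]])
    return header, data


def summarize(header, data):
    """Compute the fields named in the description for one dataset."""
    counts = Counter(row[-1] for row in data)
    return {
        "n_sample": len(data),
        "n_feature": len(header) - 1,
        "n_class": len(counts),
        # label -> (quantity, proportion)
        "label_distribution": {
            k: (v, v / len(data)) for k, v in sorted(counts.items())
        },
    }


def standardize(data):
    """Z-score each feature column; constant columns are set to 0."""
    n, n_feature = len(data), len(data[0]) - 1
    out = [row[:] for row in data]
    for j in range(n_feature):
        col = [row[j] for row in data]
        mean = sum(col) / n
        std = (sum((x - mean) ** 2 for x in col) / n) ** 0.5
        for i in range(n):
            out[i][j] = (col[i] - mean) / std if std > 0 else 0.0
    return out


# Toy raw file (made up for illustration) with an id column and one missing value.
raw = """id,gene_a,gene_b,class
s1,1.0,,tumor
s2,3.0,4.0,normal
s3,5.0,8.0,tumor
"""

header, data = preprocess(raw, id_column="id", label_column="class")
summary = summarize(header, data)
scaled = standardize(data)
print(header)
print(summary)
```

In practice the statistics used for standardization would normally be computed on the training split only and then reused on the test split, so that the evaluation of a feature selection method is not affected by information from held-out samples.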
Created:
2024-04-12