
Binary-Classification Performance Metric-Spaces Data

NIAID Data Ecosystem, indexed 2026-03-11
Download link:
https://data.mendeley.com/datasets/64r4jr8c88
Description:
Metric-space is a concept proposed by Gürol Canbek et al. (2019). A metric-space comprises all possible permutations of the contingency table (confusion matrix) elements that yield the same sample size (Sn). It therefore holds every possible result of a hypothetical classification conducted on a dataset of a given sample size, expressed in terms of one or more performance metrics (e.g., Accuracy, F1, or TPR). A metric-space provides a pseudo-universal space in which metrics can be analyzed and compared with complete coverage. The formal definition and the details are given in the article.

Each data file contains the following 13 performance measures and 13 performance metrics:
* Measures: True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN), Positive (P), Negative (N), Outcome Positive (OP), Outcome Negative (ON), True Classification (TC), False Classification (FC), Sample Size (Sn), Prevalence (PREV), Bias (BIAS)
* Metrics: True Positive Rate (TPR), True Negative Rate (TNR), Positive Predictive Value (PPV), Negative Predictive Value (NPV), Accuracy (ACC), Informedness (INFORM), Markedness (MARK), Balanced Accuracy (BACC), G, Normalized Mutual Information (nMI), F1, Cohen’s Kappa (CK), and Matthews Correlation Coefficient (MCC)

Each data file holds the metric-space for a different Sn value (10, 25, 50, 75, 100, 125, 150, 175, 200, 225). The files are in RData format (compatible with The R Project for Statistical Computing) rather than CSV (comma-separated values) because the CSV files would be very large.
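As an illustration of how the listed measures and metrics relate to the four confusion-matrix cells, the following is a minimal R sketch. The function name confusion_metrics is hypothetical and not part of the dataset, and the formulas are the standard textbook definitions, which are assumed (not confirmed here) to match those used in the article. BIAS, G, and nMI are left out because their exact definitions should be taken from the article.

```r
# Hypothetical helper, not part of the dataset: derive measures and metrics
# for one confusion matrix from standard textbook definitions.
confusion_metrics <- function(TP, FP, FN, TN) {
  # Base measures built from the four confusion-matrix cells
  P  <- TP + FN   # actual positives
  N  <- FP + TN   # actual negatives
  OP <- TP + FP   # predicted (outcome) positives
  ON <- FN + TN   # predicted (outcome) negatives
  TC <- TP + TN   # correct classifications
  FC <- FP + FN   # incorrect classifications
  Sn <- P + N     # sample size

  # Rates; a zero denominator gives NaN in this sketch
  TPR <- TP / P
  TNR <- TN / N
  PPV <- TP / OP
  NPV <- TN / ON
  cross <- TP * TN - FP * FN   # shared numerator term of CK and MCC

  data.frame(
    TP, FP, FN, TN, P, N, OP, ON, TC, FC, Sn,
    PREV   = P / Sn,
    TPR, TNR, PPV, NPV,
    ACC    = TC / Sn,
    INFORM = TPR + TNR - 1,
    MARK   = PPV + NPV - 1,
    BACC   = (TPR + TNR) / 2,
    F1     = 2 * TP / (2 * TP + FP + FN),
    CK     = 2 * cross / (OP * N + P * ON),
    MCC    = cross / sqrt(P * N * OP * ON)
  )
}

# One point of the Sn = 10 metric-space
m <- confusion_metrics(TP = 4, FP = 1, FN = 2, TN = 3)
print(m)
```

Because the arithmetic is vectorized, the same function can be applied to whole columns of TP, FP, FN, and TN values at once.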
Because the files are in RData format, MATLAB users should load them in R and save them as CSV:

> load('MetricSpaces_Sn_10.RData')
> metric_spaces_Sn_10 <- data.frame(TP, FP, FN, TN, P, N, OP, ON, TC, FC, Sn, PREV, BIAS, TPR, TNR, PPV, NPV, ACC, INFORM, MARK, BACC, G, nMI, F1, CK, MCC)
> write.csv(metric_spaces_Sn_10, file='MetricSpaces_Sn_10.csv')

Note that metric-space sizes (numbers of permutations) increase rapidly with Sn: Sn=25 (3,276); Sn=50 (23,426); Sn=75 (76,076); Sn=100 (176,851); Sn=125 (341,376); Sn=150 (585,276); Sn=175 (924,176); Sn=200 (1,373,701); Sn=250 (2,667,126).
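The sizes listed above agree with the number of ways to split Sn among four non-negative counts (TP, FP, FN, TN), which is the binomial coefficient C(Sn+3, 3). The R sketch below checks this by brute force for a small Sn. Both function names are hypothetical, and the enumeration is only an illustration under that assumption, not the authors' generator.

```r
# Closed-form count of (TP, FP, FN, TN) tuples with non-negative
# entries that sum to Sn.
metric_space_size <- function(Sn) choose(Sn + 3, 3)

# Brute-force enumeration for a small Sn (hypothetical helper):
# pick TP, FP, FN freely, then TN is whatever remains.
enumerate_metric_space <- function(Sn) {
  grid <- expand.grid(TP = 0:Sn, FP = 0:Sn, FN = 0:Sn)
  grid$TN <- Sn - grid$TP - grid$FP - grid$FN
  grid <- grid[grid$TN >= 0, ]   # drop combinations that exceed Sn
  rownames(grid) <- NULL
  grid
}

space10 <- enumerate_metric_space(10)
print(nrow(space10))                       # 286 rows for Sn = 10
print(metric_space_size(c(25, 50, 250)))   # 3276 23426 2667126
```

Since the count grows with roughly the cube of Sn, the brute-force grid is practical only for small sample sizes; the closed form is enough to check file sizes.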
Created:
2020-08-12