Data Sheet 1_The classification method of donkey breeds based on SNPs data and machine learning.csv
Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://figshare.com/articles/dataset/Data_Sheet_1_The_classification_method_of_donkey_breeds_based_on_SNPs_data_and_machine_learning_csv/28757729
Description:
A method for accurately classifying donkey breeds has been developed by integrating single nucleotide polymorphism (SNP) data with machine learning algorithms. The approach preprocesses donkey genomic sequencing data, addresses class imbalance with the Synthetic Minority Over-sampling Technique (SMOTE), and partitions the dataset with an improved Leave-One-Out Cross-Validation (LOOCV) scheme. Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Random Forest (RF) models were constructed and evaluated. The results show that the chromosome from which the SNPs are drawn significantly influences classifier performance. For instance, Chr2 gave the highest classification accuracy with KNN, while Chr19 performed best with the SVM and RF models. After data quality was enhanced and class imbalance was addressed, classification performance improved substantially: accuracy, precision, recall, and F1 score increased by up to 15% in certain models, particularly on key chromosomes. This method offers an effective solution for donkey breed classification and provides technical support for the conservation and development of donkey genetic resources.
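The resampling and validation steps named in the description can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. It assumes plain LOOCV rather than the "improved" variant (whose details are not given here), uses only a k-nearest-neighbour classifier (SVM and RF are omitted), and applies SMOTE inside each training fold so that the held-out sample never influences the synthetic data, which is a design choice of this sketch. The function names, the breed labels, and the toy genotype matrix (SNPs coded as 0/1/2 allele counts) are all hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two genotype vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(X, y, k=3, seed=0):
    """Oversample each minority class up to the majority class size.

    A synthetic sample is a random point on the segment between a
    minority sample and one of its k nearest same-class neighbours.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # nothing to interpolate towards
        for _ in range(target - n):
            base = rng.choice(members)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: sq_dist(base, m),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training samples closest to x."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: sq_dist(p[0], x))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]


def loocv_accuracy(X, y, k=3):
    """Plain leave-one-out CV; SMOTE is fitted on the training fold only."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        X_bal, y_bal = smote(X_train, y_train, k=k, seed=i)
        if knn_predict(X_bal, y_bal, X[i], k=k) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: 4 SNPs per animal, 6 samples of one breed, 3 of another.
X = [
    [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1],
    [1, 0, 0, 0], [0, 0, 0, 0], [0, 1, 1, 0],
    [2, 2, 1, 2], [2, 1, 2, 2], [2, 2, 2, 1],
]
y = ["breed_A"] * 6 + ["breed_B"] * 3

print(Counter(smote(X, y)[1]))  # class counts after balancing
print(loocv_accuracy(X, y))     # fraction of held-out samples classified correctly
```

Note that interpolation produces fractional values for genotype codes that were originally integers. Whether such values are acceptable, or should be rounded, depends on the downstream model and is not specified in the description.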
Created:
2025-04-09



