CTISL: a dynamic stacking multi-class classification approach for identifying cell types from single-cell RNA-sequencing data

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10568905
Description:
Effective identification of cell types is crucial in single-cell RNA-sequencing (scRNA-seq) data analysis. While numerous supervised machine learning-based predictors exist for this purpose, most are single classifiers, leaving room for improved performance. Here, we introduce CTISL (Cell Type Identification by Stacking ensemble Learning), a two-layer stacking model that integrates multiple classifiers to robustly and comprehensively identify cell types from scRNA-seq datasets. CTISL dynamically combines cell type-specific classifiers such as support vector machine (SVM) and logistic regression (LR) as base learners in the first layer, whose outputs are fed into a meta-classifier in the second layer. We conducted 24 benchmarking experiments on 17 human and mouse scRNA-seq datasets, demonstrating CTISL's superior or competitive performance compared to state-of-the-art approaches. The web server for CTISL is accessible at http://bigdata.biocie.cn/CTISLweb/home. The source code of CTISL is also available at https://github.com/wx-cie/CTISL/.

Requirements

Python 3.7.11
scikit-learn 1.0.2
scanpy 1.8.2
numpy 1.19.1
pandas 1.3.5
mlxtend 0.19.0
joblib 1.1.0

Train CTISL on Intra-dataset

To train CTISL on the Intra-dataset, use the following command:

    python Intra_train.py -Name 'CELSeq' -Modelname 'CTISL' -ResultPath './' -Fileform 'h5ad' -Norm True

Parameters:

Name: The Intra-dataset name (e.g., 10Xv2, 10Xv3, etc.).
Modelname: The model name, which can be either 'CTISL' or 'MLP'.
ResultPath: The folder where the model's predicted results will be saved.
Fileform: The gene expression matrix file format, which can be either 'csv' or 'h5ad'.
Norm: Specify whether the raw data needs to be normalized. Use 'True' for normalization.
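The two-layer design described in the summary (SVM and LR base learners whose outputs feed a meta-classifier) can be illustrated with a minimal sketch. This is not the CTISL implementation: it uses scikit-learn's generic StackingClassifier on synthetic data, assumes a logistic-regression meta-classifier, and does not reproduce the dynamic, cell type-specific selection of base learners that CTISL performs.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a (cells x genes) expression matrix with 3 "cell types".
# Real use would load a normalized scRNA-seq matrix and its labels instead.
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# First layer: base learners named in the description (SVM and LR).
base_learners = [
    ("svm", SVC(kernel="linear", probability=True, random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
]

# Second layer: a meta-classifier trained on cross-validated base predictions.
# Logistic regression as the meta-classifier is an assumption of this sketch.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

predicted = stack.predict(X_test)
accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Training the meta-classifier on cross-validated (out-of-fold) base predictions, as StackingClassifier does with cv=5, keeps the second layer from simply memorizing base learners that have already seen the training labels.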

Train CTISL on Inter-data, Cross-batch, Cross-species

To train CTISL on Inter-data, Cross-batch, or Cross-species, use the following command:

    python cross_train.py -Sourcename 'Dendritic_batch1' -Targetname 'Dendritic_batch2' -ResultPath './' -Modelname 'CTISL' -Fileform 'csv' -Norm False

Parameters:

Sourcename: The name of the training set. For example, 'dendritic_batch_1' is used as the training set in the cross-batch experiment.
Targetname: The name of the testing set. For example, 'dendritic_batch_2' is used as the testing set in the cross-batch experiment.
ResultPath: The folder where the model's predicted results will be saved.
Modelname: The model name, which can be either 'CTISL' or 'MLP'.
Fileform: The gene expression matrix file format, which can be either 'csv' or 'h5ad'.
Norm: Specify whether the raw data needs to be normalized. Use 'True' for normalization.

Run the code on your data

To train and predict with your own data, use the following command:

    python demo_train.py -Train [*The path to your training dataset*] -Trainlabel [*The path to your training dataset labels*] \
        -Test [*The path to your testing dataset*] -Predictlabel [*Location for storing predicted labels*] \
        -Modelname [*The model name*] -Fileform [*Your data format*] -Norm [*Is the data standardized?*]

Web Server

We also offer a user-friendly web server where users can directly upload data for training and prediction, at the web server address given in the description above.
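For scripted runs, the demo_train.py invocation described above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the CTISL repository: the function name and the example file paths are assumptions, while the flag names and the allowed values for Modelname and Fileform are taken from the text above. It only builds the command and does not execute it.

```python
import shlex


def build_demo_command(train, train_labels, test, predicted_labels,
                       model_name="CTISL", file_form="csv", norm=True):
    """Build the argument list for demo_train.py (hypothetical helper)."""
    # Allowed values as stated in the parameter descriptions.
    if model_name not in ("CTISL", "MLP"):
        raise ValueError("model_name must be 'CTISL' or 'MLP'")
    if file_form not in ("csv", "h5ad"):
        raise ValueError("file_form must be 'csv' or 'h5ad'")
    return [
        "python", "demo_train.py",
        "-Train", train,
        "-Trainlabel", train_labels,
        "-Test", test,
        "-Predictlabel", predicted_labels,
        "-Modelname", model_name,
        "-Fileform", file_form,
        "-Norm", str(bool(norm)),  # passed as 'True' or 'False'
    ]


# Placeholder paths: replace with the locations of your own files.
cmd = build_demo_command(
    "data/train.csv", "data/train_labels.csv",
    "data/test.csv", "results/predicted_labels.csv",
    norm=False,
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) from a checkout of the repository would launch the script; keeping the arguments as a list avoids shell quoting problems with paths that contain spaces.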
Created: 2024-01-26