cpn60-Classifier v10.1 (Performance testing)
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/cpn60-Classifier_v10_1_Performance_testing_/21972278
Description:
cpn60-Classifier v10.1
(For additional information and releases, visit HillLab on github)
This version of the RDP Classifier, trained on 11,001 reference cpn60 sequences, was used for performance testing. Duplicate sequences were removed from the reference database with the rm-dupseq function of the RDP Classifier, because duplicates can inflate results during classification performance testing.
(An updated release containing additional sequences has been made available since this original investigation)
The release contains training files (taxonomy table and FASTA formatted sequences) as well as the trained classifier for use with RDP Classifier.
RDPTools includes the classifier and can be installed with conda: https://anaconda.org/bioconda/rdptools
Reference: Wang Q, Garrity GM, Tiedje JM, Cole JR. 2007. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol 73:5261–7.
Quick start with the trained classifier
Download cpn60-Classifier_v10_trained.tar.gz and unpack it. The resulting directory should include:
bergeyTrainingTree.xml
genus_wordConditionalProbList.txt
logWordPrior.txt
rRNAClassifier.properties
wordConditionalProbIndexArr.txt
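Before classifying, it can help to confirm that the archive unpacked completely. The following is a minimal sketch, not part of the release: the function name check_trained_dir is hypothetical, and the only thing it relies on is the list of five file names given above.

```shell
# Report which of the five expected classifier files are missing from a directory.
# Prints one "missing: <name>" line per absent file and returns 1 if any is absent.
# Usage (path is a placeholder): check_trained_dir /path/to/cpn60-Classifier_v10_trained
check_trained_dir() {
    ctd_dir="$1"
    ctd_missing=0
    for ctd_file in bergeyTrainingTree.xml \
                    genus_wordConditionalProbList.txt \
                    logWordPrior.txt \
                    rRNAClassifier.properties \
                    wordConditionalProbIndexArr.txt; do
        if [ ! -f "$ctd_dir/$ctd_file" ]; then
            echo "missing: $ctd_file"
            ctd_missing=1
        fi
    done
    return "$ctd_missing"
}
```

If the function prints nothing and returns 0, the directory has every file named in the list.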
A basic command to classify cpn60 sequences contained in a file called queries.fasta:
java -jar /path/to/RDPTools/classifier.jar classify -c 0.9 -f allrank -t /path/to/cpn60-Classifier_v10_trained/rRNAClassifier.properties -o output.txt queries.fasta
See the README here for more details on the RDP Classifier: https://github.com/rdpstaff/classifier
To train the Classifier
Download cpn60-Classifier_v10_training.tar.gz and unpack it. The resulting directory should include:
refseqs_v10.fasta
taxonomytable_v10.txt
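A quick sanity check on the unpacked training files is to count the FASTA records. This is a sketch under one assumption: that refseqs_v10.fasta is the deduplicated set described above, in which case the count would be expected to match the 11,001 sequences stated in the description. The function name count_fasta_records is hypothetical.

```shell
# Count FASTA records by counting definition lines (lines that start with ">").
# Usage: count_fasta_records refseqs_v10.fasta
count_fasta_records() {
    grep -c '^>' "$1"
}
```

A count that differs from the description may indicate an incomplete download or a different release of the reference set.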
Other scripts needed (from https://github.com/GLBRC-TeamMicrobiome/python_scripts, with a minor edit to addFullLineage.py to fix an error):
addFullLineage-jh.py
lineage2taxTrain.py
(If you want to generate your own taxonomy file, see https://pypi.org/project/taxonomy-ranks/)
Make ready-to-train taxonomy:
/path/to/lineage2taxTrain.py taxonomytable_v10.txt > ready2train_taxonomy.txt
Add lineages to fasta sequence definition lines:
/path/to/addFullLineage-jh.py taxonomytable_v10.txt refseqs_v10.fasta > ready2train_refseqs.fasta
Now train:
java -jar /path/to/RDPTools/classifier.jar train -o training_files -s ready2train_refseqs.fasta -t ready2train_taxonomy.txt
The resulting directory (training_files) contains the trained classifier except for one important file, rRNAClassifier.properties, which you need to add manually.
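One way to add the missing file is to copy it from the unpacked trained release, which lists rRNAClassifier.properties among its contents. This is a sketch, not the authors' documented procedure. It assumes that the properties file refers to the other classifier files by the same names that training writes, which the file listing suggests but the description does not state. The function name add_properties is hypothetical.

```shell
# Copy rRNAClassifier.properties from an unpacked trained release into a newly
# trained directory. Assumes the file names inside the properties file match the
# files produced by training (an assumption, not confirmed by the description).
# Usage (paths are placeholders):
#   add_properties /path/to/cpn60-Classifier_v10_trained training_files
add_properties() {
    ap_src="$1"
    ap_dest="$2"
    if [ ! -f "$ap_src/rRNAClassifier.properties" ]; then
        echo "no rRNAClassifier.properties in $ap_src" >&2
        return 1
    fi
    cp "$ap_src/rRNAClassifier.properties" "$ap_dest/"
}
```

After copying, the new directory can be passed to the classify command through -t in the same way as the downloaded trained classifier.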
Created:
2023-04-26



