mbX Pro -- pre-trained Greengenes2 Naive-Bayes classifiers (QIIME2 2023.2 - 2025.4)
Source: DataCite Commons. Updated 2026-05-04; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20021034
Resource description:
# mbX Pro -- pre-trained Greengenes2 Naive-Bayes classifiers
This bundle contains 8 pre-trained full-length **Greengenes2 Naive-Bayes taxonomic classifiers**, one per QIIME2 release from 2023.2 through 2025.4. They are designed to be downloaded by the [mbX Pro pipeline](https://github.com/) (or used manually with QIIME2) so that end users do not have to spend 30-90 minutes training a classifier locally before classifying 16S rRNA reads.
Trained on **macOS, 2026-05-01**, using each QIIME2 version's pinned `scikit-learn` build to guarantee pickle compatibility.
---
## What problem does this solve?
QIIME2's `qiime feature-classifier classify-sklearn` needs a trained classifier (a `.qza` artifact wrapping a `sklearn.naive_bayes.MultinomialNB` pickle plus its taxonomy reference). Training one on the full Greengenes2 backbone takes **30-90 minutes** on a laptop and needs **8-32 GB RAM**, and it must be re-trained every time you upgrade QIIME2 because sklearn pickles are not stable across major sklearn versions.
Pre-training them once and hosting them for download cuts that to a **~30-second download** each time.
---
## Which file do I need?
Find your **QIIME2 release** in the table and use the corresponding classifier file:

| Your QIIME2 release | Use this file | GG2 ref | sklearn |
|---------------------|---------------|---------|---------|
| 2023.2 | `gg2-2022.10-full-length-naive-bayes-qiime2-2023.2.qza` | 2022.10 | 0.24.1 |
| 2023.5 | `gg2-2022.10-full-length-naive-bayes-qiime2-2023.5.qza` | 2022.10 | 0.24.1 |
| 2023.7 | `gg2-2022.10-full-length-naive-bayes-qiime2-2023.7.qza` | 2022.10 | 0.24.1 |
| 2023.9 | `gg2-2022.10-full-length-naive-bayes-qiime2-2023.9.qza` | 2022.10 | 0.24.1 |
| 2024.2 | `gg2-2022.10-full-length-naive-bayes-qiime2-2024.2.qza` | 2022.10 | 0.24.1 |
| 2024.5 | `gg2-2024.09-full-length-naive-bayes-qiime2-2024.5.qza` | 2024.09 | 1.4.2 |
| 2024.10 | `gg2-2024.09-full-length-naive-bayes-qiime2-2024.10.qza` | 2024.09 | 1.4.2 |
| 2025.4 | `gg2-2024.09-full-length-naive-bayes-qiime2-2025.4.qza` | 2024.09 | 1.4.2 |

**Why two GG2 references?** QIIME2 versions before 2024.5 cannot read the 2024.09 GG2 backbone (different artifact schema), so older QIIME2 releases get the 2022.10 GG2; newer releases get the latest 2024.09 GG2.
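The release-to-file mapping can also be written as a small lookup. The following is a minimal sketch, not part of the bundle: `gg2_classifier_for` is a hypothetical helper name, and it only rebuilds the filenames from the pattern used for the eight listed releases.

```bash
# Hypothetical helper (not shipped with the bundle): print the classifier
# filename for a QIIME2 release, following the listed filename pattern.
gg2_classifier_for() {
    gg2_release="$1"
    case "$gg2_release" in
        # Releases that get the 2022.10 GG2 reference
        2023.2|2023.5|2023.7|2023.9|2024.2) gg2_ref="2022.10" ;;
        # Releases that get the 2024.09 GG2 reference
        2024.5|2024.10|2025.4)              gg2_ref="2024.09" ;;
        *)
            echo "no pre-trained classifier listed for QIIME2 $gg2_release" >&2
            return 1
            ;;
    esac
    printf 'gg2-%s-full-length-naive-bayes-qiime2-%s.qza\n' "$gg2_ref" "$gg2_release"
}
```

For example, `gg2_classifier_for 2024.5` prints the 2024.09-based filename, while an unlisted release makes the function return a non-zero status so a calling script can fall back to another option.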
**What if my QIIME2 version isn't in this list?** Use the classifier for the closest listed release that has the same `sklearn` major version -- it will load. Failing that, train your own with `qiime feature-classifier fit-classifier-naive-bayes`.
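If you do need to train your own, the call looks roughly like this. It is a minimal sketch, assuming you already have the Greengenes2 reference sequences and taxonomy as QIIME2 artifacts. The wrapper name `train_nb_classifier` and the file names in the usage line are hypothetical; the three options are the standard inputs and output of `fit-classifier-naive-bayes`.

```bash
# Hypothetical wrapper around the QIIME2 training command.
#   $1: reference sequences artifact (FeatureData[Sequence])
#   $2: reference taxonomy artifact  (FeatureData[Taxonomy])
#   $3: output path for the trained classifier
train_nb_classifier() {
    qiime feature-classifier fit-classifier-naive-bayes \
        --i-reference-reads "$1" \
        --i-reference-taxonomy "$2" \
        --o-classifier "$3"
}
```

A call such as `train_nb_classifier gg2-seqs.qza gg2-taxonomy.qza my-classifier.qza` should be run inside the same QIIME2 environment that will later do the classification, so that the pickle matches the installed `sklearn`.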
---
## How to use the file
```bash
# 1. Download the matching .qza
#    (replace the <...> placeholder with the download URL of your file)
curl -L -o gg2-classifier.qza <link that (url in my email ul54354@gmail.com)>

# 2. (Optional) verify SHA-256 against MANIFEST.tsv
#    The two printed hashes should be identical. Adjust the grep pattern
#    to the file's name as it is listed in MANIFEST.tsv.
shasum -a 256 gg2-classifier.qza | awk '{print $1}'
grep gg2-classifier MANIFEST.tsv | awk -F$'\t' '{print $6}'

# 3. Classify your representative sequences
qiime feature-classifier classify-sklearn \
  --i-classifier gg2-classifier.qza \
  --i-reads representative_sequences.qza \
  --o-classification taxonomy.qza \
  --p-n-jobs 4
```
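The optional check in step 2 prints two hashes for comparison by eye. A sketch that compares them automatically is given here, assuming (as in step 2) that `MANIFEST.tsv` is tab-separated with the SHA-256 in column 6. `verify_against_manifest` is a hypothetical helper, and finding the manifest row by a substring match on the file name is an assumption about the manifest's layout.

```bash
# Hypothetical helper: succeed only if FILE's SHA-256 equals column 6 of
# the first manifest line that contains NAME.
#   $1: downloaded file   $2: name to look up   $3: manifest (default MANIFEST.tsv)
verify_against_manifest() {
    vm_file="$1"
    vm_name="$2"
    vm_manifest="${3:-MANIFEST.tsv}"

    # sha256sum is common on Linux; shasum ships with macOS.
    if command -v sha256sum >/dev/null 2>&1; then
        vm_actual=$(sha256sum "$vm_file" | awk '{print $1}')
    else
        vm_actual=$(shasum -a 256 "$vm_file" | awk '{print $1}')
    fi

    # Column 6 of the first tab-separated line mentioning the name.
    vm_expected=$(awk -F'\t' -v name="$vm_name" \
        'index($0, name) { print $6; exit }' "$vm_manifest")

    if [ -n "$vm_expected" ] && [ "$vm_actual" = "$vm_expected" ]; then
        echo "OK: $vm_file matches $vm_manifest"
    else
        echo "MISMATCH: $vm_file (expected '$vm_expected', got '$vm_actual')" >&2
        return 1
    fi
}
```

For example, `verify_against_manifest gg2-classifier.qza gg2-2024.09-full-length-naive-bayes-qiime2-2024.5.qza` looks up the row by the file's published name even though the download was saved under a shorter local name.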
If you are running the mbX Pro pipeline, the appropriate file is downloaded automatically based on your QIIME2 version. You do **not** need to download it manually.
Provider:
Zenodo
Created:
2026-05-04



