
mbX Pro -- pre-trained Greengenes2 Naive-Bayes classifiers (QIIME2 2023.2 - 2025.4)

Source: DataCite Commons. Updated 2026-05-04. Indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20021034
Description:
# mbX Pro -- pre-trained Greengenes2 Naive-Bayes classifiers

This bundle contains 8 pre-trained full-length **Greengenes2 Naive-Bayes taxonomic classifiers**, one per QIIME2 release from 2023.2 through 2025.4. They are designed to be downloaded by the [mbX Pro pipeline](https://github.com/) (or used manually with QIIME2) so that end users do not have to spend 30-90 minutes training a classifier locally before classifying 16S rRNA reads.

Trained on **macOS, 2026-05-01**, using each QIIME2 version's pinned `scikit-learn` build to guarantee pickle compatibility.

---

## What problem does this solve?

QIIME2's `qiime feature-classifier classify-sklearn` needs a trained classifier (a `.qza` artifact wrapping a `sklearn.naive_bayes.MultinomialNB` pickle plus its taxonomy reference). Training one on the full Greengenes2 backbone takes **30-90 minutes** on a laptop and **8-32 GB RAM**, and it must be re-trained every time you upgrade QIIME2 because sklearn pickles are not stable across major sklearn versions.

Pre-training them once and hosting them as a download cuts that to a **~30-second download** every time.

---

## Which file do I need?

Match your **QIIME2 release** to the matching classifier file:

| Your QIIME2 release | Use this file | GG2 ref | sklearn |
|---------------------|---------------|---------|---------|
| 2023.2              | `gg2-2022.10-full-length-naive-bayes-qiime2-2023.2.qza`  | 2022.10 | 0.24.1 |
| 2023.5              | `gg2-2022.10-full-length-naive-bayes-qiime2-2023.5.qza`  | 2022.10 | 0.24.1 |
| 2023.7              | `gg2-2022.10-full-length-naive-bayes-qiime2-2023.7.qza`  | 2022.10 | 0.24.1 |
| 2023.9              | `gg2-2022.10-full-length-naive-bayes-qiime2-2023.9.qza`  | 2022.10 | 0.24.1 |
| 2024.2              | `gg2-2022.10-full-length-naive-bayes-qiime2-2024.2.qza`  | 2022.10 | 0.24.1 |
| 2024.5              | `gg2-2024.09-full-length-naive-bayes-qiime2-2024.5.qza`  | 2024.09 | 1.4.2  |
| 2024.10             | `gg2-2024.09-full-length-naive-bayes-qiime2-2024.10.qza` | 2024.09 | 1.4.2  |
| 2025.4              | `gg2-2024.09-full-length-naive-bayes-qiime2-2025.4.qza`  | 2024.09 | 1.4.2  |

**Why two GG2 references?** QIIME2 versions before 2024.5 cannot read the 2024.09 GG2 backbone (different artifact schema), so older QIIME2 releases get the 2022.10 GG2; newer releases get the latest 2024.09 GG2.

**What if my QIIME2 version isn't in this list?** Use the closest released version's classifier (matching `sklearn` major version) -- it will load. Failing that, train your own with `qiime feature-classifier fit-classifier-naive-bayes`.

---

## How to use the file

```bash
# 1.  Download the matching .qza
curl -L -o gg2-classifier.qza <link that (url in my email ul54354@gmail.com)>

# 2.  (Optional) verify SHA-256 against MANIFEST.tsv
shasum -a 256 gg2-classifier.qza | awk '{print $1}'
grep gg2-classifier MANIFEST.tsv | awk -F$'\t' '{print $6}'

# 3.  Classify your representative sequences
qiime feature-classifier classify-sklearn \
  --i-classifier gg2-classifier.qza \
  --i-reads representative_sequences.qza \
  --o-classification taxonomy.qza \
  --p-n-jobs 4
```

If you are running the mbX Pro pipeline, the appropriate file is downloaded automatically based on your QIIME2 version. You do **not** need to download it manually.
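
For manual use, the release-to-file mapping in the table can be scripted as a small lookup. The sketch below is only an illustration: `classifier_for` is a hypothetical helper name, not part of mbX Pro or QIIME2, and it assumes the release string (for example `2024.5`) is passed in explicitly rather than detected from the installed QIIME2.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of mbX Pro or QIIME2): print the classifier
# file name for a given QIIME2 release, following the table in this README.
classifier_for() {
  local release="$1" gg2
  case "$release" in
    # Releases before 2024.5 use the 2022.10 Greengenes2 reference.
    2023.2|2023.5|2023.7|2023.9|2024.2) gg2="2022.10" ;;
    # Releases from 2024.5 onward use the 2024.09 Greengenes2 reference.
    2024.5|2024.10|2025.4)              gg2="2024.09" ;;
    *)
      echo "no pre-trained classifier listed for QIIME2 $release" >&2
      return 1
      ;;
  esac
  printf 'gg2-%s-full-length-naive-bayes-qiime2-%s.qza\n' "$gg2" "$release"
}

# Example: look up the file for QIIME2 2024.5.
classifier_for 2024.5
```

For an unlisted release the function returns a non-zero status, which leaves the choice between the closest listed classifier and training one locally to the caller.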
Provider:
Zenodo
Created:
2026-05-04