Supplementary information for: NUMT PARSER: Automated identification and removal of nuclear mitochondrial pseudogenes (numts) for accurate mitochondrial genome reconstruction in Panthera
Source: DataCite Commons. Updated: 2026-03-05. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.6t1g1jx33
Description:
Nuclear mitochondrial pseudogenes (numts) may hinder the reconstruction of
mtDNA genomes and affect the reliability of mtDNA datasets for
phylogenetic and population genetic comparisons. Here, we present the
program Numt Parser, which allows for the identification of DNA sequences
that likely originate from numt pseudogene DNA. Sequencing reads are
classified as originating from either numt or true cytoplasmic
mitochondrial (cymt) DNA by direct comparison against cymt and numt
reference sequences. Classified reads can then be parsed into cymt or numt
datasets. We tested this program using whole genome shotgun-sequenced data
from two ancient Cape lions (Panthera leo) because mtDNA is often the
marker of choice for ancient DNA studies, and the genus Panthera is known
to have numt pseudogenes. Numt Parser decreased sequence disagreements
that were likely due to numt pseudogene contamination and equalized read
coverage across the mitogenome by removing reads that likely originated
from numts. We compared the efficacy of Numt Parser to two other
bioinformatic approaches that can be used to account for numt
contamination. We found that Numt Parser outperformed approaches that rely
only on read alignment or Basic Local Alignment Search Tool (BLAST)
properties, and was effective at identifying sequences that likely
originated from numts while having minimal impacts on the recovery of cymt
reads. Numt Parser therefore improves the reconstruction of true
mitogenomes, allowing for more accurate and robust biological inferences.
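
The classification step described above (comparing each read against both a cymt and a numt reference, then parsing reads into separate datasets) can be illustrated with a minimal sketch. This is not the Numt Parser implementation: the function names, the equal-length pre-aligned segments, the simple per-base identity score, and the `min_diff` threshold are all assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Tuple


def identity(read: str, ref_segment: str) -> float:
    """Fraction of matching positions between a read and an
    equal-length reference segment (assumed already aligned, no gaps)."""
    if not read or len(read) != len(ref_segment):
        raise ValueError("read and reference segment must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(read.upper(), ref_segment.upper()))
    return matches / len(read)


def classify_read(read: str, cymt_segment: str, numt_segment: str,
                  min_diff: float = 0.0) -> str:
    """Label a read 'cymt', 'numt', or 'ambiguous' by comparing its
    identity to each reference. `min_diff` is a hypothetical margin
    by which one identity must exceed the other."""
    cymt_id = identity(read, cymt_segment)
    numt_id = identity(read, numt_segment)
    if cymt_id - numt_id > min_diff:
        return "cymt"
    if numt_id - cymt_id > min_diff:
        return "numt"
    return "ambiguous"


def parse_reads(reads: Iterable[Tuple[str, str]], cymt_segment: str,
                numt_segment: str) -> Dict[str, List[str]]:
    """Split (name, sequence) pairs into cymt, numt, and ambiguous bins."""
    bins: Dict[str, List[str]] = {"cymt": [], "numt": [], "ambiguous": []}
    for name, seq in reads:
        bins[classify_read(seq, cymt_segment, numt_segment)].append(name)
    return bins


# Toy references that differ at two positions (made-up sequences).
CYMT = "ACGTACGTAC"
NUMT = "ACGTTCGTAA"
reads = [
    ("read1", "ACGTACGTAC"),  # identical to the cymt segment
    ("read2", "ACGTTCGTAA"),  # identical to the numt segment
    ("read3", "ACGTACGTAA"),  # one mismatch to each reference
]
result = parse_reads(reads, CYMT, NUMT)
```

In practice, reads would be aligned to the references with a read mapper and compared using alignment statistics rather than raw per-base identity; the sketch only shows the compare-then-bin logic, with ties kept in a separate bin rather than forced into either dataset.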
Provider:
Dryad
Created:
2022-12-06



