Evaluation of recombination detection methods for viral sequencing
Source: DataCite Commons | Updated: 2025-05-01 | Indexed: 2025-04-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.d7wm37q6f
Description:
Recombination is a key evolutionary driver in shaping novel viral
populations and lineages. When unaccounted for, recombination can impact
evolutionary estimations, or complicate their interpretation. Therefore,
identifying signals for recombination in sequencing data is a key
prerequisite to further analyses. A repertoire of recombination detection
methods has been developed over the past two decades; however, the
prevalence of pandemic-scale viral sequencing data poses a computational
challenge for existing methods. Here, we assessed five recombination
detection methods (PhiPack (Profile), 3SEQ, GENECONV, VSEARCH (UCHIME),
and gmos) to determine if any are suitable for the analysis of bulk
sequencing data. To test the performance and scalability of these methods,
we analysed simulated viral sequencing data across a range of sequence
diversities, recombination frequencies, and sample sizes. Further, we
provide a practical example for the analysis and validation of empirical
data. We find that recombination detection methods need to be scalable,
use an analytical approach and resolution that is suitable for the
intended research application, and be accurate for the properties of a
given dataset (e.g. sequence diversity and estimated recombination
frequency). Analysis of simulated and empirical data revealed that the
assessed methods exhibited considerable trade-offs between these criteria.
Overall, we provide general guidelines for the validation of recombination
detection results, the benefits and shortcomings of each assessed method,
and future considerations for recombination detection methods for the
assessment of large-scale viral sequencing data.
Provider:
Dryad
Created:
2023-11-27



