
Supporting data for "Katdetectr: An R/Bioconductor package utilizing unsupervised changepoint analysis for robust kataegis detection"

Source: DataCite Commons (updated 2025-05-26, indexed 2024-07-13)
Download link:
http://gigadb.org/dataset/102445
Description:
Kataegis refers to the occurrence of regional genomic hypermutation in cancer and has been observed in a wide range of malignancies. A kataegis locus is a genomic region with a high mutation rate, i.e., a higher frequency of closely interspersed somatic variants than the overall mutational background. Kataegis has been shown to be biologically significant and is possibly clinically relevant. An accurate and robust workflow for kataegis detection is therefore paramount.

Here we present Katdetectr, an open-source R/Bioconductor-based package for the robust yet flexible and fast detection of kataegis loci in genomic data. In addition, Katdetectr houses functionalities to characterize and visualize kataegis and provides results in a standardized format useful for subsequent analysis. In brief, Katdetectr imports industry-standard formats (MAF, VCF, and VRanges), determines the intermutation distance of the genomic variants, and performs unsupervised changepoint analysis using the Pruned Exact Linear Time (PELT) search algorithm, followed by kataegis calling according to user-defined parameters.

We used synthetic data and an a priori labeled pan-cancer dataset of whole-genome-sequenced malignancies to evaluate the performance of Katdetectr and five publicly available kataegis detection packages. Our performance evaluation shows that Katdetectr is robust with respect to tumor mutational burden (TMB) and has the fastest mean computation time. Additionally, Katdetectr achieves the highest accuracy (0.99, 0.99) and normalized Matthews Correlation Coefficient (0.98, 0.92) of all evaluated tools for both datasets.

Katdetectr is a robust workflow for the detection, characterization, and visualization of kataegis and is available on Bioconductor: https://doi.org/doi:10.18129/B9.bioc.katdetectr
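The three steps named in the description (intermutation distances, PELT changepoint analysis, kataegis calling) can be illustrated with a minimal Python sketch. This is not the Katdetectr implementation, which is an R package. The function names, the exponential segment cost, the penalty of 2·log(n), the minimum segment length of 2, and the calling thresholds (mean intermutation distance of at most 1000 bp, at least 6 variants) are all assumptions chosen for illustration, and the sketch handles a single chromosome only.

```python
import math


def intermutation_distances(positions):
    """Sort variant positions (one chromosome) and return them with the
    distances between consecutive variants."""
    ordered = sorted(positions)
    return ordered, [b - a for a, b in zip(ordered, ordered[1:])]


def pelt_changepoints(values, penalty, min_size=2):
    """Sketch of PELT: penalised optimal partitioning with candidate pruning.

    Returns the sorted indices at which a new segment of `values` starts.
    The segment cost is an assumption: twice the negative maximised
    log-likelihood of an exponential distribution.
    """
    n = len(values)
    prefix = [0]
    for v in values:
        prefix.append(prefix[-1] + v)

    def cost(start, end):
        k = end - start
        mean = max((prefix[end] - prefix[start]) / k, 1e-9)  # guard log(0)
        return 2.0 * k * (math.log(mean) + 1.0)

    best = [math.inf] * (n + 1)   # best[t]: optimal penalised cost of values[:t]
    best[0] = -penalty
    last = [0] * (n + 1)          # last[t]: start of the final segment in that optimum
    candidates = [0]              # admissible, not yet pruned, segment starts
    for end in range(min_size, n + 1):
        scored = [(best[s] + cost(s, end) + penalty, s) for s in candidates]
        best[end], last[end] = min(scored)
        # Pruning step: drop starts that can no longer be part of an optimum.
        candidates = [s for score, s in scored if score - penalty <= best[end]]
        newest = end - min_size + 1  # start that becomes admissible next round
        if newest >= min_size:
            candidates.append(newest)

    changepoints = []
    end = n
    while last[end] > 0:          # walk back through the recorded segment starts
        end = last[end]
        changepoints.append(end)
    return sorted(changepoints)


def call_kataegis(positions, imd_cutoff=1000, min_variants=6):
    """Label segments with a low mean intermutation distance as kataegis foci.

    Returns tuples (start_position, end_position, n_variants, mean_imd).
    Both thresholds are illustrative assumptions.
    """
    ordered, imd = intermutation_distances(positions)
    if not imd:
        return []
    penalty = 2.0 * math.log(len(imd))  # assumed BIC-style penalty
    bounds = [0] + pelt_changepoints(imd, penalty) + [len(imd)]
    foci = []
    for start, end in zip(bounds, bounds[1:]):
        mean_imd = (ordered[end] - ordered[start]) / (end - start)
        n_variants = end - start + 1  # k distances span k + 1 variants
        if mean_imd <= imd_cutoff and n_variants >= min_variants:
            foci.append((ordered[start], ordered[end], n_variants, mean_imd))
    return foci
```

Calling call_kataegis on a list of positions in which a handful of variants lie about 100 bp apart, surrounded by variants about 100 kb apart, should return one tuple covering the dense stretch. Working on segment means rather than on individual distances is what lets a changepoint approach adapt to the sample's background mutation rate.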
Provider:
GigaScience Database
Created:
2023-08-29