Supporting data for "PhaseME: automatic rapid assessment of phasing quality and phasing improvement"
Source: DataCite Commons. Updated 2025-05-26; indexed 2025-04-15.
Download link:
http://gigadb.org/dataset/100768
Description:
Detecting which mutations occur on the same DNA molecule is essential to predict their consequences. This can be achieved by phasing the genomic variations. Nevertheless, state-of-the-art haplotype phasing is currently a black box in which the accuracy and quality of the reconstructed haplotypes are hard to assess.

Here we present PhaseME, a versatile method that provides insights into, and improves, sample phasing results based on linkage data. We showcase the performance and importance of PhaseME by comparing phasing information obtained from Pacific Biosciences (PacBio), including both CLR (continuous long reads) and HiFi (high-quality consensus reads), Oxford Nanopore Technologies (ONT), and 10x Genomics sequencing technologies. We found that 10x Genomics and ONT phasing can be significantly improved while retaining a high N50 and completeness. PhaseME generates reports and summary plots to provide insights into phasing performance and correctness. We observed unique phasing issues for each of the sequencing technologies, highlighting the necessity of quality assessments. PhaseME decreases the Hamming error rate significantly, by 26.2% on average across all four technologies. Additionally, the number of long switch errors improves significantly, especially for HiFi (54.6%), with only a 5% decrease in phase block N50.

PhaseME is a universal method that assesses phasing quality and accuracy and improves phasing quality using linkage information. The package is freely available on GitHub.
Provider:
GigaScience Database
Created:
2020-07-01



