Supporting data for "PhaseME: automatic rapid assessment of phasing quality and phasing improvement"

Source: DataCite Commons. Updated 2025-05-26; indexed 2025-04-15.
Download link:
http://gigadb.org/dataset/100768
Description:
Detecting which mutations occur on the same DNA molecule is essential for predicting their consequences. This can be achieved by phasing the genomic variations. Nevertheless, state-of-the-art haplotype phasing is currently a black box in which the accuracy and quality of the reconstructed haplotypes are hard to assess.

Here we present PhaseME, a versatile method that provides insights into, and improves, sample phasing results based on linkage data. We showcase the performance and importance of PhaseME by comparing phasing information obtained from Pacific Biosciences (PacBio), including both CLR (continuous long reads) and HiFi (high-quality consensus reads), Oxford Nanopore Technologies (ONT), and 10x Genomics sequencing technologies. We found that 10x Genomics and ONT phasing can be significantly improved while retaining a high N50 and completeness. PhaseME generates reports and summary plots to provide insights into phasing performance and correctness. We observed unique phasing issues for each of the sequencing technologies, highlighting the necessity of quality assessments. PhaseME decreases the Hamming error rate significantly, by 26.2% averaged across all four technologies. Additionally, a significant improvement is obtained in the number of long switch errors, especially for HiFi (54.6%), with only a 5% decrease in phase block N50.

PhaseME is a universal method that assesses phasing quality and accuracy and improves the quality of phasing using linkage information. The package is freely available on GitHub.
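The description reports results in terms of Hamming error rate and phase block N50. As a rough illustration of what those two metrics measure, here is a minimal Python sketch using their common textbook definitions. It is not taken from the PhaseME package; the function names and the 0/1 encoding of haplotype assignments are assumptions made for this example.

```python
from typing import Sequence


def hamming_error_rate(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of heterozygous sites in one phase block that disagree with the truth.

    Each sequence holds a 0/1 haplotype assignment per site (hypothetical encoding).
    Which haplotype is labelled 0 is arbitrary, so the block is also compared
    against the flipped truth and the smaller mismatch count is used.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must cover the same sites")
    n = len(predicted)
    if n == 0:
        return 0.0
    mismatches = sum(p != t for p, t in zip(predicted, truth))
    return min(mismatches, n - mismatches) / n


def n50(block_lengths: Sequence[int]) -> int:
    """Length L such that blocks of length >= L cover at least half the total length."""
    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # no blocks given
```

For example, a block with five sites and one wrongly assigned site has a Hamming error rate of 0.2, and phase blocks of lengths 10, 20, 30 and 40 have an N50 of 30.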
Provider:
GigaScience Database
Created:
2020-07-01