
Assessing High-density Y-SNP Panels for Paternal Haplogroup Assignment in Forensic Practice

Source: China Scientific Data. Updated 2026-03-30; indexed 2026-04-25.
Download link:
https://www.sciengine.com/AA/doi/10.3724/j.pibb.2025.0481
Resource description:
Objective: The accuracy of Y-chromosome haplogroup assignment is crucial for tracing paternal lineage in male samples. With the advancement of high-throughput sequencing technologies, high-density Y-SNP genotyping from whole-genome or array-based data has become a standard method for determining Y-chromosome haplogroups. This study systematically evaluated the performance of four commonly used high-density SNP genotyping systems (the Global Screening Array (GSA), the Chinese Genotyping Array (CGA), the Affymetrix array, and the 1240K capture panel) for haplogroup assignment, providing a reference for data comparison across different systems.
Methods: We extracted genotype data for the four Y-SNP panels from 30× whole-genome sequencing (WGS) data of 1,590 male samples from the 1000 Genomes Project. Additionally, GSA array genotype data were collected from 384 relative pairs (spanning 1st- to 12th-degree relationships) from 109 Chinese Han families. Haplogroup assignment was performed using Y-LineageTracker v1.3.0. We assessed the concordance and resolution of haplogroup assignments between the four Y-SNP panels and the WGS data, and evaluated consistency and resolution in both the 1000 Genomes Project samples and the samples from the 109 families collected in this study. Furthermore, the impact of varying numbers of Y-SNPs on haplogroup assignment was examined.
Results: The GSA and CGA panels demonstrated superior resolution and discrimination of haplogroup subclades compared with the other two panels. The haplogroup assignments from the GSA, CGA, and 1240K panels showed high concordance with WGS data, with consistency rates exceeding 88.70%, whereas the Affymetrix platform exhibited a significantly lower consistency rate of 61.89%.
Specifically, the GSA and CGA panels outperformed the other two panels in the assignment of haplogroups O-M175 and H-L901, achieving complete concordance (100%) for both haplogroups. In contrast, the Affymetrix panel erroneously assigned all individuals belonging to haplogroup O-M175 to haplogroup K2-M526. Its accuracy for haplogroup H-L901 was also exceedingly low, at merely 1.41%: 98.59% of H-L901 samples were misassigned, 1.41% to J-M304 and a predominant 97.18% to F-M89. For haplogroup R-M207, all four panels exhibited uniformly high consistency, with concordance values exceeding 94.00%. Notably, for haplogroup E-M96, the 1240K and Affymetrix panels outperformed the GSA and CGA panels in terms of concordance, the first case in which these two panels surpassed the GSA and CGA panels. Conversely, for haplogroups J-M304, Q-M242, and I-M170, all four panels showed relatively elevated misclassification rates, with the Affymetrix array demonstrating the poorest overall performance. None of the four panels showed any discordant haplogroup assignments among the familial relative pairs analyzed. A positive correlation was observed between the number of Y-SNPs (ranging from 1,000 to 10,000) and classification consistency; however, classification consistency plateaued when the number of Y-SNPs exceeded 10,000. Furthermore, a random sampling analysis conducted on the GSA and CGA panels showed that the haplogroup misclassification rate fluctuated negligibly across the Y-SNP range of 500 to 1,000. In contrast, classification consistency improved markedly as the number of markers increased from 1,000 to 5,000, ultimately reaching a plateau within the interval of 5,000 to 8,000 markers.
Conclusion: These findings indicate that the GSA and CGA panels provide high resolution and concordance, delivering reliable Y-haplogroup assignment for forensic investigations.
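The consistency rates reported above compare each panel's haplogroup call with the WGS call for the same sample. The abstract does not state the exact matching rule, so the following is only a minimal sketch that assumes concordance means an exact match of the haplogroup label. The function names, sample IDs, and toy calls are hypothetical and are not taken from the dataset or from Y-LineageTracker.

```python
from collections import defaultdict


def overall_concordance(wgs_calls, panel_calls):
    """Fraction of shared samples whose panel call equals the WGS call.

    Assumption: concordance is an exact label match; the study may use
    a different rule (for example, comparison at a fixed tree depth).
    """
    shared = [s for s in wgs_calls if s in panel_calls]
    if not shared:
        return float("nan")
    hits = sum(wgs_calls[s] == panel_calls[s] for s in shared)
    return hits / len(shared)


def concordance_by_haplogroup(wgs_calls, panel_calls):
    """Per-haplogroup concordance, grouped by the WGS (reference) call."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for sample, wgs_hg in wgs_calls.items():
        panel_hg = panel_calls.get(sample)
        if panel_hg is None:
            continue  # sample not genotyped on this panel
        totals[wgs_hg] += 1
        if panel_hg == wgs_hg:
            matches[wgs_hg] += 1
    return {hg: matches[hg] / totals[hg] for hg in totals}


# Hypothetical toy calls, for illustration only.
wgs = {"s1": "O-M175", "s2": "O-M175", "s3": "H-L901", "s4": "R-M207"}
panel = {"s1": "K2-M526", "s2": "K2-M526", "s3": "F-M89", "s4": "R-M207"}

print(overall_concordance(wgs, panel))        # 0.25
print(concordance_by_haplogroup(wgs, panel))
```

Grouping by the WGS call makes it easy to see which reference haplogroups a panel tends to miss, which is the kind of per-haplogroup breakdown the Results describe.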
Created:
2026-03-30