Comparative evaluation of genotyping technologies for investigative genetic genealogy in sexual assault casework

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.g1jwstr04
Resource description:
Investigative Genetic Genealogy (IGG) can identify investigative leads when CODIS searching is unproductive, and it can provide time-efficient methods for removing perpetrators of serial violent crimes, such as rape and murder, from the community, thereby increasing public safety. However, use of IGG has preceded the establishment of best practices. The 2021 TWG operational requirements identified the need for further development, assessment, and evaluation of IGG testing procedures for use by crime labs. This study will support the TWG requirements by assessing the ability of genotyping technologies to develop useful profiles from low-template and degraded sexual assault samples for genealogical searching in law-enforcement-accessible Direct-to-Consumer (DTC) genealogical databases and to support rapid, accurate, and efficient identification of the samples’ source.

In Phase I, genotyping by Illumina Global Screening Array BeadChip, WGS on NovaSeq 6000, and targeted sequencing with the Verogen ForenSeq Kintelligence Kit will be compared for sensitivity to low-level DNA input concentrations and specificity for artificially degraded DNA using whole semen and nascent semen DNA samples. The high-density SNP genotype profiles will be compared against databased genotypes to determine the maximum distance at which known or potential genealogical associations can be identified. In Phase II, the limitations will be further tested by generating a mock case scenario with laboratory-created challenging samples that exhibit both low-level concentration and DNA degradation, using a known donor for whom verified family members of known relationship distance greater than 5th degree (first cousin once removed) are present in DTC databases.
After genotyping mock samples with the methods most applicable to each sample’s particular characteristics, as determined by Phase I evaluations, a full genealogical investigative workflow conforming to the Genealogical Proof Standard will be applied to demonstrate whether increasingly distant relatives can be identified and at what distance identification is no longer possible. Dissemination of the results of this study will provide the community with much-needed systematic analyses and direct comparisons of available technologies and allow practitioners to make more informed decisions when working with limited resources. Results may assist in developing lab-specific criteria for processing irreplaceable DNA evidence samples with IGG and in developing new, more efficient genealogical workflows. Study results will be disseminated to the forensic community through publications in peer-reviewed journals and presentations at scientific meetings.

Methods
DNA extracts were genotyped by Illumina Global Screening Array BeadChip, whole genome sequencing (WGS) on NovaSeq 6000, and targeted sequencing with the Qiagen/Verogen ForenSeq Kintelligence Kit to evaluate the technology-specific sensitivity to low-level DNA input concentrations and specificity for artificially degraded DNA using whole semen and nascent semen DNA samples. To generate genotype call metrics using Qiagen ForenSeq Kintelligence, sequence libraries were prepared following the manufacturer’s recommended protocol for library preparation, library prep quality control, and MiSeq FGx sequencing. Raw sequence processing and genotype calling were performed in Qiagen/Verogen Universal Analysis Software with a 1.5% (10X coverage) analytical threshold. To generate genotype call metrics using genome sequencing, dsDNA libraries were prepared using an internally optimized, proprietary protocol, pooled, and sequenced on the Illumina NovaSeq 6000 in 2x150 reads to a target depth of 30X coverage.
Read quality filtering, mapping, alignment to the hg38_hs38DH reference genome, and allele calling were performed with DRAGEN™ (Illumina, San Diego, CA) v07.021.609.3.9.3, kernel release 3.10.0-1160.42.2.el7.x86_64. Customized scripts were used to parse whole genome allele calls to a target set of 2,061,275 SNPs and to format genotypes for GEDmatch upload. To generate genotype call metrics using Illumina Global Screening Array v2, an internally optimized workflow was followed at Gene by Gene (Houston, TX) to hybridize DNA extracts to a custom-built GSAv2 BeadChip. Scanning was performed on the Illumina iScan. Genotype calling was performed in the GenomeStudio® v2.0 Genotyping Module. Customized Excel workbooks and an SQL database were created to compare genotypes across technologies, determine concordant genotypes, calculate call rates, and determine heterozygosity of autosomal loci. Technology-specific processing quality metrics were compiled for all samples, and SNP genotyping results were evaluated for call rate and concordance to the known reference. GEDmatch PRO sample metrics were generated by uploading formatted genotype files through the GEDmatch PRO portal. GSAv2 genotype files and genome sequencing genotype files were matched to the database using the One-to-Many Segment Based algorithm. Kintelligence genotype files were matched to the database using the One-to-Many Kinship algorithm.

Usage notes: Data quality metrics and SNP genotyping metrics are compiled in .xlsx or .csv tables for each of the three technologies. Sensitivity and degraded sample results are provided in separate files per technology. Neither SNP genotypes nor sequence data are provided, in order to maintain donor privacy.
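As an illustration of the call rate, concordance, and heterozygosity metrics named in the Methods, the following is a minimal Python sketch. It is not the authors' workflow, which used customized Excel workbooks and an SQL database. The function names, the two-letter genotype strings, and the no-call conventions are all assumptions made for this example, and the input is assumed to be already restricted to autosomal SNPs.

```python
from typing import Dict, Iterable, Optional

# Hypothetical representation: SNP ID -> two-letter genotype such as "AG".
# None, "--", or any genotype containing "N" is treated as a no-call (assumption).
Genotypes = Dict[str, Optional[str]]


def normalize(gt: Optional[str]) -> Optional[str]:
    """Return the genotype with alleles sorted ("GA" -> "AG"), or None for a no-call."""
    if gt is None:
        return None
    gt = gt.upper()
    if len(gt) != 2 or "-" in gt or "N" in gt:
        return None
    return "".join(sorted(gt))


def call_rate(sample: Genotypes, target_snps: Iterable[str]) -> float:
    """Fraction of target SNPs that received a genotype call."""
    targets = list(target_snps)
    called = sum(1 for snp in targets if normalize(sample.get(snp)) is not None)
    return called / len(targets) if targets else float("nan")


def concordance(sample: Genotypes, reference: Genotypes) -> float:
    """Fraction of SNPs called in both profiles whose genotypes agree."""
    compared = 0
    matching = 0
    for snp, ref_gt in reference.items():
        s, r = normalize(sample.get(snp)), normalize(ref_gt)
        if s is None or r is None:
            continue  # skip SNPs missing a call in either profile
        compared += 1
        if s == r:
            matching += 1
    return matching / compared if compared else float("nan")


def heterozygosity(sample: Genotypes) -> float:
    """Fraction of called genotypes that are heterozygous."""
    called = [g for g in (normalize(v) for v in sample.values()) if g is not None]
    if not called:
        return float("nan")
    return sum(1 for g in called if g[0] != g[1]) / len(called)


if __name__ == "__main__":
    # Made-up toy data, not drawn from the dataset.
    reference = {"rs1": "AG", "rs2": "CC", "rs3": "TT", "rs4": "AC"}
    sample = {"rs1": "GA", "rs2": "CC", "rs3": None, "rs4": "AA"}
    print(call_rate(sample, reference))
    print(concordance(sample, reference))
    print(heterozygosity(sample))
```

Concordance here is computed only over SNPs called in both profiles, so a low call rate does not by itself lower concordance; whether the study used the same denominator is not stated in the description.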
Created:
2024-11-12