Local LD index in HSV-1 genome alignments: 7 CSF samples, 11 SWAB samples and 18 combined samples.

Source: DataCite Commons. Updated 2022-04-27; indexed 2024-07-27.
Download link:
https://figshare.com/articles/dataset/Local_LD_index_in_HSV-1_genome_alignments_7_CSF_samples_11_SWAB_samples_and_18_combined_samples_/8088635/1
Description:
To screen for potential localised genomic recombination, we performed a site-by-site LD analysis within 2000 bp and 3000 bp sliding windows using the "genome-wide_LD_scan.r" script from the "genomescans" suite. This scan uses Fisher's exact tests to test all pairwise associations between the polymorphism patterns of a fixed number (20) of evenly spaced biallelic sites within a window; windows containing fewer than this number of SNPs were excluded. To identify windows with stronger LD than the genome-wide average, it then uses a Mann-Whitney-Wilcoxon test to compare the distribution of pairwise p-values within the window to the distribution obtained by combining all possible windows in the genome, and reports the -log10 p-value of this test as the window's score, the local LD index (LDI; only scores > 5 were considered significant). This was performed on an alignment of genomes from all samples in this study, as well as on alignments of genomes from the CSF and Swab samples only, treating them as separate populations. To verify that the estimates of linkage in these data subsets reflected a group-specific population structure and not a bias induced by the difference in dataset sizes, we resampled the genomic datasets, drawing 30 pseudorandom combinations of genomes the size of each dataset, sampling equally from the CSF and Swab groups. This gave a baseline distribution of local LD at each genome site under the hypothesis of no group-specific population structure; for group-specific subset analyses, only LDI scores falling outside the 95% confidence interval of this simulated distribution were deemed significant. To compare the strength of linkage estimated from the CSF and Swab samples, we normalised the r² values for the CSF and Swab datasets by dividing the r² at each locus by the median r² for the opposite dataset.
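The window scoring described above (pairwise Fisher's exact tests, then a Mann-Whitney-Wilcoxon comparison against the genome-wide p-value distribution) can be illustrated with a minimal Python sketch. This is not the "genome-wide_LD_scan.r" script: all function names are hypothetical, sites are assumed to be coded as 0/1 allele lists, "evenly spaced" is assumed to mean evenly spaced by SNP index, and the rank test is assumed to be one-sided and uses a normal approximation without tie or continuity correction.

```python
import math
from itertools import combinations

def pick_evenly_spaced(sites, k=20):
    """Pick k sites evenly spaced by index; None if the window has fewer than k SNPs."""
    if len(sites) < k:
        return None  # window excluded, as in the description
    step = (len(sites) - 1) / (k - 1)
    return [sites[round(i * step)] for i in range(k)]

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = math.comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def pairwise_pvalues(sites):
    """Fisher p-values for every pair of biallelic sites (each a list of 0/1 alleles)."""
    pvals = []
    for s1, s2 in combinations(sites, 2):
        pairs = list(zip(s1, s2))
        a, b = pairs.count((0, 0)), pairs.count((0, 1))
        c, d = pairs.count((1, 0)), pairs.count((1, 1))
        pvals.append(fisher_exact_two_sided(a, b, c, d))
    return pvals

def mann_whitney_less(x, y):
    """One-sided p-value that values in x tend to be smaller than values in y."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j for a tie group
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 0.5 * math.erfc(-z / math.sqrt(2))  # standard normal CDF at z

def local_ld_index(window_pvals, genome_pvals):
    """LDI as -log10 of the rank-test p-value (floored to avoid log of zero)."""
    return -math.log10(max(mann_whitney_less(window_pvals, genome_pvals), 1e-300))
```

Under these assumptions, a window would be flagged when `local_ld_index` exceeds 5, with `genome_pvals` pooled from `pairwise_pvalues` over every retained window.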
We then computed the difference between normalised r² values to identify significant windows with a substantial difference in LD strength between CSF and Swab. Genomic windows (and the associated estimated values of r² and LDI) were assigned to a gene according to the reference position of the centre point of the window.
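The normalisation, difference and gene assignment steps can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the CSF-minus-Swab direction of the difference, the assumption that both r² lists are aligned to the same windows with non-zero medians, and the placeholder gene intervals are all hypothetical.

```python
from statistics import median

def normalised_r2_difference(r2_csf, r2_swab):
    """Per-window difference of r² values, each divided by the other dataset's median."""
    med_csf, med_swab = median(r2_csf), median(r2_swab)
    # Each dataset is scaled by the median r² of the opposite dataset, as described
    norm_csf = [v / med_swab for v in r2_csf]
    norm_swab = [v / med_csf for v in r2_swab]
    return [c - s for c, s in zip(norm_csf, norm_swab)]

def assign_gene(window_start, window_end, genes):
    """Return the gene whose reference interval contains the window centre, else None."""
    centre = (window_start + window_end) // 2
    for start, end, name in genes:
        if start <= centre <= end:
            return name
    return None

# Placeholder annotation: (start, end, name) in reference coordinates
genes = [(1, 1500, "geneA"), (1501, 4000, "geneB")]
diffs = normalised_r2_difference([0.2, 0.4, 0.6], [0.1, 0.2, 0.3])
gene = assign_gene(1000, 3000, genes)  # window centre at position 2000
```

Because a window is mapped through its centre point only, a window that straddles two genes is attributed to a single gene in this sketch.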
Provider:
figshare
Created:
2019-05-29