
Local LD index in HSV-1 genome alignments: 7 CSF samples, 11 SWAB samples and 18 combined samples.

Mendeley Data, updated 2024-01-31, indexed 2024-06-27
Download link:
https://figshare.com/articles/dataset/Local_LD_index_in_HSV-1_genome_alignments_7_CSF_samples_11_SWAB_samples_and_18_combined_samples_/8088635/1
Description:
To screen for potential localised genomic recombination, we performed a site-by-site linkage disequilibrium (LD) analysis within 2000 bp and 3000 bp sliding windows using the "genome-wide_LD_scan.r" script from the "genomescans" suite. Within each window, this scan tests all pairwise associations between the polymorphism patterns of a fixed number (20) of evenly spaced biallelic sites using Fisher's exact tests; windows containing fewer than this number of SNPs were excluded. To identify windows with stronger LD than the genome-wide average, it then compares the distribution of pairwise p-values within the window to the distribution obtained by combining all possible windows in the genome, using a Mann-Whitney-Wilcoxon test. The -log10 of the resulting p-value is reported as the window's score, the local LD index (LDI); only scores > 5 were considered significant. This was performed on an alignment of genomes from all samples in this study, as well as on alignments of genomes from the CSF and Swab samples only, treating them as separate populations.

To verify that the estimates of linkage in these data subsets reflected a group-specific population structure and not a bias induced by the difference in dataset sizes, we resampled the genomic datasets, drawing 30 pseudorandom combinations of genomes the size of each dataset, sampling equally from the CSF and Swab groups. This yielded a baseline distribution of local LD at each genome site under the hypothesis of no group-specific population structure; for group-specific subset analyses, only LDI scores falling outside the 95% confidence interval of this simulated distribution were deemed significant.

To compare the strength of linkage estimated from the CSF and Swab samples, we normalised the r2 values for the CSF and Swab datasets by dividing the r2 at each locus by the median r2 for the opposite dataset. We then computed the difference between the normalised r2 values to identify significant windows with a substantial difference in LD strength between CSF and Swab.
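The window-scoring step described above can be illustrated with a minimal Python sketch. It is not the "genome-wide_LD_scan.r" script itself: the function names, the 0/1 allele encoding, the one-sided direction of the rank test, and the normal approximation used for the Mann-Whitney-Wilcoxon p-value (with tie correction, without continuity correction) are all assumptions made for illustration.

```python
import math
from itertools import combinations

def evenly_spaced(items, k):
    """Pick k evenly spaced items; return None if fewer than k are available."""
    if k < 2 or len(items) < k:
        return None  # such windows are excluded from the scan
    last = len(items) - 1
    return [items[(i * last) // (k - 1)] for i in range(k)]

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at most as likely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

def pairwise_pvalues(sites):
    """sites: equal-length 0/1 allele vectors, one per biallelic site (assumed encoding)."""
    pvals = []
    for s, t in combinations(sites, 2):
        pairs = list(zip(s, t))
        a, b = pairs.count((0, 0)), pairs.count((0, 1))
        c, d = pairs.count((1, 0)), pairs.count((1, 1))
        pvals.append(fisher_exact_two_sided(a, b, c, d))
    return pvals

def mann_whitney_less(x, y):
    """One-sided p-value (normal approximation) that values in x tend to be smaller than in y."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    return 0.5 * math.erfc(-z / math.sqrt(2))  # standard normal CDF at z

def local_ld_index(window_pvals, genome_pvals):
    """LDI = -log10 of the rank-test p-value (window p-values vs genome-wide p-values)."""
    p = mann_whitney_less(window_pvals, genome_pvals)
    return -math.log10(max(p, 1e-300))  # floor avoids log10(0)
```

In this sketch a window would be scored by selecting 20 sites with `evenly_spaced`, passing their allele vectors to `pairwise_pvalues`, and comparing the result against the pooled genome-wide p-values with `local_ld_index`; scores above 5 would then be retained, as in the description.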
Genomic windows (and the associated estimated values of r2 and LDI) were assigned to a gene according to the reference position of the centre point of the window.
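The centre-point assignment can be sketched in a few lines of Python. The gene names and coordinates below are hypothetical placeholders, not HSV-1 annotations, and the centre convention (window start plus half the window size, in reference coordinates) is an assumption.

```python
def window_centre(start, size):
    """Reference position of the window's centre point (assumed convention)."""
    return start + size // 2

def assign_gene(centre, genes):
    """Return the name of the first gene whose [start, end] span contains centre, else None."""
    for name, gene_start, gene_end in genes:
        if gene_start <= centre <= gene_end:
            return name
    return None

# Hypothetical annotation table: (name, start, end) in reference coordinates.
genes = [("geneA", 1, 1500), ("geneB", 1501, 4000)]

for start in (1, 1001, 5000):
    centre = window_centre(start, 2000)
    print(start, centre, assign_gene(centre, genes))
```

Windows whose centre falls in an intergenic region return `None` here; how the original analysis handled such windows is not stated in the description.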
Created:
2024-01-31