Fitness landscapes of human microsatellites

DataONE, updated 2025-06-30, indexed 2025-07-19

Download link:

https://search.dataone.org/view/sha256:c83924830bb9e1c1215b15065bc6851f143ac38dfc7f828656af0e1d7ae623c0

Resource description:

Advances in DNA sequencing technology and computation now enable genome-wide scans for natural selection to be conducted on unprecedented scales. By examining patterns of sequence variation among individuals, biologists are identifying genes and variants that affect fitness. Despite this progress, most population genetic methods for characterizing selection assume that variants mutate in a simple manner and at a low rate. Because these assumptions are violated by repetitive sequences, selection remains uncharacterized for an appreciable percentage of the genome. To meet this challenge, we focus on microsatellites, repetitive variants that mutate orders of magnitude faster than single nucleotide variants, can harbor substantial variation, and are known to influence biological function in some cases. We introduce four general models of natural selection that are each characterized by just two parameters, are easily simulated, and are specifically designed for microsatellites. Using a rand...

Genotypes: Primers were developed by Prevention Genetics (Marshfield, WI). Prevention Genetics also performed electrophoretic determination of fragment lengths. We determined absolute repeat lengths of microsatellites through comparison of raw genotypes to those available on the 1000 Genomes website. See Supplementary Information for a detailed example of this translation from raw to absolute genotypes.

Code: Developed solely by the authors.

# Fitness landscapes of human microsatellites repository

[https://doi.org/10.5061/dryad.sbcc2frg6](https://doi.org/10.5061/dryad.sbcc2frg6)

## Description of repository contents and their use

This repository is associated with the article Haasl RJ and Payseur BA (2024) Fitness landscapes of human microsatellites. *PLoS Genetics*, 20:e1011524. The empirical data set includes genotype data for 200 humans originating from the following eight 1000 Genomes populations.

1000 Genomes populations included in the empirical data

* CEU: CEPH/Utah (n=25)
* CHB: Han Chinese in Beijing, China (n=27)
* FIN: Finnish in Finland (n=30)
* GIH: Gujarati Indian in Houston, TX (n=25)
* LWK: Luhya in Webuye, Kenya (n=25)
* MXL: Mexican ancestry in Los Angeles (n=18)
* TSI: Toscani in Italy (n=25)
* YRI: Yoruba in Ibadan (n=25)

Using a method described in Supplementary file *S1 Text* of the publication, we converted genotypes assayed as fragment lengths in units of base pairs (raw genotypes) into allele ...

We provide microsatellite genotype data for a subset of individuals in the 1000 Genomes Project, which were derived from DNA ordered from the Coriell NHGRI Sample Repository. The DNA we worked with was associated only with the 1000 Genomes Project ID of the individual from whom the DNA was derived. Data from the 1000 Genomes Project are fully de-identified. Each sample is referenced only by a non-identifiable sample ID (e.g., NA17171), and no personal, clinical, or geolocation data are associated. Only minimal, high-level population metadata are included (e.g., CEU, YRI). Furthermore, the 1000 Genomes Project was reviewed and approved by appropriate IRBs and ethics boards, with explicit measures to protect participant privacy and reduce re-identification risk. In addition, 1000 Genomes Project data are released into the public domain and are explicitly available without restriction or requirement for attribution, in alignment with the Fort Lauderdale and Toronto principles. This satis...
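The conversion from raw fragment lengths to absolute repeat counts is documented in *S1 Text* and is not reproduced in this description. Purely as an illustration of the general idea, the Python sketch below assumes a perfect repeat with a fixed motif length and a constant flanking length, calibrated from a single sample whose absolute repeat count is already known. The function names, the locus values, and the calibration scheme are all hypothetical and should not be read as the authors' procedure.

```python
def calibrate_offset(raw_bp, known_repeats, motif_len):
    """Flanking length (bp) implied by one calibration allele.

    Assumes raw_bp = flank + known_repeats * motif_len (hypothetical model).
    """
    return raw_bp - known_repeats * motif_len


def to_repeat_count(raw_bp, offset_bp, motif_len):
    """Convert a raw fragment length to a repeat count.

    Returns None when the repeat tract is not a whole number of motifs,
    which would flag the allele for manual inspection.
    """
    span = raw_bp - offset_bp
    if span < 0 or span % motif_len != 0:
        return None
    return span // motif_len


# Hypothetical tetranucleotide locus: a 212 bp fragment known to carry 15 repeats.
offset = calibrate_offset(raw_bp=212, known_repeats=15, motif_len=4)

# Hypothetical diploid raw genotype (two fragment lengths in bp).
raw_genotype = (204, 220)
repeats = tuple(to_repeat_count(a, offset, 4) for a in raw_genotype)
print(offset, repeats)
```

Returning `None` for lengths that do not fit the model keeps imperfect repeats and sizing errors visible instead of silently rounding them.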

Created:

2025-07-01
