House mouse Mus musculus dispersal in East Eurasia inferred from 98 newly determined complete mitochondrial genome sequences

Indexed by NIAID Data Ecosystem on 2026-03-12

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.zkh189384

Description:

The Eurasian house mouse Mus musculus is useful for tracing prehistoric human movement related to the spread of farming. We determined whole mitochondrial DNA (mtDNA) sequences (ca. 16,000 bp) of 98 wild-derived individuals of two subspecies, M. m. musculus (MUS) and M. m. castaneus (CAS). We revealed directional dispersals from their homelands reaching as far as the Japanese Archipelago. Our phylogenetic analysis indicated that the eastward movement of MUS was characterised by five step-wise regional extension events: 1) broad spatial expansion into eastern Europe and the western part of western China, 2) dispersal to the eastern part of western China, 3) dispersal to northern China, 4) dispersal to the Korean Peninsula and 5) colonisation and expansion in the Japanese Archipelago. These events were estimated to have occurred during the last 2,000–18,000 years. The dispersal of CAS was characterised by three events: initial divergences (ca. 7,000–9,000 years ago) of haplogroups in northernmost China and the eastern coast of India, followed by two population expansion events that likely originated from the Yangtze River basin, one to broad areas of South and Southeast Asia, including Sri Lanka, Bangladesh and Indonesia (ca. 4,000–6,000 years ago), and one to Yunnan, southern China and the Japanese Archipelago (ca. 2,000–3,500 years ago). This study provides a solid framework for the spatiotemporal movement of human-associated organisms in Holocene Eastern Eurasia using whole mtDNA sequences, reliable evolutionary rates and accurate branching patterns. The information obtained here contributes to the analysis of a variety of animals and plants associated with prehistoric human migration.

Methods

Materials

We used a total of 98 house mouse samples in this study. Most of our samples overlap with those used by Suzuki et al. (2013). See Li et al. (in press) for the localities where the samples were collected and for the sample codes.
DNA extraction and variant calling

We determined the whole mitogenome sequences of the 98 house mouse samples (ca. 16,000 bp). Our samples, along with their qualified concentrations and volumes, were sent to BGI (Shenzhen, China) for whole-genome sequencing. Libraries were constructed for each sample with index sequences, and paired-end reads of 100 bp were sequenced by BGI using the BGISEQ-500 platform. For each sample, ~1 billion clean reads were obtained. We mapped the raw reads to the GRCm38 (mm10) house mouse reference genome sequence, including the mitogenome, using the BWA-MEM method (Li and Durbin 2009) with the ‘-M’ command option. Samblaster (Faust and Hall 2014) (https://github.com/GregoryFaust/samblaster) with the ‘-M’ command option was used to identify duplicates in read-id groups for exclusion from downstream analysis. The average median coverage of the whole genome sequence was 30.4 per sample.

When reads were mapped simultaneously to the nuclear genome and the mitogenome, reads mapped to mitogenome regions that were highly similar or identical to regions of the nuclear genome, owing to nuclear mtDNA segments, yielded very low mapping quality (MQ) scores. To recalibrate the MQ scores, we remapped all mapped mitogenome reads to the C57BL6 complete mitogenome (NC_005089.1) using BWA-MEM and recalculated the MQ scores. Single-nucleotide polymorphisms and indels were obtained using the GATK4 HaplotypeCaller program (McKenna et al. 2010) following the 'Best Practice' pipeline instructions. Each gVCF file was merged using GenotypeGVCFs to simultaneously call the genotypes of all samples. To identify low-depth uncalled sites, we created a consensus sequence in FASTA format using bcftools consensus with the '-M' option to determine missing genotypes.
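
The final step, locating low-depth uncalled sites in a consensus FASTA, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that missing genotypes were written into the consensus as the character "N" (the description does not say which character was passed to '-M'), and the function names parse_fasta, missing_intervals and missing_fraction are hypothetical helpers introduced here for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA text lines into a {record name: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def missing_intervals(seq: str, missing: str = "N") -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs of the missing character.

    Assumption: uncalled sites are marked with `missing` ("N" by default).
    """
    marker = missing.upper()
    intervals: List[Tuple[int, int]] = []
    start = None
    for pos, base in enumerate(seq.upper(), start=1):
        if base == marker:
            if start is None:
                start = pos
        elif start is not None:
            intervals.append((start, pos - 1))
            start = None
    if start is not None:
        intervals.append((start, len(seq)))
    return intervals


def missing_fraction(seq: str, missing: str = "N") -> float:
    """Fraction of positions in the sequence that are marked as missing."""
    if not seq:
        return 0.0
    return seq.upper().count(missing.upper()) / len(seq)


if __name__ == "__main__":
    # Toy record, not real data: a 10-bp sequence split over two lines.
    toy = [">sample1 toy", "ACGTNN", "ACGN"]
    for sample, seq in parse_fasta(toy).items():
        print(sample, missing_intervals(seq), missing_fraction(seq))
```

Reporting intervals rather than single positions keeps the output short when a whole low-depth stretch is uncalled, and the per-sample fraction gives a quick way to flag samples whose mitogenome is too incomplete for phylogenetic analysis.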

Created:

2020-08-31