
Pairwise Distances of EGP Results

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/13854867
Description:
Data from the following EGP results on publicly available genomes were downloaded locally:

Public dataset: Zenodo DOI
Simons Genome Diversity Project: https://doi.org/10.5281/zenodo.13835663
Human Genome Diversity Project: https://doi.org/10.5281/zenodo.13835739
Gambian Genome Variation Project: https://doi.org/10.5281/zenodo.13839400
1000G 698: https://doi.org/10.5281/zenodo.13839442
1000G 2504: https://doi.org/10.5281/zenodo.13839949
GIAB Trio: https://doi.org/10.5281/zenodo.13839913

Analysis steps:

All files with the .mega suffix were concatenated into one file containing the 4,706 mitochondrial genomes generated by EGP across all the datasets. The sequences then underwent multiple sequence alignment using Clustal in MEGA version 11 (software available at https://www.megasoftware.net/). Pairwise distances were calculated using MEGA version 11. The distance table was exported from MEGA, and a custom Python script was used to convert the MEGA format to a pairwise table.

The pairwise table is available here as a .csv file. Please note that the file is large, with 11,070,866 lines: one header line plus (4,706 × 4,706 − 4,706) / 2 = 11,070,865 pairwise values. Self-self comparisons are absent, and the total corresponds to each pair of individuals being counted once.

The MD5 checksum for the data is:
MD5 (pairwise_alignments_EGP_09282024.tar.bz2) = 520d7510a076219082d7821b2ecbbb38
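
The checks described above can be sketched in Python. This is a minimal illustration, not the authors' custom script: the helper names (`expected_pair_count`, `md5_of_file`, `lower_triangle_to_pairs`) are hypothetical, and the lower-triangular input layout is an assumption about how a distance matrix might be represented after parsing, not a description of the MEGA export format. The stated total of 11,070,865 values matches n(n − 1)/2 for n = 4,706, i.e. each unordered pair counted once.

```python
import hashlib


def expected_pair_count(n: int) -> int:
    """Number of unordered pairs among n sequences, excluding self-self."""
    return n * (n - 1) // 2


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through MD5 so a large archive is never fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lower_triangle_to_pairs(names, rows):
    """Flatten a lower-triangular distance matrix into (a, b, distance) rows.

    Assumed layout (hypothetical): rows[i] holds the distances from
    names[i] to names[0], ..., names[i - 1], so rows[0] is empty.
    """
    pairs = []
    for i, row in enumerate(rows):
        for j, dist in enumerate(row):
            pairs.append((names[i], names[j], dist))
    return pairs
```

For example, `expected_pair_count(4706)` returns 11070865, which plus one header line gives the 11,070,866 lines reported. After downloading the archive, the result of `md5_of_file("pairwise_alignments_EGP_09282024.tar.bz2")` can be compared against the checksum listed above.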
Created:
2024-09-28