Inferring whole-genome histories in large population datasets: inferred tree sequences for Simons Genome Diversity Project

Mendeley Data | Updated: 2024-03-27 | Indexed: 2024-06-27

Download link:

https://zenodo.org/record/3052359

Description:

Tree sequences inferred for the SGDP autosomes using tsinfer version 0.1.4 and compressed using tszip. Tree sequences can be decompressed as follows:

    $ tsunzip sgdp_chr1.trees.tsz

Once decompressed, trees files can be loaded and processed using tskit.

    import tskit

    ts = tskit.load("sgdp_chr1.trees")
    # ts is an instance of tskit.TreeSequence
    print("Chromosome 1 contains {} trees".format(ts.num_trees))

Metadata associated with individuals and populations was derived from the original source and converted to JSON form. For example, to access individual metadata we can use:

    import tskit
    import json

    ts = tskit.load("sgdp_chr1.trees")
    ind = ts.individual(0)
    metadata_dict = json.loads(ind.metadata)

The metadata_dict variable will now contain all the metadata for the individual with ID 0 as a dictionary. Metadata associated with populations can be found in a similar way. Population IDs are associated with individuals via their constituent nodes. For example,

    pop_metadata = [json.loads(pop.metadata) for pop in ts.populations()]
    ind_node = ts.node(ind.nodes[0])
    ind_pop_metadata = pop_metadata[ind_node.population]

After this, the ind_pop_metadata variable will contain the population level metadata for individual ID 0. The full data pipeline used to generate these tree sequences and associated metadata is available on GitHub.
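The individual and population lookups described above can be bundled into one small helper. The following is a minimal sketch, not part of the published pipeline: the function names decode_metadata and individual_with_population are hypothetical, and the handling of already-decoded or empty metadata is an assumption added for robustness across tskit versions. It relies only on the individual(), node() and population() accessors of a tskit.TreeSequence.

```python
import json


def decode_metadata(raw):
    """Return metadata as a dict (hypothetical helper, not from the dataset).

    Accepts JSON-encoded bytes or str, an already-decoded dict, or an
    empty value (treated as "no metadata").
    """
    if isinstance(raw, dict):
        return raw
    if not raw:  # empty bytes or empty string
        return {}
    return json.loads(raw)


def individual_with_population(ts, individual_id):
    """Pair one individual's metadata with that of its population.

    `ts` is expected to behave like a tskit.TreeSequence.
    """
    ind = ts.individual(individual_id)
    # Population IDs are stored on nodes, so go through the first node.
    node = ts.node(ind.nodes[0])
    if node.population < 0:  # tskit uses -1 for "no population assigned"
        return decode_metadata(ind.metadata), {}
    pop = ts.population(node.population)
    return decode_metadata(ind.metadata), decode_metadata(pop.metadata)


# Usage sketch (requires tskit and a decompressed file):
#   import tskit
#   ts = tskit.load("sgdp_chr1.trees")
#   ind_meta, pop_meta = individual_with_population(ts, 0)
```

Because the helper only calls accessor methods on the object it is given, it can be exercised with a small stand-in object before running it against a full chromosome.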

Created:

2023-06-28