Inferring whole-genome histories in large population datasets: inferred tree sequences for Simons Genome Diversity Project
Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-27.
Download link:
https://zenodo.org/record/3052359
Description:
Tree sequences inferred for the SGDP autosomes using tsinfer version 0.1.4 and compressed using tszip. Tree sequences can be decompressed as follows:

    $ tsunzip sgdp_chr1.trees.tsz

Once decompressed, trees files can be loaded and processed using tskit.

    import tskit
    ts = tskit.load("sgdp_chr1.trees")
    # ts is an instance of tskit.TreeSequence
    print("Chromosome 1 contains {} trees".format(ts.num_trees))

Metadata associated with individuals and populations was derived from the original source and converted to JSON form. For example, to access individual metadata we can use:

    import tskit
    import json
    ts = tskit.load("sgdp_chr1.trees")
    ind = ts.individual(0)
    metadata_dict = json.loads(ind.metadata)

The metadata_dict variable will now contain all the metadata for the individual with ID 0 as a dictionary. Metadata associated with populations can be found in a similar way. Population IDs are associated with individuals via their constituent nodes. For example,

    pop_metadata = [json.loads(pop.metadata) for pop in ts.populations()]
    ind_node = ts.node(ind.nodes[0])
    ind_pop_metadata = pop_metadata[ind_node.population]

After this, the ind_pop_metadata variable will contain the population-level metadata for individual ID 0. The full data pipeline used to generate these tree sequences and associated metadata is available on GitHub.
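
The individual and population lookups above can be wrapped in a small helper. The following is a minimal sketch, not part of the published pipeline: the function names decode_metadata and individual_with_population are hypothetical. It assumes that metadata may arrive either as raw JSON bytes (as in these files) or as an already decoded dictionary, which newer tskit versions return when a file carries a metadata schema, and that the individual's first node has a population assigned.

```python
import json


def decode_metadata(raw):
    """Return metadata as a dict from JSON bytes/str, or pass a dict through."""
    if isinstance(raw, dict):
        # Already decoded (e.g. by a metadata schema); nothing to do.
        return raw
    if isinstance(raw, (bytes, bytearray)):
        raw = bytes(raw).decode("utf-8")
    if not raw:
        # Empty metadata would make json.loads raise, so map it to {}.
        return {}
    return json.loads(raw)


def individual_with_population(ts, individual_id):
    """Return (individual metadata, population metadata) for one individual.

    The population is found through the individual's first node, following
    the node -> population association described in the text.
    """
    ind = ts.individual(individual_id)
    node = ts.node(ind.nodes[0])
    pop = ts.population(node.population)
    return decode_metadata(ind.metadata), decode_metadata(pop.metadata)
```

With a decompressed file loaded via ts = tskit.load("sgdp_chr1.trees"), calling individual_with_population(ts, 0) should return the same two dictionaries as metadata_dict and ind_pop_metadata in the description.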
Created:
2023-06-28



