
Inferring whole-genome histories in large population datasets: inferred tree sequences for Simons Genome Diversity Project

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/3052358
Resource description:
Tree sequences inferred for the SGDP autosomes using tsinfer version 0.1.4 and compressed using tszip. Tree sequences can be decompressed as follows:

    $ tsunzip sgdp_chr1.trees.tsz

Once decompressed, trees files can be loaded and processed using tskit.

    import tskit

    ts = tskit.load("sgdp_chr1.trees")
    # ts is an instance of tskit.TreeSequence
    print("Chromosome 1 contains {} trees".format(ts.num_trees))

Metadata associated with individuals and populations was derived from the original source and converted to JSON form. For example, to access individual metadata we can use:

    import tskit
    import json

    ts = tskit.load("sgdp_chr1.trees")
    ind = ts.individual(0)
    metadata_dict = json.loads(ind.metadata)

The metadata_dict variable will now contain all the metadata for the individual with ID 0 as a dictionary. Metadata associated with populations can be found in a similar way. Population IDs are associated with individuals via their constituent nodes. For example,

    pop_metadata = [json.loads(pop.metadata) for pop in ts.populations()]
    ind_node = ts.node(ind.nodes[0])
    ind_pop_metadata = pop_metadata[ind_node.population]

After this, the ind_pop_metadata variable will contain the population level metadata for individual ID 0.

The full data pipeline used to generate these tree sequences and associated metadata is available on GitHub.
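The individual-to-population lookup described above can be wrapped in a small helper. The following is a minimal sketch, not part of the original record: the function names decode_metadata and individual_with_population are hypothetical, and the handling of already-decoded metadata, empty metadata, and nodes with no population is an assumption about cases the description does not cover. It relies only on the accessors used above (ts.individual, ts.node, node.population) plus ts.population.

```python
import json


def decode_metadata(raw):
    """Return metadata as a dict (hypothetical helper).

    Assumption: metadata is either JSON-encoded bytes/str, as in the
    description above, or has already been decoded to a dict.
    """
    if isinstance(raw, dict):
        return raw
    if not raw:  # empty bytes or empty string: nothing to decode
        return {}
    return json.loads(raw)


def individual_with_population(ts, individual_id):
    """Return (individual metadata, population metadata) for one individual.

    The population is found via the individual's first node, as in the
    description above. The second element is None when the individual has
    no nodes or its first node has no population assigned.
    """
    ind = ts.individual(individual_id)
    ind_meta = decode_metadata(ind.metadata)
    if len(ind.nodes) == 0:
        return ind_meta, None
    node = ts.node(ind.nodes[0])
    if node.population < 0:  # tskit uses -1 to mean "no population"
        return ind_meta, None
    pop_meta = decode_metadata(ts.population(node.population).metadata)
    return ind_meta, pop_meta


# Usage with a decompressed file (requires tskit and the data file):
#   import tskit
#   ts = tskit.load("sgdp_chr1.trees")
#   ind_meta, pop_meta = individual_with_population(ts, 0)
```

Looking up one population per call avoids decoding every population's metadata up front, which is convenient when only a few individuals are of interest.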
Created:
2020-01-24