
1000 Genomes phase 3, phased and annotated data for use in plink2.0 worked examples

Source: DataCite Commons. Updated 2025-05-26; indexed 2025-04-15.
Download link:
http://gigadb.org/dataset/100516
Description:
Lightly processed 1000 Genomes phase 3 dataset, directly derived from http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ . We have corrected a few errors in the official pedigree using the method described in Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM (2010) Robust relationship inference in genome-wide association studies. However, the most notable property of these data is their size: they are ~78% smaller without throwing away any relevant information. This has been achieved by using the latest pre-release version of PLINK (plink2), which can now act as a new form of genomic compression technology. The manuscript describing the new PLINK release is currently in preparation; these data are being made available pre-publication due to demand.

Please refer to the 1000 Genomes website for additional sample information, data usage rules, and citation instructions.
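The relationship-inference method cited above (Manichaikul et al. 2010) is available in plink2 through its KING-robust kinship table. The following is a minimal sketch of how such a run could be assembled from Python; it is not necessarily the procedure the dataset authors followed. The file prefix all_phase3, the output prefix, and the helper names are assumptions made for illustration, and the sketch assumes the downloaded files have already been decompressed into a plain .pgen/.pvar/.psam fileset.

```python
import shutil
import subprocess


def build_king_command(pfile_prefix, out_prefix, plink2="plink2"):
    """Return the argument list for a plink2 KING-robust kinship run.

    pfile_prefix and out_prefix are hypothetical names chosen by the caller.
    """
    return [
        plink2,
        "--pfile", pfile_prefix,  # reads <prefix>.pgen, .pvar and .psam
        "--make-king-table",      # writes a pairwise KING-robust kinship table
        "--out", out_prefix,
    ]


def run_if_available(cmd):
    """Run cmd only if its executable is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


# "all_phase3" is an assumed prefix; substitute the prefix of your local files.
cmd = build_king_command("all_phase3", "all_phase3_king")
print(" ".join(cmd))
```

Calling run_if_available(cmd) would execute the command when plink2 is installed. Pairs whose reported kinship disagrees with the pedigree in the .psam file are the ones worth inspecting by hand.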

Provider:
GigaScience Database
Created:
2018-10-05