Human Genomic Data for Complex Traits.

Name: Human Genomic Data for Complex Traits.
Creator: The Pawsey Supercomputing Centre
Published: 2022-05-11 00:29:39
License: No description available

Source: DataCite Commons. Updated 2022-05-11; indexed 2024-07-13.

Download link:

https://apply.pawsey.org.au/p/AX426

Description:

These data comprise next-generation sequence data, including whole-genome sequence data, exome data, and other genomic data files. The files range in size from 10 GB for a whole-exome data file up to 200 GB for a whole-genome file. The data are analysed by staff of the Centre for Genetic Origins of Health and Disease for both national and international collaborators. The collection contains or will contain genetic information for several nationally and internationally recognized studies based in Western Australia, including the Busselton Health Study, the Western Australian Pregnancy Study (Raine), and the Western Australia Family Study of Schizophrenia, as well as several others.

The research these genomic data support is mainly health related and seeks to identify genes related to complex disease (heart disease, cancer, etc.) and their associated risk factors (blood pressure, lipids, etc.). The major national and international studies we are currently working with include: 1) the Busselton Health Study, to identify genes associated with heart disease, cancer, etc.; 2) the Raine study, to identify genes associated with the development of complex diseases in an adolescent population; 3) the Western Australia Family Study of Schizophrenia, to identify genes and risk factors associated with this psychiatric disorder; 4) a study of Australian families with preeclampsia; 5) the Strong Heart Family Study, a genetic study of American Indians in the United States; and 6) the San Antonio Family Study, a genetic study of Mexican Americans in the United States.

We currently analyse raw genetic data on Epic, using makefiles to call a number of routines that align and calibrate the data and call genetic variants using supercomputing technology. The current runtime using up to 90 processors concurrently on Epic is 3 to 6 hours for a whole exome sequence and 15 to 18 hours for a whole genome sequence. We currently run these in batches of up to 50 samples for exome sequences and 5 samples for whole genome sequences. Based on current and pending funding initiatives, we anticipate analysing 200 exome samples and up to 200 whole genome samples within the next 12 months. The data form the basis for a number of future genetic studies and represent the raw data for the identification of genetic risk factors in complex disease.
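The figures quoted above imply a rough scale for the planned 12-month workload. The Python sketch below works through that arithmetic only; it is not part of the Centre's pipeline. The dictionary layout and the `plan` function are hypothetical names introduced for illustration. It assumes that every file is at the quoted size (10 GB per exome, 200 GB per genome, the latter being the stated upper limit), that batches are always filled to the stated maximum, and that each sample occupies all 90 processors for its full runtime, so the storage and core-hour values are upper bounds rather than measurements.

```python
import math

# Figures quoted in the description; the sizes are treated as per-sample values.
EXOME = {"file_gb": 10, "batch_size": 50, "hours": (3, 6), "planned_samples": 200}
GENOME = {"file_gb": 200, "batch_size": 5, "hours": (15, 18), "planned_samples": 200}
MAX_PROCESSORS = 90  # "up to 90 processors concurrently"


def plan(spec):
    """Return batch count, storage (decimal TB), and an upper bound on core-hours."""
    n = spec["planned_samples"]
    # Assumption: every batch is filled to the stated maximum size.
    batches = math.ceil(n / spec["batch_size"])
    # Assumption: every sample is at the quoted file size.
    storage_tb = n * spec["file_gb"] / 1000
    low_hours, high_hours = spec["hours"]
    # Assumption: each sample uses all processors for its whole runtime.
    core_hours = (n * low_hours * MAX_PROCESSORS, n * high_hours * MAX_PROCESSORS)
    return {"batches": batches, "storage_tb": storage_tb, "core_hours_upper": core_hours}


if __name__ == "__main__":
    for name, spec in (("exome", EXOME), ("whole genome", GENOME)):
        print(name, plan(spec))
```

Under these assumptions the planned exome work fits in 4 batches and about 2 TB, while the whole-genome work needs 40 batches and up to 40 TB, which suggests that the whole-genome samples dominate both storage and compute.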

Provider:

The Pawsey Supercomputing Centre

Created:

2022-05-05
