
Data for the Populus tremula v2.2 genome project and associated genome-wide association study

Source: figshare.scilifelab.se; updated 2024-08-22; indexed 2025-01-21
Download link:
https://figshare.scilifelab.se/articles/dataset/Data_for_the_i_Populus_tremula_i_v2_2_genome_project_and_associated_genome-wide_association_study/25335448/2
Description:
Background to the study: We have produced a chromosome-scale genome assembly, generated using long-read sequencing together with optical and high-density genetic maps, containing 39,894 annotated genes, with functional annotations for 73,765 transcripts in 37,184 genes. We conducted whole-genome resequencing of the Umeå Aspen (UmAsp) collection, comprising 227 aspen individuals. We used the assembly and existing whole-genome resequencing data to perform genome-wide association studies (GWAS) of leaf physiognomy phenotypes using single nucleotide polymorphisms (SNPs) in the UmAsp, Swedish Aspen (SwAsp) and Scottish Aspen (ScotAsp) collections. We conducted Assay for Transposase-Accessible Chromatin sequencing (ATAC-Seq), identified genomic regions of accessible chromatin, and subsetted SNPs to these regions, which improved the GWAS detection rate. We identified candidate long non-coding RNAs in leaf samples and quantified their expression in an updated co-expression network, which we used to explore the functions of candidate genes identified from the GWAS.

This data set comprises: the peaks from ATAC-Seq of aspen leaves, 'Aspen leaf ATAC_Seq peaks.zip'; and the gene expression matrix of mean values per aspen genotype from the SwAsp collection, 'Gene_Expression_matrix_genotype_mean.tsv'. For each collection we provide a zipped directory ('ScotAsp.zip', 'SwAsp.zip' and 'UmAsp.zip') containing the raw leaf image scans, the cropped leaf images, the raw data files from the LAMINA leaf shape analyses of these images, the processed data files, and the genotypic BLUP values. We provide the GWAS associations of SNPs ranked by decreasing P-value until the 1000th gene for each of the 26 leaf physiognomy traits for each collection, i.e. 'ScotAsp top-ranked GWAS results', 'SwAsp top-ranked GWAS results' and 'UmAsp top-ranked GWAS results'.

The single nucleotide polymorphism (SNP) data for each of the aspen collections are in 'ScotAsp_biallelic_Het.HWE.recode.vcf.gz', 'SwAsp_AfterBatchRemoval_biallelic_Het.HWE.recode.vcf.gz' and 'UmAsp_biallelic_Het.MAF.HWE.recode_.vcf.gz'.
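
The description mentions subsetting SNPs to regions of accessible chromatin. As a rough illustration of that idea only, and not the authors' pipeline, the Python sketch below keeps VCF records whose position falls inside a set of peak intervals. The function names (`build_index`, `in_peak`, `subset_vcf_lines`) are hypothetical, and the peak layout (chromosome, start, end in 0-based, half-open BED-style coordinates) is an assumption, since the description does not state the format of the files inside 'Aspen leaf ATAC_Seq peaks.zip'.

```python
import bisect
from collections import defaultdict


def build_index(peaks):
    """Group (chrom, start, end) peaks by chromosome and merge overlaps.

    Coordinates are treated as 0-based, half-open (BED-style). This is an
    assumption, not something the data set description states.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                # Overlapping or touching interval: extend the current one.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        # Parallel sorted lists allow a binary search per SNP.
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_peak(index, chrom, pos):
    """Return True if a 1-based VCF position lies inside a merged peak."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    zero_based = pos - 1
    i = bisect.bisect_right(starts, zero_based) - 1
    return i >= 0 and zero_based < ends[i]


def subset_vcf_lines(lines, index):
    """Yield header lines unchanged, plus records located inside a peak."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if in_peak(index, chrom, int(pos)):
            yield line
```

To apply this to one of the compressed VCF files, the lines could be read with `gzip.open(path, "rt")` from the standard library and passed to `subset_vcf_lines`. Established tools for interval intersection would normally be preferable for files of this size.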

Provided by:
SciLifeLab