Col-Can Genome Data

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://www.ncbi.nlm.nih.gov/sra/ERP161680

Description:

Expression across the proteome and transcriptome of a single plant tissue varies by over four orders of magnitude, despite most nuclear genes being encoded only once. The generally accepted explanation is that the observed expression of a given transcript reflects a specific balance between the rates of transcription and degradation, and that the expression of the corresponding protein is further modulated by a similar balance between translation and turnover. We set out to understand how much of the variation at these omic levels within a single tissue can be explained by simple genomic features when genetic and environmental variation is absent. Specifically, we investigated the impact of gene-body CpG methylation (gbM) and codon composition on these higher omic levels. We performed detailed genomic profiling of the two Arabidopsis thaliana accessions Col-0 and Can-0. We re-assembled their genomes using long reads, from which we also calculated constitutive gbM for each gene, and measured rosette gene and protein expression in multiple biological replicates grown under tightly controlled, unstressed conditions, which produced highly reproducible expression levels. We re-annotated the gene and transposon content of each assembled genome to produce accurate data across omic levels. We fitted models to explain variation in methylation, gene expression, and protein expression in terms of lower omic levels. We find that the single best predictor of any omic level in one accession is the corresponding level in the other accession, despite their 0.5% sequence divergence and the presence of many differentially expressed genes. Looking within an accession, the impact of constitutive gbM on either mRNA or protein expression can be almost entirely explained by variation in that gene's codon composition. Thus, absent environmental perturbation, gbM is determined by local sequence features.
We also find very similar impacts of any given codon on both mRNA and protein composition. These impacts are highly significant and unrelated to genome-wide codon abundance. About 44% of the variation in protein expression is explained by a combination of mRNA and sequence composition, each contributing distinct information. Codon composition alone explains 27% of the variation in protein expression and 44% of the variation in transcript expression. These statistics suggest that the features in these simple models are important, but by no means the only, drivers of expression variation. We also measured tRNA abundance in both accessions and found that expression levels were highly conserved between accessions, and that the aggregated expression of tRNAs for a given amino acid was strongly correlated with the abundance fraction of that amino acid among expressed genes.
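
The "variation explained" figures above are R² statistics from models that predict expression from sequence features. As a rough illustration only, the Python sketch below computes per-gene codon fractions and the R² of a one-predictor least-squares fit. It is a single-codon simplification, not the multi-feature models used in the study; the function names, the toy sequences, and the toy expression values are all hypothetical and are not drawn from this dataset.

```python
from collections import Counter
from itertools import product

# All 64 DNA codons, in a fixed order.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_fractions(cds):
    """Return the fraction of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_SET)  # skip ambiguous bases
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CODONS}


def fit_simple(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(observed, predicted):
    """Fraction of the variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical toy genes: (coding sequence, log expression). Not real data.
toy_genes = [
    ("ATGGCTGCTGCTTAA", 3.1),
    ("ATGGCTAAAGGGTAA", 2.0),
    ("ATGAAAGGGCCCTAA", 1.2),
    ("ATGGCTGCTAAATAA", 2.6),
]

gct = [codon_fractions(seq)["GCT"] for seq, _ in toy_genes]
expr = [value for _, value in toy_genes]
intercept, slope = fit_simple(gct, expr)
fitted = [intercept + slope * x for x in gct]
print("R^2 for GCT fraction on toy data:", r_squared(expr, fitted))
```

A model closer to the one described would regress expression on all codon fractions at once, and optionally on gbM and mRNA level, and report the R² of that joint fit.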

Created:

2024-12-03
