
Exploring the genotype-to-phenotype map using quantifiable patterns in metazoan genomic and morphological data

Source: DataONE | Updated: 2025-11-03 | Indexed: 2025-11-08
Download link:
https://search.dataone.org/view/sha256:fb3a4edc291359901f2cf688159157bc3128e6ab06a433faabfeddc6f5509d68
Resource description:
A prevailing problem in evolutionary biology is elucidating the “genotype-phenotype map” that characterizes how genomic activities regulate different aspects of organismal morphology and their variability in both space and time. Here, we explore potential causality between genome content and both morphological complexity and disparity by compiling the regulatory components (i.e., transcription factors, RNA-binding proteins, and microRNA families) as well as a representative set of non-regulatory “housekeeping genes” in 32 species belonging to a wide variety of animal phyla, altogether encapsulating a range of morphological, ecological and genomic characteristics. A principal component analysis of these four non-overlapping genomic components from each of these 32 species in relation to their last common ancestor revealed that no relationship exists between genome space and disparity, as changes to animal body plans appear to be largely the result of changes to the gene regulatory...

Materials and Methods

Gene Complements

The transcription factors for human were taken from Messina et al. (2004), and for the fruit fly D. melanogaster these sequences were taken from FlyBase (https://flybase.org/reports/FBgg0000745.htm) (Supp. File 1; all supplemental files can be found at https://doi.org/10.5061/dryad.7h44j1050). The regulatory RNA-binding proteins for human were taken from Gerstberger et al. (2014) and for fruit fly from Gamberi et al. (2006) (Supp. File 2). Regulatory RNA-binding proteins were distinguished from non-regulatory RNA-binding proteins by examining the “GO: Biological Function” for each gene and compiling those with clear regulatory roles in either co-transcriptional or post-transcriptional processing in either human or fruit fly (see Supp. File 2). Octopus transcription factors were annotated by searching the proteome with the hmmscan program from HMMER v3.3.2, using the --cut_ga option, against the list of transcription factor Pfam models (Mistry et al. 2020).
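The hmmscan step described above can be illustrated with a minimal sketch. It assumes the search was run with HMMER's `--tblout` option so that per-sequence hits are written as a whitespace-delimited table; the file names, protein identifiers, and sample rows are hypothetical and are not taken from the dataset. This is one possible way to collect proteins that hit a transcription factor Pfam model, not the authors' actual script.

```python
from typing import Dict, Set

# Assumed invocation (file names are hypothetical; the HMM database must
# first be prepared with hmmpress):
#   hmmscan --cut_ga --tblout hits.tbl tf_models.hmm proteome.fasta

# Invented sample rows in hmmscan --tblout layout, for illustration only.
SAMPLE = """\
# target name  accession   query name  accession  E-value  score  bias
Homeodomain    PF00046.00  prot_0001   -          1.2e-20  70.1   0.3
zf-C2H2        PF00096.00  prot_0002   -          3.4e-08  33.0   1.1
Pkinase        PF00069.00  prot_0003   -          5.0e-50  168.2  0.0
"""


def parse_hmmscan_tblout(text: str, tf_models: Set[str]) -> Dict[str, Set[str]]:
    """Map each query protein to the set of TF Pfam models it hit.

    In hmmscan --tblout output, column 1 is the target (Pfam model) name
    and column 3 is the query (protein) name; lines starting with '#'
    are comments.
    """
    hits: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed rows
        model, query = fields[0], fields[2]
        if model in tf_models:
            hits.setdefault(query, set()).add(model)
    return hits


if __name__ == "__main__":
    tf_hits = parse_hmmscan_tblout(SAMPLE, {"Homeodomain", "zf-C2H2"})
    for protein, models in sorted(tf_hits.items()):
        print(protein, sorted(models))
```

Because `--cut_ga` already applies each Pfam model's gathering threshold, the sketch does no additional score filtering and only restricts hits to the supplied list of model names.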
DNA repai...

# Exploring the genotype-to-phenotype map using quantifiable patterns in metazoan genomic and morphological data

Peterson, K. J., A. W. Clarke, G. Zolotarov, B. Deline, T. D. Lamb, M. A. McPeek, P. Martinez, and B. Fromm.

Access this dataset on Dryad: DOI:10.5061/dryad.7h44j1050

DATA & CODE OVERVIEW

This data repository (SequencesFilesDataAndScripts.zip) consists of 78 FASTA files of gene sequences (in Sequence Directory); 15 summary data files in .csv format (in Files directory); and Matlab, R, .csv and .tre files used to perform final analyses (in DataAndScripts directory). The contents of these files are described below.

WORKFLOW

Starting with the list of human genes as described in the Materials and Methods, each gene found in one of the other 31 taxa (sequences in FASTA files of Sequence Directory) was manually entered in a spreadsheet, with “0” indicating an absence and a positive integer indicating the total number of genes found in the indicated species (.csv files in Fi...
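The tabulation described in the workflow (0 for absence, a positive integer for the number of genes found) can be sketched as follows. The authors filled in their spreadsheets by hand; this sketch only shows the resulting data structure. The gene names, species labels, and the genes-as-rows layout are assumptions for illustration, since the excerpt does not describe the actual .csv layout.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def build_count_table(
    hits: Iterable[Tuple[str, str]],
    species: List[str],
    genes: List[str],
) -> Dict[str, Dict[str, int]]:
    """Count (species, gene) observations; absent combinations become 0."""
    counts = Counter(hits)
    return {g: {s: counts.get((s, g), 0) for s in species} for g in genes}


def to_csv(table: Dict[str, Dict[str, int]], species: List[str]) -> str:
    """Serialize the table with one row per gene and one column per species."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["gene"] + species)
    for gene, row in table.items():
        writer.writerow([gene] + [row[s] for s in species])
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical observations: one tuple per sequence found in a species.
    species = ["human", "fruit_fly", "octopus"]
    genes = ["GENE_A", "GENE_B"]
    hits = [
        ("human", "GENE_A"),
        ("fruit_fly", "GENE_A"),
        ("fruit_fly", "GENE_A"),
        ("human", "GENE_B"),
    ]
    print(to_csv(build_count_table(hits, species, genes), species))
```

A table in this shape (genes by species, integer counts) is the kind of input a principal component analysis such as the one mentioned in the abstract would operate on, although the exact preprocessing used by the authors is not given in this excerpt.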

Created:
2025-11-04