Agilent Barley Gene Expression Microarray Reannotation

NIAID Data Ecosystem, indexed 2026-03-08
Download link:
https://figshare.com/articles/dataset/Agilent_Barley_Gene_Expression_Microarray_Reannotation/987111
Description:
An attempt was made to reannotate as many sequences as possible in the Agilent Barley Gene Expression Microarray, 4x44K, based on the UniProt plant protein collection and the Gramene rice genome annotation. The full sequences of the array oligos, which were not provided with the array annotation, were downloaded from the Core and EST sections of the NCBI Nucleotide Archive, the TIGR Plant Transcript Assembly and the DFCI Gene Indices collection. Possible homologs were then searched for in the 2 February 2011 UniProt Knowledgebase Plant section and the Gramene rice genome assembly, version MSU6. The BLAST search was run with default parameters, the only modification being the use of query sequence filtering for low complexity regions. The BLAST results were then filtered: for each query the hit with the best score was chosen, provided it had an e-value lower than 1e-30 and an overall fraction of identical positions across the hit and query alignment of at least 50%. The description and Gene Ontology (GO) associations were reannotated for all array oligos for which a suitable homolog was found in either UniProt or the Gramene database. The file contains the full sequence collection from which the array probes were derived, the Molecular function, Biological process and Cellular component Gene Ontology annotations in BiNGO (http://apps.cytoscape.org/apps/bingo) format, and the annotation table with the UniProt and Gramene hits as a tab-delimited file.
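
The hit-filtering step described above can be illustrated with a minimal Python sketch. It is not the script used to build the dataset. It assumes the BLAST results are available in the standard 12-column tabular format (query, subject, percent identity, ..., e-value, bit score), treats the percent-identity column as the "fraction of identical positions", and applies the thresholds before picking the best-scoring hit. The function name `best_hits` and the constants are hypothetical.

```python
import csv
import io

# Thresholds taken from the description; the names are illustrative only.
EVALUE_MAX = 1e-30          # hits must have an e-value strictly lower than this
MIN_IDENTITY_PCT = 50.0     # hits must have at least this percent identity


def best_hits(tabular_text):
    """Return {query_id: subject_id} for the best-scoring hit that passes both filters.

    Assumes 12-column tab-separated BLAST output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        pident, evalue, bitscore = float(row[2]), float(row[10]), float(row[11])
        if evalue >= EVALUE_MAX or pident < MIN_IDENTITY_PCT:
            continue  # hit fails the e-value or identity requirement
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}
```

Oligos with no passing hit are simply absent from the returned mapping, which corresponds to probes that could not be reannotated. If the original procedure selected the top-scoring hit first and only then applied the thresholds, results could differ for queries whose top hit fails the identity requirement.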
Created:
2014-04-04