Oligonucleotide annotations from the Agilent Songbird Oligonucleotide Array V2: data file 6

Name: Oligonucleotide annotations from the Agilent Songbird Oligonucleotide Array V2: data file 6
Creator: figshare
Published: 2020-08-29 21:43:48
License: No description available

DataCite Commons. Updated: 2020-08-29. Indexed: 2024-07-27.

Download link:

https://springernature.figshare.com/articles/Oligonucleotide_annotations_from_the_Agilent_Songbird_Oligonucleotide_Array_V2_data_file_6/6189467

Description:

This dataset contains a single MS Excel .xlsx file documenting the curation of a large set of oligonucleotide (oligo) annotations from the Agilent Songbird Oligonucleotide Array V2. Tables 3-13 list the subsets of oligos curated at each step of the analysis, including oligos that were removed because they were deemed uninformative (Tables 3-6) and oligos that were subjected to a name verification and/or reannotation effort (Tables 8-13). The step-by-step curation of these oligos is described in the manuscript associated with this dataset. Oligos in the various categories are identified by the 'oligo IDs' assigned by Duke University. For oligos subject to name verification and/or reannotation, the file also records the most recent oligo consensus symbols and names applied in two studies referenced in the related paper, the chromosome and strand to which the oligo aligns, and, where relevant, the related HGNC ID, Gene Description, and HGNC Symbol Status.

Background: Zebra finches are a major model organism for investigating mechanisms of vocal learning, a trait that enables spoken language in humans. The development of EST/cDNA database collections and microarrays has allowed extensive molecular characterization of the vocal learning and production circuitry. However, poor curation of these databases can lead to errors in transcriptome and bioinformatics analyses, limiting the impact of these resources. Here we used genomic alignments and synteny analysis for orthology verification to curate and reannotate a large set of oligonucleotides and corresponding ESTs/cDNAs that make up Agilent microarrays for gene expression analysis in finches.

Provider:

figshare

Created:

2018-04-26
