C. elegans connectome gene expression data

Source: Figshare. Updated 2017-10-21; indexed 2026-04-29.
Download link:
https://figshare.com/articles/dataset/C_elegans_connectome_gene_expression_data/5450149
Description:
Data files required to run the analysis of C. elegans connectivity and gene expression reported in:

Arnatkevičiūtė, A.*, Fulcher, B. D.*, Pocock, R. & Fornito, A. Hub connectivity, neuronal diversity, and gene expression in the Caenorhabditis elegans connectome. PLoS Comp. Biol. 14, e1005989 (2018).

Matlab/Python code for processing these data files and reproducing our analyses is in the GitHub repository (link below).

Note that all data were retrieved from publicly available sources. Please check with each relevant source (described in our paper) for citation/license requirements.

anatomy_association.WS256.txt - gene expression annotations, where genes are assigned to individual neurons (direct assignment) or groups of neurons (indirect assignment). Data were downloaded from ftp://ftp.wormbase.org/pub/wormbase/releases/WS256/ONTOLOGY/anatomy_association.WS256.wb

Neuronal gene expression is recorded on WormBase as a binary indicator based on curated data collated from many individual experiments. Expression annotations are made either 'directly' to individual neurons (when an experiment indicates expression in an individual neuron), or 'indirectly' to broader classes of neurons such as 'interneuron' or 'head' (meaning that some members of that class express the gene). To maintain specificity of annotations, we analyzed only 'direct' annotations and excluded annotations labeled as 'uncertain' (see manuscript for details).

Celegans_positions.mat - positions of 279 neurons in 2D space, accompanied by neuron names. Data were downloaded from http://www.biological-networks.org/?page_id=25. Coordinates for three neurons (AIBL, AIYL, SMDVL) were missing from this dataset and were reconstructed by assigning them the coordinates of the corresponding contralateral neurons (AIBR, AIYR, SMDVR), following http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1001044

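The coordinate reconstruction described for Celegans_positions.mat can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name, the dictionary layout, and the coordinate values are all made up for the example, and reading the .mat file itself is left out because its internal variable names are not given here.

```python
def fill_from_contralateral(positions, partners):
    """Return a copy of `positions` in which each missing neuron
    receives the coordinates of its contralateral partner."""
    filled = dict(positions)
    for missing, partner in partners.items():
        if missing not in filled:
            filled[missing] = filled[partner]
    return filled


# Hypothetical 2D coordinates, for illustration only.
positions = {
    "AIBR": (0.10, 0.02),
    "AIYR": (0.12, 0.01),
    "SMDVR": (0.08, 0.03),
}

# Missing neuron -> contralateral neuron, as listed in the description.
partners = {"AIBL": "AIBR", "AIYL": "AIYR", "SMDVL": "SMDVR"}

filled = fill_from_contralateral(positions, partners)
```

Copying the partner's coordinates exactly means each left/right pair coincides in the 2D layout, so any distance computed between the two members of such a pair is zero.
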
celegans279_BT.mat - birth times in minutes for 279 neurons, accompanied by neuron names. Downloaded from the Dynamic Connectome Lab website (https://www.dynamic-connectome.org/?page_id=25).

CelegansEntrezID.txt - a list of genes with corresponding Entrez IDs and gene descriptions.

hierarchy.csv - anatomical hierarchy of neurons as defined in WormBase, which we retrieved using the WormBase API (using RetrieveHierarchy.py in the GitHub repo).

NeuronConnect.xls - neuronal connectivity data from Varshney et al. (http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1001066), obtained from WormAtlas (www.wormatlas.org/neuronalwiring.html#NeuronalconnectivityII). It includes connectivity data for 279 of the 282 somatic neurons, excluding CANL/R and VC6, which do not form synapses with other neurons.

NeuronConnect_corrected.csv - a copy of the original NeuronConnect.xls connectivity file with headings removed and line 1872 changed to uppercase to match the other neuron names. This file is then used to generate the data in Matlab format.

NeuronLineage_Part1.txt and NeuronLineage_Part2.txt - lineage distance for all pairs of neurons, derived from previously published embryonic and post-embryonic lineage trees using data downloaded from WormAtlas (http://www.wormatlas.org/neuronalwiring.html#Lineageanalysis). In this dataset, the closest common ancestor neuron was identified for each pair of neurons, and the lineage distance was then calculated as the number of cell divisions from the closest common progenitor neuron.

NeurotransmitterTypePereira.csv - neurotransmitter systems used by each neuron. Neurons were labeled by matching to Table 2 of Pereira et al., 2015 (https://www.ncbi.nlm.nih.gov/pubmed/26705699).

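To make the lineage distance in the NeuronLineage files concrete, the following Python sketch computes a distance of that kind from two lineage paths. It rests on assumptions that are not confirmed by the description: each neuron is represented by a hypothetical string of division steps from a shared root, the closest common ancestor is taken to be the longest shared prefix, and the distance is taken as the sum of the divisions separating each neuron from that ancestor. The files in this dataset already hold the computed distances, so this is illustration only.

```python
def lineage_distance(path_a, path_b):
    """Count the cell divisions separating two cells from their
    closest common ancestor, summed over both cells (assumed reading)."""
    shared = 0
    for step_a, step_b in zip(path_a, path_b):
        if step_a != step_b:
            break
        shared += 1
    # Divisions from the common ancestor down to each cell.
    return (len(path_a) - shared) + (len(path_b) - shared)


# Hypothetical division paths (one character per division).
cell_1 = "plpaa"
cell_2 = "plppa"

distance = lineage_distance(cell_1, cell_2)  # shared prefix "plp", so 2 + 2
```

Under this reading, two sister cells have a distance of 2, and a cell compared with itself has a distance of 0.
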
preComputePositiveMatch.mat - pre-computed values of the positive match measure, which quantifies similarity in gene expression between pairs of neurons by focusing only on instances where both neurons express a given gene and ignoring instances where neither neuron expresses it.
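
The exact formula behind the positive match measure is not given in this description (see the manuscript). As a rough illustration of a similarity score with the stated property, the Python sketch below uses the Jaccard index on binary expression vectors: it counts genes expressed in both neurons and ignores genes expressed in neither. The Jaccard index is a stand-in chosen for this example, and the function name and expression vectors are hypothetical.

```python
def joint_presence_similarity(expr_a, expr_b):
    """Jaccard index of two binary expression vectors: genes expressed
    in both neurons divided by genes expressed in at least one."""
    both = sum(1 for a, b in zip(expr_a, expr_b) if a and b)
    either = sum(1 for a, b in zip(expr_a, expr_b) if a or b)
    return both / either if either else 0.0


# Hypothetical binary expression profiles over five genes.
neuron_x = [1, 1, 0, 0, 1]
neuron_y = [1, 0, 0, 0, 1]

score = joint_presence_similarity(neuron_x, neuron_y)

# Adding genes that neither neuron expresses leaves the score unchanged.
padded_score = joint_presence_similarity(neuron_x + [0, 0], neuron_y + [0, 0])
```

Ignoring joint absences matters for sparse binary annotations like these, where most genes are not annotated to most neurons and shared zeros would otherwise dominate the score.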

Created: 2017-10-21