Dataset related to "Heterogeneous and higher-order cortical connectivity undergirds efficient, robust and reliable neural codes"

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/10812496


Description:

This dataset accompanies the article "Heterogeneous and higher-order cortical connectivity undergirds efficient, robust and reliable neural codes" (DOI: 10.1101/2024.03.15.585196). It contains structural and activity data related to the morphologically detailed model of the rat somatosensory cortex (Markram et al., 2015), referred to as "BBP" data in the article. Specifically, the following data items are included:

### Simulation data: simulation.xz

#### "Reliability" protocol

Separate folders BlobStimReliability_O1v5-SONATA_ with simulation data using the baseline and all manipulated connectomes, respectively (see Technical info below), each containing:

- working_dir/connectome.h5: Connectivity matrix in ConnectivityMatrix format, which can be loaded using ConnectomeUtilities.
- working_dir/raw_spikes_exc_.npy: Raw (excitatory) spikes in numpy .npy format, containing an array of spike times (first column) and corresponding neuron GIDs (second column). One file for each of the 10 simulations with different simulator seeds, indexed 0..9.
- working_dir/stim_stream.npy: Stimulus train in numpy .npy format, containing the sequence of stimulus identities.
- working_dir/time_windows.npy: Time windows in ms in numpy .npy format, corresponding to the stimulus train.
- working_dir/processed_data_store.h5: Data store in HDF5 format with preprocessed spike signals (e.g., required for Gaussian kernel reliability computations), which contains:
  - spike_signals_exc: Group of simulation datasets "sim_0" to "sim_9", each of which is an array of size <#gids x #t_bins> and contains binned spike signals filtered with a Gaussian kernel.
  - sigma: Sigma in ms of the Gaussian kernel used for smoothing.
  - gids: List of excitatory neuron GIDs.
  - t_bins: List of time bins in ms.
  - firing_rates: Average firing rates per simulation (0..9; rows) and (excitatory) neuron GID (columns); average firing rates were computed as the inverse of the mean inter-spike interval per neuron.

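
To make the relation between the raw spike arrays and the preprocessed entries (spike_signals_exc, firing_rates) more concrete, here is a minimal sketch, not the authors' preprocessing code. It assumes a two-column spike array as described above (spike time, then GID). The function names, the 4-sigma kernel truncation, the NaN convention for neurons with fewer than two spikes, and the toy data are all illustrative assumptions; on real data the array would come from `np.load` on one of the raw_spikes_exc_ files.

```python
import numpy as np


def firing_rates_from_spikes(spikes, gids):
    """Inverse of the mean inter-spike interval per GID (in 1 / time unit).

    spikes: (n_spikes, 2) array; column 0 = spike time, column 1 = neuron GID.
    Neurons with fewer than two spikes have no interval; returning NaN for
    them is an assumption of this sketch.
    """
    spikes = np.asarray(spikes, dtype=float)
    rates = np.full(len(gids), np.nan)
    for i, gid in enumerate(gids):
        times = np.sort(spikes[spikes[:, 1] == gid, 0])
        if times.size >= 2:
            rates[i] = 1.0 / np.mean(np.diff(times))
    return rates


def smoothed_spike_signals(spikes, gids, t_start, t_stop, bin_size, sigma):
    """Bin spikes per GID and convolve with a unit-area Gaussian kernel.

    Returns an array of shape (len(gids), n_bins) and the left bin edges.
    """
    spikes = np.asarray(spikes, dtype=float)
    n_bins = int(round((t_stop - t_start) / bin_size))
    edges = t_start + bin_size * np.arange(n_bins + 1)
    half_width = int(np.ceil(4 * sigma / bin_size))  # truncate kernel at 4 sigma
    offsets = bin_size * np.arange(-half_width, half_width + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    if kernel.size > n_bins:
        raise ValueError("time range too short for this kernel width")
    signals = np.empty((len(gids), n_bins))
    for i, gid in enumerate(gids):
        counts, _ = np.histogram(spikes[spikes[:, 1] == gid, 0], bins=edges)
        signals[i] = np.convolve(counts, kernel, mode="same")
    return signals, edges[:-1]


# Toy stand-in for a loaded raw spike array (times assumed to be in ms).
toy_spikes = np.array([
    [10.0, 1], [30.0, 1], [50.0, 1],  # GID 1: two intervals of 20
    [5.0, 2], [15.0, 2],              # GID 2: one interval of 10
    [42.0, 3],                        # GID 3: single spike, no interval
])
toy_gids = np.array([1, 2, 3])
toy_rates = firing_rates_from_spikes(toy_spikes, toy_gids)
toy_signals, toy_t_bins = smoothed_spike_signals(
    toy_spikes, toy_gids, t_start=0.0, t_stop=100.0, bin_size=1.0, sigma=2.0
)
```

If spike times are in ms, multiplying the returned rates by 1000 gives values in Hz. The smoothed array has one row per GID and one column per time bin, matching the <#gids x #t_bins> layout described for spike_signals_exc.
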
"Classification" protocol: Single folder Toposample_O1v5-SONATA with simulation data using the baseline connectome, stored in a format compatible with the TriDy (Conceição et al., 2022) and TopoSampling (Reimann et al., 2022) pipelines, containing: toposample_input/connectivity.npz: Sparse connectivity matrix in Compressed Sparse Column format , which can be loaded using scipy.sparse.load_npz. toposample_input/neuron_info.pickle: Pandas dataframe in pickle format, which can be loaded using pandas.read_pickle, containing additional information about each neuron. toposample_input/raw_spikes_exc.npy: Raw (excitatory) spikes in numpy .npy format, containing an array of spike times (first column) and corresponding neuron GIDs (second column). toposample_input/stim_stream.npy: Stimulus train in numpy .npy format, containing the sequence of stimulus identities. toposample_input/time_windows.npy: Time windows in ms in numpy .npy format, corresponding to the stimulus train. Classification data: classification.xz Selected neighborhoods and classification results for the "PCA" method (TopoSampling pipeline) as well as the "network_based" method using active subnetworks (TriDy pipeline, using TriDy-tools wrapper), stored as: "PCA" method: PCA/community_database_PCA.pkl: Pandas dataframe in pickle format, containing the binary selection of 50 neighborhood centers (neuron GIDs; rows) for the different selection parameters (columns). PCA/classification_results_PCA.pkl: Pandas dataframe in pickle format, containing the classification accuracies for all selection parameters (rows) and 6 cross-validation folds plus mean (columns). "network_based" method: network_based/selections_reliability.pkl: Pandas dataframe in pickle format, containing different combinations of first/second selection parameters for the double selection procedure (rows) together with neuron indices (w.r.t. the EXC subcircuit) of the corresponding 50 neighborhood centers (chief0..49; columns). 
- network_based/partition_reliability.npy: Partition indices in numpy .npy format required to launch the pipeline using TriDy-tools: an array of 50 neuron indices (w.r.t. the full circuit!) of the neighborhood centers (columns) for each combination of selections as in selections_reliability.pkl (rows).
- network_based/results/...: Subfolder containing a list of pickle files with the classification results using different featurization parameters, as indicated by the filename. Each file contains a Pandas dataframe with the classification accuracies and numbers of (non-zero) features (columns) for each combination of selections as in selections_reliability.pkl (rows).

### Funding

Funding provided by the Swiss government’s ETH Board to the Blue Brain Project, a research center of the École polytechnique fédérale de Lausanne (EPFL).
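
For the toposample_input files listed under the "Classification" protocol, the description names `scipy.sparse.load_npz` and `pandas.read_pickle` as loaders. The sketch below shows one way these calls could be combined, and it is only an illustration. The helper name `load_toposample_input`, the toy column `layer`, the use of GIDs as the dataframe index, and the expectation that the matrix has one row per dataframe row are assumptions that should be checked against the real files. To stay runnable without the dataset, it writes tiny stand-in files to a temporary folder first.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
import scipy.sparse as sp


def load_toposample_input(folder):
    """Load connectivity, neuron table and raw spikes from one folder."""
    folder = Path(folder)
    conn = sp.load_npz(str(folder / "connectivity.npz"))         # sparse matrix
    neuron_info = pd.read_pickle(folder / "neuron_info.pickle")  # per-neuron table
    spikes = np.load(folder / "raw_spikes_exc.npy")              # rows: (time, GID)
    return conn, neuron_info, spikes


with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir)

    # Hypothetical 3-neuron stand-ins written in the documented file formats.
    toy_conn = sp.csc_matrix(
        np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=np.int8)
    )
    sp.save_npz(str(root / "connectivity.npz"), toy_conn)
    pd.DataFrame({"layer": [2, 4, 5]}, index=[101, 102, 103]).to_pickle(
        root / "neuron_info.pickle"
    )
    np.save(
        root / "raw_spikes_exc.npy",
        np.array([[1.5, 101], [2.0, 103], [7.25, 101]]),
    )

    conn, neuron_info, spikes = load_toposample_input(root)

# Number of spikes per GID, taken from the second column of the spike array.
spike_counts = pd.Series(spikes[:, 1].astype(int)).value_counts()
```

On the real data, comparing `conn.shape` with `len(neuron_info)` is a quick sanity check before passing the files on to a downstream pipeline.
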


Created:

2024-11-25