UnitRefine - Curated datasets

Source: DataCite Commons. Updated: 2026-02-17. Indexed: 2026-02-09.
Download link:
https://figshare.com/articles/dataset/Curated_dataset/28282799
Description:
Curated electrophysiological datasets associated with the manuscript "UnitRefine: A Community Toolbox for Automated Spike Sorting Curation" by Jain et al. (bioRxiv: https://www.biorxiv.org/content/10.1101/2025.03.30.645770v2.full).

This repository contains datasets used for training and evaluation of the UnitRefine models described in the manuscript. For methodological details, cluster quality metrics, and model specifications, please refer to the manuscript, the UnitRefine GitHub repository (https://github.com/anoushkajain/UnitRefine), and the Hugging Face model repository (https://huggingface.co/AnoushkaJain3).

The file curated_base_dataset.zip contains a manually curated dataset across eleven recordings from five mice. Each recording was independently annotated by two to five human experts. Recordings from imec0 were performed in primary visual cortex, while recordings from imec1 were performed in somatosensory cortex. The file base_dataset.csv contains the combined cluster metrics and consensus human labels used to train multiple UnitRefine models. Subfolders contain the corresponding Kilosort 2.5 spike-sorting outputs for each recording.

The file Allen_dataset.zip contains a dataset labeled by two human curators from four recordings across three different mice. The file Allen_dataset.csv contains the combined cluster metrics and consensus human labels used to train the UnitRefine models. Subfolders contain the corresponding Kilosort 4 spike-sorting outputs for each recording, saved in .zarr format and generated using the Allen Institute ecephys spike-sorting pipeline (https://github.com/AllenInstitute/ecephys_spike_sorting).

The file IBL_dataset.zip contains a dataset labeled by one human curator from eight recordings across three different mice. The file IBL_dataset.csv contains the combined cluster metrics and human labels used to train the UnitRefine models. Subfolders contain the corresponding spike-sorting outputs for each recording, saved as SpikeInterface sorting analyzer objects and generated using the IBL spike-sorting pipeline (PyKilosort 2.5). These data were also used in the study by Pan-Vazquez et al. (2025; bioRxiv: https://www.biorxiv.org/content/10.1101/2025.11.04.685995v1).

The file mole_rat_dataset.zip contains a dataset labeled by a human curator from four recordings across four different mole rats. The file mole_rat_dataset.csv contains the combined cluster metrics and human labels used to train the UnitRefine models. Subfolders contain the corresponding Kilosort 4 spike-sorting outputs for each recording, saved as SpikeInterface sorting analyzer objects (https://spikeinterface.readthedocs.io/en/stable/tutorials/core/plot_4_sorting_analyzer.html). These data were also used in the study by Shirdhankar et al. (2025; bioRxiv: https://www.biorxiv.org/content/10.64898/2025.12.15.693140v1).

The file monkey_dataset.zip contains a dataset labeled by a human curator from eleven recordings across two nonhuman primates (rhesus macaques). The file monkey_dataset.csv contains the combined cluster metrics and human labels used to train the UnitRefine models. Subfolders contain additional information for different recording arrays, sorted with Kilosort 4; the corresponding raw data are available as part of the dataset published by Chen et al. (2022; https://www.nature.com/articles/s41597-022-01180-1).

The file human_dataset.csv contains the combined cluster metrics and human labels from twelve recordings obtained with Behnke–Fried electrodes in human patients. All recordings were spike-sorted using Combinato. These data were also used in the study by Gerken et al. (2025; https://elifesciences.org/reviewed-preprints/106758). Please contact the authors directly for access to the corresponding raw data.
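
Each of the CSV files described above combines cluster metrics with human labels, so a quick first check after downloading is to look at the column names and the label distribution. The following is a minimal sketch using only the Python standard library. The description does not state the column names, so the label column is passed in as a parameter, and the helper names `read_header` and `count_labels` are hypothetical; inspect the header of the actual file before choosing a column.

```python
import csv
from collections import Counter
from pathlib import Path


def read_header(csv_path):
    """Return the column names found in the first row of the CSV file."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


def count_labels(csv_path, label_column):
    """Count how often each value occurs in `label_column`.

    `label_column` is an assumption supplied by the caller; the dataset
    description does not document the actual column names.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if label_column not in (reader.fieldnames or []):
            raise KeyError(
                f"{label_column!r} not in columns: {reader.fieldnames}"
            )
        # The Counter consumes all rows before the file is closed.
        return Counter(row[label_column] for row in reader)
```

For example, after extracting one of the archives, calling `read_header("base_dataset.csv")` would list the available columns, and `count_labels("base_dataset.csv", <label column>)` would report how many units carry each label, which helps to spot class imbalance before training.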
Provider:
figshare
Created:
2025-01-26