UnitRefine - Curated datasets

Source: Figshare. Updated 2025-01-26. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Curated_dataset/28282799
Description:
Curated electrophysiological datasets associated with the manuscript “UnitRefine: A Community Toolbox for Automated Spike Sorting Curation” by Jain et al. (bioRxiv: https://www.biorxiv.org/content/10.1101/2025.03.30.645770v2.full).

This repository contains datasets used for training and evaluation of the UnitRefine models described in the manuscript. For methodological details, cluster quality metrics, and model specifications, please refer to the manuscript, the UnitRefine GitHub repository (https://github.com/anoushkajain/UnitRefine), and the Hugging Face model repository (https://huggingface.co/AnoushkaJain3).

The file curated_base_dataset.zip contains a manually curated dataset across eleven recordings from five mice. Each recording was independently annotated by two to five human experts. Recordings from imec0 were performed in primary visual cortex, while recordings from imec1 were performed in somatosensory cortex. The file base_dataset.csv contains the combined cluster metrics and consensus human labels used to train multiple UnitRefine models. Subfolders contain the corresponding Kilosort 2.5 spike-sorting outputs for each recording.

The file Allen_dataset.zip contains a dataset labeled by two human curators from four recordings across three different mice. The file Allen_dataset.csv contains the combined cluster metrics and consensus human labels used to train the UnitRefine models. Subfolders contain the corresponding Kilosort 4 spike-sorting outputs for each recording, saved in .zarr format and generated using the Allen Institute ecephys spike-sorting pipeline (https://github.com/AllenInstitute/ecephys_spike_sorting).

The file IBL_dataset.zip contains a dataset labeled by one human curator from eight recordings across three different mice. The file IBL_dataset.csv contains the combined cluster metrics and human labels used to train the UnitRefine models. Subfolders contain the corresponding spike-sorting outputs for each recording, saved as SpikeInterface sorting analyzer objects and generated using the IBL spike-sorting pipeline (PyKilosort 2.5). These data were also used in the study by Pan-Vazquez et al. (2025; bioRxiv: https://www.biorxiv.org/content/10.1101/2025.11.04.685995v1).

The file mole_rat_dataset.zip contains a dataset labeled by a human curator from four recordings across four different mole rats. The file mole_rat_dataset.csv contains the combined cluster metrics and human labels used to train the UnitRefine models. Subfolders contain the corresponding Kilosort 4 spike-sorting outputs for each recording, saved as SpikeInterface sorting analyzer objects (https://spikeinterface.readthedocs.io/en/stable/tutorials/core/plot_4_sorting_analyzer.html). These data were also used in the study by Shirdhankar et al. (2025; bioRxiv: https://www.biorxiv.org/content/10.64898/2025.12.15.693140v1).

The file monkey_dataset.zip contains a dataset labeled by a human curator from eleven recordings across two nonhuman primates (rhesus macaques). The file monkey_dataset.csv contains the combined cluster metrics and human labels used to train the UnitRefine models. Subfolders contain additional information for different recording arrays, sorted with Kilosort 4; the corresponding raw data are available as part of the dataset published by Chen et al. (2022; https://www.nature.com/articles/s41597-022-01180-1).

The file human_dataset.csv contains the combined cluster metrics and human labels from twelve recordings obtained with Behnke–Fried electrodes in human patients. All recordings were spike-sorted using Combinato. These data were also used in the study by Gerken et al. (2025; https://elifesciences.org/reviewed-preprints/106758). Please contact the authors directly for access to the corresponding raw data.
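As a quick sanity check after downloading, the label distribution in one of the metrics tables described above can be tallied with the Python standard library alone. The following is a minimal sketch, not part of the UnitRefine toolbox. It assumes that the CSV file sits somewhere inside the corresponding zip archive and that the labels live in a single named column. The function name `summarize_labels` and the column name `label` in the usage note are hypothetical, so check the actual CSV header before relying on them.

```python
import csv
import io
import zipfile
from collections import Counter


def summarize_labels(zip_path, csv_name, label_column):
    """Count the values of `label_column` in the CSV `csv_name` inside `zip_path`.

    Assumption: the CSV is stored inside the archive. The folder layout is
    not documented here, so the match is done on the file name only.
    """
    with zipfile.ZipFile(zip_path) as archive:
        matches = [n for n in archive.namelist() if n.split("/")[-1] == csv_name]
        if not matches:
            raise FileNotFoundError(f"{csv_name} not found in {zip_path}")
        with archive.open(matches[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text)
            # Fail early with a clear message if the guessed column is wrong.
            if label_column not in (reader.fieldnames or []):
                raise KeyError(
                    f"column {label_column!r} not in header {reader.fieldnames}"
                )
            return Counter(row[label_column] for row in reader)
```

A call such as `summarize_labels("curated_base_dataset.zip", "base_dataset.csv", "label")` would then return a `Counter` mapping each label value to the number of units carrying it, provided the column name matches the real header. If the CSV is distributed next to the archive rather than inside it, opening it directly with `csv.DictReader` works the same way.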
Created:
2025-01-26