
slaf-project/Parse-10M

Hugging Face · Updated 2026-01-30 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/slaf-project/Parse-10M
Resource description:
---
viewer: true
license: cc-by-nc-sa-4.0
configs:
- config_name: cells
  data_dir: "cells.lance"
- config_name: expression
  data_dir: "expression.lance"
- config_name: genes
  data_dir: "genes.lance"
language:
- en
tags:
- biology
- genomics
- PBMC
- RNA
- single-cell
- lance
- slaf
pretty_name: Parse-10M
---

# Parse 10M PBMC Dataset (SLAF Format)

## Attribution

**This is a re-release of data originally generated by [Parse Biosciences](https://www.parsebiosciences.com/).**

- **Original Dataset**: Parse 10M PBMC 12donor 90cytokines dataset
- **Original Format**: H5AD file
- **Original Source**: https://www.parsebiosciences.com/datasets/10-million-human-pbmcs-in-a-single-experiment/
- **This Release**: Same data in SLAF (Sparse Lazy Array Format) for SLAF tool compatibility
- **License**: CC-BY-NC-4.0 (Creative Commons Attribution-NonCommercial 4.0)

For detailed information about the dataset and methodology, please refer to the original source.

## About This Release

This release provides the Parse 10M PBMC dataset in SLAF format, enabling direct use with SLAF tools and libraries. The data is identical to the original release, just in a different storage format.

## Dataset Description

Parse 10M PBMC is a single-cell RNA sequencing dataset containing 10 million peripheral blood mononuclear cells (PBMCs) from 12 donors across 90 cytokine conditions. This release provides the same data in SLAF format for compatibility with SLAF tools.

## Usage

This dataset is in [SLAF (Sparse Lazy Array Format)](https://slaf-project.github.io/slaf/) format, which uses the [Lance](https://lance.org/) table format for storage. You can use it with Hugging Face Datasets (for Parquet access), the `slaf` library (for SLAF format), or `pylance` library (for direct Lance access).
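The three configs declared in the metadata above (`cells`, `expression`, `genes`) each point to a `.lance` directory inside the repository. As a minimal sketch, the location of each table could be derived from the repository path as shown here. The helper name `table_uri` and the constants are hypothetical and are not part of the `slaf` or `pylance` libraries; the block only builds strings and does not open the dataset.

```python
# Repository path as given in this dataset card.
HF_PATH = "hf://datasets/slaf-project/Parse-10M"

# Table names mirror the `config_name` entries in the card's metadata.
TABLES = ("cells", "expression", "genes")


def table_uri(name: str, base: str = HF_PATH) -> str:
    """Return the URI of one table's Lance directory (hypothetical helper)."""
    if name not in TABLES:
        raise ValueError(f"unknown table {name!r}; expected one of {TABLES}")
    # Each config's data_dir is "<name>.lance" at the repository root.
    return f"{base.rstrip('/')}/{name}.lance"


if __name__ == "__main__":
    for table in TABLES:
        print(table_uri(table))
```

A URI built this way has the same shape as the path passed to `lance.dataset(...)` in the "Using Lance Directly" section.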
### Using SLAF (Recommended for SLAF Format)

```bash
pip install slafdb
```

```python
from slaf import SLAFArray

# Hugging Face path to the dataset repository
hf_path = 'hf://datasets/slaf-project/Parse-10M'

# Open the SLAF array and preview five rows of the cells table
slaf_array = SLAFArray(hf_path)
slaf_array.query("SELECT * FROM cells LIMIT 5")
```

### Using Lance Directly

```bash
pip install pylance
```

```python
import lance

hf_path = 'hf://datasets/slaf-project/Parse-10M'

# Open the cells table and draw a random sample of 10 rows
ds = lance.dataset(f"{hf_path}/cells.lance")
ds.sample(10)
```

## Dataset Structure

The dataset contains single-cell RNA sequencing data from 10 million PBMC (Peripheral Blood Mononuclear Cell) samples across 12 donors with 90 cytokine conditions.

For more detailed information about the dataset structure and metadata, please refer to the original source documentation.

## Citation

If you use this dataset, please cite the original Parse Biosciences dataset and this re-release:

```bibtex
@dataset{parse_10m_pbmc_2024,
  title={Parse 10M PBMC 12donor 90cytokines Dataset},
  author={Parse Biosciences},
  year={2024},
  url={https://www.parsebiosciences.com/datasets/10-million-human-pbmcs-in-a-single-experiment/}
}
```
Provider:
slaf-project