slaf-project/X-Atlas-Orion
Source: Hugging Face. Updated 2026-01-30; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/slaf-project/X-Atlas-Orion
Resource description:
---
viewer: true
license: cc-by-nc-sa-4.0
configs:
- config_name: HEK293T-cells
  data_dir: "data/HEK293T/cells.lance"
- config_name: HEK293T-expression
  data_dir: "data/HEK293T/expression.lance"
- config_name: HEK293T-genes
  data_dir: "data/HEK293T/genes.lance"
- config_name: HCT116-cells
  data_dir: "data/HCT116/cells.lance"
- config_name: HCT116-expression
  data_dir: "data/HCT116/expression.lance"
- config_name: HCT116-genes
  data_dir: "data/HCT116/genes.lance"
language:
- en
tags:
- biology
- genomics
- RNA
- single-cell
- lance
- slaf
pretty_name: X-Atlas-Orion
---
# X-Atlas Orion Dataset (SLAF Format)
## Attribution
**This is a re-release of data originally generated by [Xaira Therapeutics](https://huggingface.co/Xaira-Therapeutics).**
- **Original Dataset**: [Xaira-Therapeutics/X-Atlas-Orion](https://huggingface.co/datasets/Xaira-Therapeutics/X-Atlas-Orion)
- **Original Format**: Parquet files
- **This Release**: Same data in SLAF (Sparse Lazy Array Format)
- **License**: CC-BY-NC-SA-4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0)
- **Original Citation**:
```
@article{huang2025xatlasorion,
title={X-Atlas/Orion: Genome-wide Perturb-seq Datasets via a Scalable Fix-Cryopreserve Platform for Training Dose-Dependent Biological Foundation Models},
author={Huang, Ann C and Hsieh, Tsung-Han S and Zhu, Jiang and Michuda, Jackson and Teng, Ashton and Kim, Soohong and Rumsey, Elizabeth M and Lam, Sharon K and Anigbogu, Ikenna and Wright, Philip and Ameen, Mohamed and You, Kwontae and Graves, Christopher J and Kim, Hyunsung John and Litterman, Adam J and Sit, Rene V and Blocker, Alex and Chu, Ci},
journal={bioRxiv},
year={2025},
url={https://www.biorxiv.org/content/10.1101/2025.06.11.659105v1}
}
```
For detailed information about the dataset, methodology, and original publication, please refer to the [original dataset repository](https://huggingface.co/datasets/Xaira-Therapeutics/X-Atlas-Orion).
## Dataset Description
X-Atlas/Orion is a Perturb-seq atlas containing two genome-wide Fix-Cryopreserve-ScRNAseq (FiCS) Perturb-seq screens that target all human protein-coding genes (n = 18,903 genes). The dataset comprises eight million HCT116 and HEK293T cells, each deeply sequenced to a median of 16,000 unique molecular identifiers (UMIs) per cell. The median on-target knockdown efficiency is 75.4% in HCT116 cells and 51.5% in HEK293T cells, with a median of at least 140 cells per perturbation. This release provides the same data in SLAF format for compatibility with SLAF tools. For more detailed information, see the [original dataset repository](https://huggingface.co/datasets/Xaira-Therapeutics/X-Atlas-Orion).
## Usage
This dataset is stored in [SLAF (Sparse Lazy Array Format)](https://slaf-project.github.io/slaf/), which uses the [Lance](https://lance.org/) table format for storage.
You can read it with the `slafdb` library (for SLAF-level access) or with the `pylance` library (for direct access to the underlying Lance tables).
### Using SLAF (Recommended for SLAF Format)
```bash
pip install slafdb
```
```python
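from slaf import SLAFArray
# Root of this dataset on the Hugging Face Hub
hf_path = 'hf://datasets/slaf-project/X-Atlas-Orion'
# Open the HCT116 screen as a SLAF array
slaf_hct116 = SLAFArray(f"{hf_path}/data/HCT116")
# Preview the first 10 rows of the cells table with SQL
slaf_hct116.query("SELECT * FROM cells LIMIT 10")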
```
### Using Lance Directly
```bash
pip install pylance
```
```python
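import lance
# Root of this dataset on the Hugging Face Hub
hf_path = 'hf://datasets/slaf-project/X-Atlas-Orion'
# Open the HCT116 cells table directly as a Lance dataset
lance_hct116_ds = lance.dataset(f"{hf_path}/data/HCT116/cells.lance")
# Draw 10 rows at random
lance_hct116_ds.sample(10)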
```
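The card metadata lists six configurations, one per cell line (`HEK293T`, `HCT116`) and table (`cells`, `expression`, `genes`), each stored under `data/<cell line>/<table>.lance`. The sketch below shows one way to build those table URIs programmatically. The helper names `lance_table_uri` and `all_table_uris` are hypothetical and are not part of `slafdb` or `pylance`; only the path layout is taken from the card.

```python
# Hypothetical helpers for illustration; not part of slafdb or pylance.
# The path layout (data/<cell_line>/<table>.lance) follows the card's configs.

HF_ROOT = "hf://datasets/slaf-project/X-Atlas-Orion"
CELL_LINES = ("HEK293T", "HCT116")
TABLES = ("cells", "expression", "genes")


def lance_table_uri(cell_line: str, table: str, root: str = HF_ROOT) -> str:
    """Return the URI of one Lance table for the given cell line."""
    if cell_line not in CELL_LINES:
        raise ValueError(f"unknown cell line {cell_line!r}; expected one of {CELL_LINES}")
    if table not in TABLES:
        raise ValueError(f"unknown table {table!r}; expected one of {TABLES}")
    return f"{root}/data/{cell_line}/{table}.lance"


def all_table_uris(root: str = HF_ROOT) -> dict:
    """Map config names such as 'HCT116-cells' to their Lance table URIs."""
    return {
        f"{cell_line}-{table}": lance_table_uri(cell_line, table, root)
        for cell_line in CELL_LINES
        for table in TABLES
    }
```

A URI returned by `lance_table_uri("HCT116", "cells")` is the same string passed to `lance.dataset(...)` in the example above, so the helper can be used to open the other five tables in the same way.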
Provider:
slaf-project



