
Supporting data for "The intersectional genetics landscape for human"

Source: DataCite Commons. Updated: 2025-05-26. Indexed: 2025-04-15.
Download link:
http://gigadb.org/dataset/100765
Description:
The human body is made up of hundreds, perhaps thousands, of cell types and states, most of which are currently inaccessible genetically. Genetic accessibility carries significant diagnostic and therapeutic potential by allowing the selective delivery of genetic messages or cures to cells. Research in model organisms has shown that the activity of a single regulatory element (RE) is seldom cell-type specific, which limits the use of single REs in genetic systems designed to restrict gene expression after delivery to cells. Intersectional genetic approaches can theoretically increase the number of genetically accessible cells, but the scope and safety of these approaches in humans have not been systematically assessed, primarily because suitably thorough RE activity databases and methods to explore them have been lacking. A typical intersectional method acts like an AND logic gate by converting the input of two or more active REs into a single synthetic output, which becomes unique for that cell. Here, we systematically assessed the intersectional genetics landscape of the human genome using a curated subset of cells from a large RE usage atlas obtained by Cap Analysis of Gene Expression sequencing (CAGE-seq) of thousands of primary and cancer cells (the FANTOM5 consortium atlas). We developed the heuristics and algorithms to retrieve AND gate intersections and quality-rank them intra- and inter-individually. We find that >90% of the 154 primary cell types surveyed can be distinguished from each other with as few as 3 to 4 active REs, with quantifiable safety and robustness. We call these minimal intersections of active REs with cell-type diagnostic potential "Versatile Entry Codes" (VEnCodes). Each of the 158 cancer cell types surveyed could also be distinguished from the healthy primary cell types with small VEnCodes, most of which were highly robust to intra- and inter-individual variation.
Finally, we provide methods for the cross-validation of CAGE-seq-derived VEnCodes and for the extraction of VEnCodes from pooled single-cell sequencing data. Our work provides a systematic view of the intersectional genetics landscape in the human genome and demonstrates the potential of these approaches for future gene delivery technologies in humans.
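The AND-gate idea in the description can be illustrated with a small sketch. It assumes RE activity has already been binarized into a set of active REs per cell type, and it uses a brute-force search over combinations rather than the authors' heuristics, which this record does not describe. The function names (`is_and_gate`, `find_vencode`), the cell-type labels, and the RE identifiers are hypothetical and the data are toy values, not taken from the dataset.

```python
from itertools import combinations


def is_and_gate(elements, target, activity):
    """Return True if all `elements` are active in `target` and in no other cell type."""
    if not all(elem in activity[target] for elem in elements):
        return False
    # The AND gate must stay "off" in every other cell type:
    # no other cell may have all of the chosen REs active at once.
    return not any(
        all(elem in active for elem in elements)
        for cell, active in activity.items()
        if cell != target
    )


def find_vencode(target, activity, max_size=4):
    """Brute-force search for the smallest RE combination unique to `target`.

    Illustrative only: exhaustive search scales poorly and is not the
    published method. Returns a tuple of RE names, or None if no
    combination of up to `max_size` REs distinguishes the target.
    """
    candidates = sorted(activity[target])
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            if is_and_gate(combo, target, activity):
                return combo
    return None


# Toy data (assumed): cell type -> set of active REs.
activity = {
    "hepatocyte": {"RE1", "RE2", "RE3"},
    "neuron": {"RE1", "RE3", "RE4"},
    "keratinocyte": {"RE2", "RE3", "RE5"},
}

print(find_vencode("hepatocyte", activity))
print(find_vencode("neuron", activity))
```

In this toy example no single RE is specific to the hepatocyte, but the pair RE1 and RE2 is active together only there, which is the sense in which an intersection can single out a cell type that no individual RE can.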
Provider:
GigaScience Database
Created:
2020-07-01