maomlab/CryptoCEN
Hugging Face · Updated 2024-01-29 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/maomlab/CryptoCEN
Resource overview:
---
license: mit
task_categories:
- tabular-regression
tags:
- biology
pretty_name: Cryptococcus Coexpression Network
size_categories:
- 10M<n<100M
---
# CryptoCEN: A Co-expression network for *Cryptococcus neoformans*
Elucidating gene function is a major goal in biology, especially among non-model organisms.
However, doing so is complicated by the fact that molecular conservation does not always
mirror functional conservation, and that complex relationships among genes are responsible
for encoding pathways and higher-order biological processes. Co-expression, a promising
approach for predicting gene function, relies on the general principle that genes with
similar expression patterns across multiple conditions will likely be involved in the
same biological process. For Cryptococcus neoformans, a prevalent human fungal pathogen
greatly diverged from model yeasts, approximately 60% of the predicted genes in the genome
lack functional annotations. Here, we leveraged a large amount of publicly available
transcriptomic data to generate a C. neoformans Co-Expression Network (CryptoCEN),
successfully recapitulating known protein networks, predicting gene function, and
enabling insights into the principles influencing co-expression. With 100% predictive
accuracy, we used CryptoCEN to identify 13 new DNA damage response genes, underscoring
the utility of guilt-by-association for determining gene function. Overall, co-expression
is a powerful tool for uncovering gene function, and reduces the number of experimental tests
needed to identify functions for currently under-annotated genes.
MJ O'Meara, JR Rapala, CB Nichols, C Alexandre, B Billmyre, JL Steenwyk, A Alspaugh, TR O'Meara
CryptoCEN: A Co-Expression Network for Cryptococcus neoformans reveals novel proteins involved in DNA damage repair
Code available at https://github.com/maomlab/CalCEN/tree/master/vignettes/CryptoCEN
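
To illustrate the co-expression principle described in the abstract, here is a minimal sketch that scores gene pairs by how similar their expression profiles are across runs. It uses Pearson correlation as a stand-in similarity measure; the measure actually used to build CryptoCEN is not stated here, and the gene names and expression values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def coexpression_edges(expression):
    """Return (gene_1, gene_2, score) for every unordered pair of genes."""
    genes = sorted(expression)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            edges.append((g1, g2, pearson(expression[g1], expression[g2])))
    return edges


# Hypothetical toy data: one expression value per RNA-seq run for each gene.
toy_expression = {
    "gene_A": [1.0, 2.0, 3.0, 4.0],
    "gene_B": [2.0, 4.0, 6.0, 8.0],  # rises and falls with gene_A
    "gene_C": [4.0, 3.0, 2.0, 1.0],  # moves opposite to gene_A
}

edges = coexpression_edges(toy_expression)
for g1, g2, score in edges:
    print(f"{g1}\t{g2}\t{score:.2f}")
```

Under guilt-by-association, a high-scoring pair such as `gene_A` and `gene_B` would be a candidate for sharing a biological process, which is the kind of hypothesis the network is meant to generate.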
**h99_transcript_annotations.tsv**
* Cryptococcus neoformans H99 (NCBI Taxon:235443) annotated protein features collected from FungiDB Release 49
**top_coexp_hits.tsv**
* Top 50 CryptoCEN associations for each gene
**top_coexp_hits_0.05.tsv**
* Top CryptoCEN associations for each gene, filtered by score > 0.95 and limited to at most 50 per gene
**Data/estimated_expression_meta.tsv**
* Metadata for the RNA-seq runs used to estimate expression
**Data/estimated_expression.tsv**
* Estimated expression for each gene in each RNA-seq run
**Data/sac_complex_interactions.tsv**
* C. neoformans genes that are orthologous to S. cerevisiae genes whose proteins are involved in a protein complex
**Networks/CryptoCEN_network.tsv**
* Co-expression network
**Networks/BlastP_network.tsv**
* Protein sequence similarity network
**Network/CoEvo_network.tsv**
* Co-evolution network
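
The filtering rule described for `top_coexp_hits_0.05.tsv` (score > 0.95, at most 50 associations per gene) can be sketched as follows. This is an illustration only: the column names `gene_1`, `gene_2`, and `score` and the toy rows are assumptions, so check the header of the real file before adapting it.

```python
import csv
import io
from collections import defaultdict


def top_hits(rows, min_score=0.95, max_per_gene=50):
    """Keep associations scoring above min_score, at most max_per_gene per gene."""
    by_gene = defaultdict(list)
    for row in rows:
        score = float(row["score"])
        if score > min_score:
            by_gene[row["gene_1"]].append((row["gene_2"], score))
    # Sort each gene's partners from highest to lowest score, then truncate.
    return {
        gene: sorted(hits, key=lambda hit: hit[1], reverse=True)[:max_per_gene]
        for gene, hits in by_gene.items()
    }


# Hypothetical tab-separated network table; column names are assumed.
TOY_TSV = (
    "gene_1\tgene_2\tscore\n"
    "geneA\tgeneB\t0.99\n"
    "geneA\tgeneC\t0.97\n"
    "geneA\tgeneD\t0.96\n"
    "geneA\tgeneE\t0.50\n"
    "geneB\tgeneA\t0.99\n"
    "geneB\tgeneC\t0.90\n"
)

rows = list(csv.DictReader(io.StringIO(TOY_TSV), delimiter="\t"))
hits = top_hits(rows, min_score=0.95, max_per_gene=2)
for gene, partners in hits.items():
    print(gene, partners)
```

To run the same logic on a downloaded file, replace `io.StringIO(TOY_TSV)` with an open file handle, for example `open("Networks/CryptoCEN_network.tsv", newline="")`, and adjust the column names to match the file's header.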
Provided by:
maomlab
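Summary of original information
CryptoCEN: A Co-expression network for Cryptococcus neoformans
Dataset overview
CryptoCEN is a co-expression network for Cryptococcus neoformans. It aims to uncover gene function, particularly in non-model organisms, by analyzing a large amount of publicly available transcriptomic data. The network successfully recapitulates known protein networks, predicts gene function, and provides insight into the principles that influence co-expression.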
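Dataset files
- h99_transcript_annotations.tsv: Cryptococcus neoformans H99 (NCBI Taxon:235443) annotated protein features collected from FungiDB Release 49.
- top_coexp_hits.tsv: Top 50 CryptoCEN associations for each gene.
- top_coexp_hits_0.05.tsv: CryptoCEN associations for each gene, filtered by score > 0.95 and limited to at most 50 per gene.
- Data/estimated_expression_meta.tsv: Metadata for the RNA-seq runs used to estimate expression.
- Data/estimated_expression.tsv: Estimated expression for each gene in each RNA-seq run.
- Data/sac_complex_interactions.tsv: C. neoformans genes that are orthologous to S. cerevisiae genes whose proteins are involved in a protein complex.
- Networks/CryptoCEN_network.tsv: Co-expression network.
- Networks/BlastP_network.tsv: Protein sequence similarity network.
- Network/CoEvo_network.tsv: Co-evolution network.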
Dataset tags
- Biology
- Tabular regression
Dataset size
- 10M<n<100M
License
- MIT



