
maomlab/CryptoCEN

Hugging Face | Updated 2024-01-29 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/maomlab/CryptoCEN
Resource description:
---
license: mit
task_categories:
- tabular-regression
tags:
- biology
pretty_name: Cryptococcus Coexpression Network
size_categories:
- 10M<n<100M
---

# CryptoCEN: A Co-expression network for *Cryptococcus neoformans*

Elucidating gene function is a major goal in biology, especially among non-model organisms. However, doing so is complicated by the fact that molecular conservation does not always mirror functional conservation, and that complex relationships among genes are responsible for encoding pathways and higher-order biological processes. Co-expression, a promising approach for predicting gene function, relies on the general principle that genes with similar expression patterns across multiple conditions will likely be involved in the same biological process.

For *Cryptococcus neoformans*, a prevalent human fungal pathogen greatly diverged from model yeasts, approximately 60% of the predicted genes in the genome lack functional annotations. Here, we leveraged a large amount of publicly available transcriptomic data to generate a *C. neoformans* Co-Expression Network (CryptoCEN), successfully recapitulating known protein networks, predicting gene function, and enabling insights into the principles influencing co-expression. With 100% predictive accuracy, we used CryptoCEN to identify 13 new DNA damage response genes, underscoring the utility of guilt-by-association for determining gene function. Overall, co-expression is a powerful tool for uncovering gene function, and it reduces the number of experimental tests needed to identify functions for currently under-annotated genes.

MJ O'Meara, JR Rapala, CB Nichols, C Alexandre, B Billmyre, JL Steenwyk, A Alspaugh, TR O'Meara
CryptoCEN: A Co-Expression Network for Cryptococcus neoformans reveals novel proteins involved in DNA damage repair

Code available at https://github.com/maomlab/CalCEN/tree/master/vignettes/CryptoCEN

**h99_transcript_annotations.tsv**
* *Cryptococcus neoformans* H99 (NCBI Taxon:235443) annotated protein features collected from FungiDB Release 49

**top_coexp_hits.tsv**
* top 50 CryptoCEN associations for each gene

**top_coexp_hits_0.05.tsv**
* top CryptoCEN associations for each gene filtered by score > 0.95 and at most 50 per gene

**Data/estimated_expression_meta.tsv**
* Metadata for RNA-seq estimated expression runs

**Data/estimated_expression.tsv**
* gene by RNA-seq run estimated expression

**Data/sac_complex_interactions.tsv**
* *C. neoformans* genes that are orthologous to *S. cerevisiae* genes whose proteins are involved in a protein complex

**Networks/CryptoCEN_network.tsv**
* Co-expression network

**Networks/BlastP_network.tsv**
* Protein sequence similarity network

**Network/CoEvo_network.tsv**
* Co-evolution network
Provided by:
maomlab
Summary of original information

CryptoCEN: A Co-expression network for Cryptococcus neoformans

Dataset overview

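CryptoCEN is a co-expression network for Cryptococcus neoformans. It was built from a large body of publicly available transcriptomic data with the aim of uncovering gene function, which is especially valuable for non-model organisms. The network recapitulates known protein networks, predicts gene function, and offers insight into the principles that influence co-expression.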
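
As a rough illustration of the guilt-by-association idea behind co-expression, the sketch below scores how similarly two genes behave across conditions and ranks the closest partners of a query gene. It is a toy example under stated assumptions: the gene names and expression values are made up, and Pearson correlation is used only as a familiar stand-in, not necessarily the scoring method used to build CryptoCEN.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def top_partners(expression, gene, k=2):
    """Rank the k genes whose profiles correlate best with `gene`."""
    target = expression[gene]
    scores = [
        (other, pearson(target, profile))
        for other, profile in expression.items()
        if other != gene  # skip the trivial self-correlation
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

# Hypothetical genes measured across five hypothetical RNA-seq runs.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.1, 6.0, 8.2, 10.0],  # tracks geneA closely
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],   # anti-correlated with geneA
    "geneD": [1.0, 3.0, 2.0, 5.0, 4.0],   # loosely related to geneA
}

partners = top_partners(expression, "geneA", k=2)
for name, score in partners:
    print(f"{name}\t{score:.3f}")
```

If a well-annotated gene such as the hypothetical geneA is known to act in some process, its top-ranked partners become candidates for the same process, which is the reasoning the overview describes.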

Dataset files

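  • h99_transcript_annotations.tsv: Annotated protein features for Cryptococcus neoformans H99 (NCBI Taxon:235443), collected from FungiDB Release 49.
  • top_coexp_hits.tsv: The top 50 CryptoCEN associations for each gene.
  • top_coexp_hits_0.05.tsv: The top CryptoCEN associations for each gene, filtered to score > 0.95 with at most 50 per gene.
  • Data/estimated_expression_meta.tsv: Metadata for the RNA-seq estimated expression runs.
  • Data/estimated_expression.tsv: Estimated expression, gene by RNA-seq run.
  • Data/sac_complex_interactions.tsv: C. neoformans genes orthologous to S. cerevisiae genes whose proteins are involved in a protein complex.
  • Networks/CryptoCEN_network.tsv: Co-expression network.
  • Networks/BlastP_network.tsv: Protein sequence similarity network.
  • Network/CoEvo_network.tsv: Co-evolution network.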
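
The filtering rule described for the top-hits files (score above 0.95, at most 50 associations per gene) can be sketched with the Python standard library. This is only an illustration: the column names `gene_id_1`, `gene_id_2`, and `score` and the sample rows are assumptions made for the example, not the documented schema of the files, so check the real header before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical in-memory stand-in for a tab-separated associations file.
# Column names are assumed for illustration only.
SAMPLE_TSV = (
    "gene_id_1\tgene_id_2\tscore\n"
    "geneA\tgeneB\t0.99\n"
    "geneA\tgeneD\t0.96\n"
    "geneA\tgeneC\t0.97\n"
    "geneA\tgeneE\t0.50\n"
    "geneB\tgeneA\t0.99\n"
    "geneB\tgeneC\t0.90\n"
)

def filter_hits(tsv_file, min_score=0.95, max_per_gene=50):
    """Keep associations scoring above min_score, best first, capped per gene."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    by_gene = defaultdict(list)
    for row in reader:
        score = float(row["score"])
        if score > min_score:
            by_gene[row["gene_id_1"]].append((row["gene_id_2"], score))
    return {
        gene: sorted(hits, key=lambda hit: hit[1], reverse=True)[:max_per_gene]
        for gene, hits in by_gene.items()
    }

# A cap of 2 is used here only so the effect is visible on the tiny sample.
hits = filter_hits(io.StringIO(SAMPLE_TSV), max_per_gene=2)
for gene, partners in hits.items():
    print(gene, partners)
```

To run the same function on a downloaded file, pass an open file handle, for example `open(path, newline="")`, in place of the `io.StringIO` object.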

Dataset tags

  • biology
  • tabular-regression

Dataset size

  • 10M<n<100M

License

  • MIT