
Lemna minor annotation R package (org.Lminor.eg.db) and corresponding files behind the custom built

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/5772210
Resource description:
This public repository contains the following files:

Custom-built annotation R package for the Lemna minor reference genome [ org.Lminor.eg.db.7z ]. The package was built via AnnotationForge, using sequence homology of protein-coding genes for functional characterisation (Description, PFAMs, GO terms). A combined approach using blastx and EMBL's eggNOG mapper was used for this task. The package is compatible with clusterProfiler for downstream functional enrichment analysis (ORA / GSEA) of L. minor transcriptomic / proteomic data. For how to install and use this package in your R session, see the R code example below.

Reference genome, genome annotation (gtf), gene coding sequences (cds) and cds-translated peptide sequences (cds.pep) of the duckweed Lemna minor [ Lminor_refGenome_GTF_CDS.7z ]. The reference genome assembly fasta was downloaded from www.lemna.org. The matching GTF annotation file was generated via 'gffread' from the GFF annotation file available here.

Custom plant protein sequence database [ Lminor_ref.org.Db4blastp.7z ]. A blastp search was performed with the cds-translated peptide file against this database, which was built from the proteomes of well-annotated reference plant species. (For details, refer to the readme file within the compressed folder.)

For more details please refer to our publication in XXX DOI: XXX

# 1. Download and unzip (7zip format) the org.Lminor.eg.db package.
# 2. Install the package via:
orgDb = "path/to/org.Lminor.eg.db/"
install.packages(orgDb, type="source", repos=NULL)
# 3. Restart the R session, then load the package
require(org.Lminor.eg.db)
require(AnnotationDbi)

# to check for columns and keytypes:
columns(org.Lminor.eg.db)
keytypes(org.Lminor.eg.db)

# query the org.Lminor.eg.db for particular Lemna gene IDs (GID)
gid = keys(org.Lminor.eg.db, keytype="GID")
col = columns(org.Lminor.eg.db)[c(5,17,9,15,1,8,14)]
df = select(org.Lminor.eg.db, keys=gid[1000:1100], columns=col, keytype="GID")
View(df)

### Running overrepresentation analysis in clusterProfiler using the Lminor annotation package ###
# ORA for multiple gene sets via compareCluster()
require(clusterProfiler)
genLs = list(setA = gid[1:40], setB = gid[100:140], setC = gid[1000:1040])
res = compareCluster(genLs, fun = "enrichGO", OrgDb = "org.Lminor.eg.db", keyType = "GID", ont = "BP", universe = gid)

# Compute semantic similarities among GO terms:
d = GOSemSim::godata('org.Lminor.eg.db', ont="BP", computeIC=FALSE, keytype = "GID")
res = enrichplot::pairwise_termsim(res, method = "Wang", semData = d)

# Remove GO terms with redundant biological information
resS = simplify(res, cutoff = 0.8)

# Re-sort the simplified results by p-value
resS@compareClusterResult = resS@compareClusterResult[order(resS@compareClusterResult$pvalue),]
View(resS@compareClusterResult)

# Network plot
enrichplot::emapplot(resS, showCategory = 30)
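The description lists GSEA alongside ORA as a supported analysis, so a minimal sketch of the GSEA input may help. It assumes a differential-expression table with a gene ID column and a log2 fold-change column; the table `de`, its gene IDs and values, and the helpers `make_ranked_list()` and `run_gsea()` are hypothetical and not part of the package. `run_gsea()` wraps clusterProfiler's `gseGO()` and passes the annotation package by name with `keyType = "GID"`, mirroring the ORA call; it is only defined here, not called, because it needs real L. minor gene IDs.

```r
# Hypothetical differential-expression table: the gene IDs and log2 fold
# changes are invented for illustration, not real L. minor identifiers.
de <- data.frame(
  GID    = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  log2FC = c(1.8, -0.4, 2.6, -2.1, 0.3),
  stringsAsFactors = FALSE
)

# GSEA takes a named numeric vector sorted in decreasing order.
# Missing scores and repeated IDs are dropped before sorting.
make_ranked_list <- function(ids, scores) {
  keep <- !is.na(scores) & !duplicated(ids)
  ranked <- stats::setNames(scores[keep], ids[keep])
  sort(ranked, decreasing = TRUE)
}

geneList <- make_ranked_list(de$GID, de$log2FC)

# Hypothetical wrapper (assumption): runs GO-based GSEA against the
# L. minor annotation package. Requires clusterProfiler and
# org.Lminor.eg.db to be installed, and real GIDs as vector names.
run_gsea <- function(ranked) {
  clusterProfiler::gseGO(
    geneList = ranked,
    OrgDb    = "org.Lminor.eg.db",
    keyType  = "GID",
    ont      = "BP"
  )
}
```

With a real ranked list, whose names are taken from `keys(org.Lminor.eg.db, keytype="GID")`, the call would be `gsea = run_gsea(geneList)`.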
Created:
2023-05-29