
Gene_annotation.zip

Source: Figshare. Updated 2020-04-03; indexed 2026-04-08.
Download link:
https://figshare.com/articles/Gene_annotation_zip/12073140/1
Description:
# Codes used for Gene annotation

## Steps

Dependency: Java

1. Ensembl synonyms to gene symbol

```
javac step1_ensembl_synonyms_to_symbol.java
java step1_ensembl_synonyms_to_symbol
```

Input: external_synonym.txt # Downloaded from ensembl-annotation
Input: xref.txt # Downloaded from The Ensembl Xref System
Output: ensembl_synonymous_to_symbol.txt

2. Genes annotated by COSMIC annotation

```
javac step2_gene_merge.java
java step2_gene_merge
```

Input: H3K4me3_width.all_in_one.xls
Input: TUSON_all_genelist.txt
Output: compiled_genelist.txt # (Gene list compiled from TUSON and Kaifu Chen's paper)

3. Gene name list generation

```
javac step3_gene_name_generation.java
java step3_gene_name_generation
```

Input: Gene_set_new.txt
Input: TUSON_input_new_sorted.txt # From COSMIC gene list
Output: compiled_genelist_genename.txt # (Gene symbol list consistent with Gene_set_new.txt)
Output: gene_set_sorted.txt # (Gene symbol list consistent with mutation with same sequence)

4. Gene promoter and genebody BED file generation

```
javac Step4_gene_promoter_genebody_BED_generation.java
java Step4_gene_promoter_genebody_BED_generation
```

Input: compiled_genelist_genename.txt # (Gene symbol list consistent with Gene_set_new.txt)
Input: ensembl_synonymous_to_symbol.txt # (Gene alias conversion to HGNC gene names)
Input: gencode.v25lift37.annotation.gtf # (GENCODE annotation)
Input: gencode.v19.annotation.gtf # (GENCODE annotation)
Output: compiled_genelist_promoter_hg19.bed # (Gene promoters defined as [-500, 1000] around TSSs)
Output: compiled_genelist_promoter_250_hg19.bed # (Gene promoters defined as [-500, 250] around TSSs)
Output: compiled_genelist_genebody_hg19.bed # (Gene bodies defined as 500 bp downstream of TSSs toward TTSs)

5. Gene full-length BED file generation

```
javac Step5_gene_fulllength_BED_generation.java
java Step5_gene_fulllength_BED_generation
```

Input: Gene_set_new.txt # (Gene annotation information used by this study)
Input: compiled_genelist_genename.txt # (Gene symbol list consistent with Gene_set_new.txt)
Input: ensembl_synonymous_to_symbol.txt # (Gene alias conversion to HGNC gene names)
Input: gencode.v25lift37.annotation.gtf # (GENCODE v25 hg19 GTF annotation)
Input: gencode.v19.annotation.gtf # (GENCODE v19 hg19 GTF annotation)
Output: compiled_gene_pos_hg19.bed

6. Gene name and alias mapping file generation

```
javac Step6_COSMIC_to_Gene_Symbol.java
java Step6_COSMIC_to_Gene_Symbol
```

Input: compiled_genelist_genename.txt # (Gene symbol list consistent with Gene_set_new.txt)
Input: CosmicTranscripts.tsv # (Transcript annotation file downloaded from the COSMIC website)
Input: Ensembl_TransID_v87_GeneSymbol.txt # (Annotation file: Gene Symbol to Ensembl IDs for some outdated genes)
Input: Ensembl_TransID_v92_GeneSymbol.txt # (Annotation file: Gene Symbol to Ensembl IDs for some outdated genes)
Output: Cosmic2Gene_name.txt # (Gene name and alias mapping file)
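The promoter windows named in step 4 ([-500, 1000] and [-500, 250] around TSSs) can be illustrated with a small sketch. This is not the code shipped in the archive, whose sources are not shown here. The class name `PromoterWindowSketch`, the method `promoterBed`, and the example gene records are hypothetical. The sketch also assumes that the window is strand-aware, that input coordinates follow the GTF convention (1-based, inclusive), and that output follows the BED convention (0-based, half-open).

```java
// Hypothetical sketch: derive a promoter BED line from a gene's GTF coordinates.
// Not taken from the archive; names and strand handling are assumptions.
class PromoterWindowSketch {

    /**
     * Builds a six-column BED line covering [TSS - upstream, TSS + downstream],
     * measured in the direction of transcription.
     * geneStart and geneEnd are 1-based inclusive, as in a GTF file.
     */
    static String promoterBed(String chrom, int geneStart, int geneEnd, char strand,
                              String name, int upstream, int downstream) {
        // The TSS is the gene start on '+' and the gene end on '-'.
        int tss = (strand == '+') ? geneStart : geneEnd;

        // 1-based inclusive window; on '-' upstream lies at higher coordinates.
        int from = (strand == '+') ? tss - upstream : tss - downstream;
        int to = (strand == '+') ? tss + downstream : tss + upstream;

        // Convert to BED: 0-based start (clamped at 0), half-open end.
        int bedStart = Math.max(0, from - 1);
        int bedEnd = to;

        return String.join("\t", chrom, String.valueOf(bedStart), String.valueOf(bedEnd),
                name, "0", String.valueOf(strand));
    }

    public static void main(String[] args) {
        // Made-up gene records, used only to show the two strand cases.
        System.out.println(promoterBed("chr1", 10000, 20000, '+', "GENE_A", 500, 1000));
        System.out.println(promoterBed("chr1", 10000, 20000, '-', "GENE_B", 500, 1000));
    }
}
```

Passing `500` and `250` as the last two arguments gives the narrower window under the same assumptions. Whether the archive's code clamps at the chromosome start or treats the minus strand this way would need to be checked against `Step4_gene_promoter_genebody_BED_generation.java`.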
Created:
2020-04-03