codoncounts.zip
Source: DataCite Commons. Updated 2025-05-01; indexed 2024-08-17.
Download link:
https://figshare.com/articles/codoncounts_zip/7599020/1
Description:
This directory contains codon counts analysed by genomegaMap in D. J. Wilson and The CRyPTIC Consortium (2019). Each codoncounts.txt file, whose filename is prefixed by the gene identifier, contains the codon counts for one coding sequence in the Mycobacterium tuberculosis H37Rv reference genome (version 2, GenBank accession number NC_000962.2). Each file contains a matrix of integers with no row or column names, where element (row i, column j) records the number of genomes exhibiting triplet j at codon position i. Positions are ordered as per NC_000962.2, beginning with the start codon. Terminal stop codons are not included. Triplets are ordered as follows: TTT,TTC,TTA,TTG,TCT,TCC,TCA,TCG,TAT,TAC,TGT,TGC,TGG,CTT,CTC,CTA,CTG,CCT,CCC,CCA,CCG,CAT,CAC,CAA,CAG,CGT,CGC,CGA,CGG,ATT,ATC,ATA,ATG,ACT,ACC,ACA,ACG,AAT,AAC,AAA,AAG,AGT,AGC,AGA,AGG,GTT,GTC,GTA,GTG,GCT,GCC,GCA,GCG,GAT,GAC,GAA,GAG,GGT,GGC,GGA,GGG,--- where --- represents any call other than the 61 non-stop codons, including deletions, ambiguous or filtered calls, and premature stop codons. In total, 10,209 genomes were mapped against the H37Rv reference; details and short read archive accession numbers are given in the original paper by The CRyPTIC Consortium and the 100,000 Genomes Project (2018).
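The matrix layout described above (one row per codon position, 62 columns in the listed triplet order) can be read with a few lines of Python. This is a minimal sketch, not part of the dataset or of genomegaMap: the function names `parse_codon_counts` and `major_codons` are hypothetical, and the description does not state the delimiter, so the parser assumes whitespace-separated integers and also tolerates commas.

```python
from typing import List

# Triplet order copied from the dataset description.
# The final column "---" is the catch-all for any call that is not
# one of the 61 non-stop codons.
TRIPLETS = (
    "TTT,TTC,TTA,TTG,TCT,TCC,TCA,TCG,TAT,TAC,TGT,TGC,TGG,"
    "CTT,CTC,CTA,CTG,CCT,CCC,CCA,CCG,CAT,CAC,CAA,CAG,CGT,CGC,CGA,CGG,"
    "ATT,ATC,ATA,ATG,ACT,ACC,ACA,ACG,AAT,AAC,AAA,AAG,AGT,AGC,AGA,AGG,"
    "GTT,GTC,GTA,GTG,GCT,GCC,GCA,GCG,GAT,GAC,GAA,GAG,GGT,GGC,GGA,GGG,---"
).split(",")


def parse_codon_counts(text: str) -> List[List[int]]:
    """Parse the contents of one codoncounts.txt file.

    Returns a list of rows; row i holds the per-triplet genome counts
    for codon position i (0-based here, start codon first).
    Assumption: values are separated by whitespace and/or commas.
    """
    matrix = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        row = [int(tok) for tok in line.replace(",", " ").split()]
        if len(row) != len(TRIPLETS):
            raise ValueError(
                f"line {line_no}: expected {len(TRIPLETS)} columns, got {len(row)}"
            )
        matrix.append(row)
    return matrix


def major_codons(matrix: List[List[int]]) -> List[str]:
    """Return the most frequently observed triplet at each codon position."""
    return [
        TRIPLETS[max(range(len(row)), key=row.__getitem__)]
        for row in matrix
    ]
```

To use it, read a file's contents into a string (for example with `open(path).read()`, where `path` points at one of the extracted codoncounts.txt files) and pass the string to `parse_codon_counts`. Because 10,209 genomes were mapped, each row would be expected to sum to at most that number; checking the row sums is a cheap way to confirm the delimiter assumption holds.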
Provider:
figshare
Created:
2019-01-17



