A domain classification with an ensemble of HMMs.
Source: Figshare. Updated: 2015-12-02. Indexed: 2026-04-29.
Download link:
https://figshare.com/articles/dataset/_A_domain_classification_with_an_ensemble_of_HMMs_/685769
Description:
The first column lists the substrates, with the number of sequences analyzed in brackets. The second column lists the number of sequences correctly classified by our ensemble of HMMs for the non-redundant reference dataset of experimentally validated, substrate-specific A domain sequences collected from reference databases, the literature and [43] (set K = 571 sequences). The third column gives the number of sequences that received a false annotation (f), and the fourth column gives the number of sequences that scored above threshold (at; grey, with numbers in italics). Columns five, six and seven provide the same information for the Leave One Out cross-validation. $: See the legend of Figure 3 for the systematic names of the various substrates. The category ‘other’ includes those substrates that are represented only once in the domain sequence dataset: 2-oxo-isovaleric-acid, 3-methyl-glutamate (3-me-glu), 4-propyl-proline (4ppro), 2-amino-9,10-epoxy-8-oxodecanoic acid (aeo), alaninol, alle, alpha-hydroxy-isocaproic acid, an, (S)-2-amino-8-oxodecanoic acid (aoda), l-capreomycidine (cap), d-lysergic acid (d-lyserg), hydroxyl-asn, hmp-D, LDAP, MeHOval, N-methyl-phenylalanine (mephe), N-methyl valine (meval), N-(1,1-dimethyl-1-allyl)tryptophan, phenyl-glycine (phg), s-nmethoxy-tryptophan, (4S)-5,5,5-trichloro-leucine (tcl), valinol (vol). *, # and &: For particular substrates, no representative models were present in one or more of the predictors compared in Table 3 (*, NRPSsp; #, NRPS predictor 2; &, ensemble HMMs). Ideally the related sequences should obtain a score above threshold.
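The per-substrate counts described above could be tallied under Leave One Out cross-validation roughly as in the following sketch. It is an illustration only, not the authors' pipeline: the k-mer frequency profile is a toy stand-in for an HMM, and the names `kmer_profile`, `score`, `leave_one_out`, the threshold value and the rule that a call is counted only when the best score reaches the threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def kmer_profile(seqs, k=2):
    """Toy stand-in for an HMM: relative k-mer frequencies of the training set."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}


def score(profile, seq, k=2):
    """Mean profile frequency of the k-mers in seq (higher means more similar)."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    return sum(profile.get(km, 0.0) for km in kmers) / len(kmers)


def leave_one_out(dataset, threshold, k=2):
    """Tally per-substrate outcomes; dataset is a list of (substrate, sequence)."""
    tally = defaultdict(Counter)
    for i, (true_label, seq) in enumerate(dataset):
        # Rebuild every substrate model without the held-out sequence.
        training = defaultdict(list)
        for j, (label, other) in enumerate(dataset):
            if j != i:
                training[label].append(other)
        scores = {
            label: score(kmer_profile(seqs, k), seq, k)
            for label, seqs in training.items()
        }
        tally[true_label]["n"] += 1
        if not scores:
            continue
        best_label, best_score = max(scores.items(), key=lambda item: item[1])
        # Assumption: a call is made only if the best score reaches the threshold.
        if best_score >= threshold:
            tally[true_label]["above_threshold"] += 1
            if best_label == true_label:
                tally[true_label]["correct"] += 1
            else:
                tally[true_label]["false"] += 1
    return tally


# Hypothetical toy data; real A domain sequences are far longer.
dataset = [
    ("ala", "AAAAAA"), ("ala", "AAAAAG"),
    ("gly", "GGGGGG"), ("gly", "GGGGGA"),
    ("other", "WYWYWY"),
]
tally = leave_one_out(dataset, threshold=0.3)
for substrate, counts in tally.items():
    print(substrate, dict(counts))
```

In this toy run the single 'other' sequence has no model left once it is held out, so it cannot be classified correctly. That is one reason substrates represented only once need separate treatment in a leave-one-out evaluation.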
Created:
2015-12-02



