Supplemental File 4
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/Supplemental_File_S4/24139407
Description:
Sequences removed during curation. These sequences are found in the pre-curation sequence and tree data (Supplementary files 8-9) but were removed during curation, as described in the methods section. The values under the “reason removed” columns have the following meanings.
“Low coverage compositional outlier” indicates sequences with a significantly outlying Mahalanobis distance in the GC3S versus ENc distribution and with k-mer coverage <10 or with a more highly covered paralogous sequence in the same clade.
“Identical pair with differential coverage” indicates sequences at least 95% identical at the nucleotide level, over at least 67% of their length, to a more highly covered sequence from another taxon, as long as the k-mer coverage of the less covered sequence is below a taxon-dependent threshold determined by manual inspection of the data (50 for Bolivina and Nonionella, 20 for Hippocrepinella hirudina and Psammophaga fuegia, 100 for Ammodiscus and sample Mf03 (Milliammina), and 10 otherwise).
“Lowly covered paralog” indicates sequences with a more highly covered paralog in the same clade that covers at least 80% of the alignment length for the clade.
“Sister genus and lab of origin” indicates sequences that are directly sister in the rebuilt clade tree to a taxon of a different genus (and not sister to any taxa of the same genus) and that come from taxa for which we have more than one sample available, as long as all the sister samples originate from the same lab as the sequence to be removed.
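The “Identical pair with differential coverage” rule combines several numeric cutoffs, so a small sketch may help when re-applying it to the pre-curation data. The Python below is only an illustration under stated assumptions, not the authors' curation code: the names `TAXON_THRESHOLDS`, `coverage_threshold`, and `flag_identical_pair` are hypothetical, taxa are matched by name prefix, the 67% figure is read as a fraction of the candidate sequence's own length, and the caller is assumed to have already confirmed that the two sequences come from different taxa.

```python
# Hypothetical sketch of the "Identical pair with differential coverage" rule.
# Names, prefix matching, and input layout are assumptions for illustration.

# Taxon-dependent k-mer coverage thresholds quoted in the description.
TAXON_THRESHOLDS = {
    "Bolivina": 50,
    "Nonionella": 50,
    "Hippocrepinella hirudina": 20,
    "Psammophaga fuegia": 20,
    "Ammodiscus": 100,
    "Mf03": 100,  # sample label given in the description
}
DEFAULT_THRESHOLD = 10  # applies to every other taxon


def coverage_threshold(taxon: str) -> int:
    """Return the coverage threshold for a taxon or sample name."""
    for prefix, threshold in TAXON_THRESHOLDS.items():
        if taxon.startswith(prefix):
            return threshold
    return DEFAULT_THRESHOLD


def flag_identical_pair(
    taxon: str,
    coverage: float,
    other_coverage: float,
    identity: float,
    aligned_fraction: float,
) -> bool:
    """Decide whether the less covered member of a cross-taxon pair is flagged.

    identity and aligned_fraction are fractions in [0, 1]; other_coverage is
    the k-mer coverage of the matching sequence from the other taxon.
    """
    return (
        identity >= 0.95                 # at least 95% nucleotide identity
        and aligned_fraction >= 0.67     # over at least 67% of the length
        and other_coverage > coverage    # the partner is more highly covered
        and coverage < coverage_threshold(taxon)
    )
```

For example, a sequence from a Bolivina sample with coverage 30 that matches a sequence with coverage 200 at 97% identity over 80% of its length would be flagged under these assumptions, whereas the same numbers for a taxon using the default threshold of 10 would not.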
Created:
2023-09-14



