Excel file of data on repeat-detection in three herpesvirus species.
Source: DataCite Commons. Updated 2026-05-04; indexed 2026-05-05.
Download link:
https://scholarsphere.psu.edu/resources/680aa281-5aef-4dcb-bd69-81061c21ccde
Description:
These data relate to Figure 2 of the manuscript entitled: "Best practices for herpesvirus genomics."
The Excel file contains the underlying repeat analyses of the reference genome for each virus: HSV1 strain 17, NC_001806; HCMV strain Merlin, NC_006273; KSHV strain GK18, NC_009333; VZV strain Dumas, NC_001348. These data span 16 tabs in the Excel file, each named for the reference genome and the relevant analysis method: MISA, TRF, or GenBank.
MISA = the MIcroSAtellite identification tool (MISA), currently found online here: https://webblast.ipk-gatersleben.de/misa/. TRF = Tandem Repeat Finder, currently found online here: https://tandem.bu.edu/trf/home. The "GenBank_repeats" tab contains the repeats that are currently annotated in each GenBank accession file.
MISA was run with these parameters (unit × minimum copy #): 1 bp × 6, 2 bp × 4, 3 bp × 4, and 4-6 bp × 3 copies.
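To make the thresholds above concrete, here is a minimal Python sketch that scans a sequence for perfect tandem repeats using the same unit sizes and minimum copy numbers. It is not MISA itself and will not necessarily reproduce MISA's output. The function names (`find_perfect_ssrs`, `is_primitive`) and the rule of skipping units that are themselves repeats of a shorter motif are assumptions made for illustration.

```python
import re

# Thresholds from the description: unit length (bp) -> minimum copy number.
MIN_COPIES = {1: 6, 2: 4, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter motif (e.g. 'AA', 'ATAT')."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_perfect_ssrs(seq, min_copies=MIN_COPIES):
    """Return sorted (start, end, unit, copies) tuples, 1-based inclusive coordinates.

    Illustrative only: perfect repeats, uppercase A/C/G/T, no compound repeats.
    """
    seq = seq.upper()
    hits = []
    for size, copies in sorted(min_copies.items()):
        # One unit of `size` bases followed by at least (copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, copies - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            if is_primitive(m.group(1)):
                n_copies = (m.end() - m.start()) // size
                hits.append((m.start() + 1, m.end(), m.group(1), n_copies))
                pos = m.end()
            else:
                # Skip units such as 'AA' so they do not mask a later real hit.
                pos = m.start() + 1
    return sorted(hits)


print(find_perfect_ssrs("GGAAAAAAGG"))
```

A run of six A's meets the 1 bp × 6 threshold, while the same run is too short to count as a 2 bp or 3 bp repeat.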
Tandem Repeat Finder (TRF) (229) was run with default parameters, i.e., the "basic" search: alignment weights (match, mismatch, indels) 2, 7, 7; match probability (PM) 80; indel probability (PI) 10; minimum alignment score to report a repeat 50; and maximum period size 500.
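For readers using the command-line version of TRF rather than the web form, the parameters listed above correspond to its positional arguments (file, match, mismatch, delta, PM, PI, minimum score, maximum period). The sketch below only assembles such a command. The helper name `trf_command`, the FASTA file name, and the `-d -h` output flags (write a data file, suppress HTML) are assumptions for illustration, not details taken from the dataset record.

```python
def trf_command(fasta_path, match=2, mismatch=7, delta=7, pm=80, pi=10,
                minscore=50, maxperiod=500, executable="trf"):
    """Build an argument list for command-line TRF with the stated parameters.

    Hypothetical helper: the defaults mirror the values in the description.
    """
    numeric = (match, mismatch, delta, pm, pi, minscore, maxperiod)
    args = [executable, fasta_path] + [str(v) for v in numeric]
    # Assumed output options: -d writes a .dat data file, -h suppresses HTML.
    args += ["-d", "-h"]
    return args


# Hypothetical input file name for the HSV1 reference genome.
print(" ".join(trf_command("NC_001806.fasta")))
```

The resulting list could be handed to `subprocess.run` on a machine where a `trf` executable is installed and on the PATH.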
Provider:
scholarsphere
Created:
2026-05-04



