
Excel file of data on repeat-detection in three herpesvirus species.

Source: DataCite Commons. Updated 2026-05-04; indexed 2026-05-05.
Download link:
https://scholarsphere.psu.edu/resources/680aa281-5aef-4dcb-bd69-81061c21ccde
Description:
These data relate to Figure 2 of the manuscript entitled "Best practices for herpesvirus genomics." The Excel file contains the underlying analyses of repeats in the reference genome for each virus: HSV1 strain 17, NC_001806; HCMV strain Merlin, NC_006273; KSHV strain GK18, NC_009333; VZV strain Dumas, NC_001348. The data span 16 tabs in the Excel file, each named for its reference genome and the relevant analysis method: MISA, TRF, or GenBank. MISA = the MIcroSAtellite identification tool, currently found online at https://webblast.ipk-gatersleben.de/misa/. TRF = Tandem Repeat Finder, currently found online at https://tandem.bu.edu/trf/home. The "GenBank_repeats" tab contains the repeats that are currently annotated in each GenBank accession file. MISA was run with these parameters (unit size × minimum copy number): 1 bp × 6, 2 bp × 4, 3 bp × 4, and 4-6 bp × 3 copies. Tandem Repeat Finder (TRF) (229) was run with default parameters (i.e., "basic" search: alignment weights (match, mismatch, indels) 2, 7, 7; match probability (PM) 80; indel probability (PI) 10; minimum alignment score to report a repeat 50; and maximum period size 500).
Provider:
scholarsphere
Created:
2026-05-04
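
To make the MISA thresholds in the description concrete, the following is a minimal Python sketch of what "unit size × minimum copy number" means for perfect tandem repeats. It is not MISA itself and is not taken from the dataset: the names `MISA_MIN_COPIES`, `is_primitive`, and `find_perfect_ssrs` are hypothetical, and the sketch ignores compound repeats and any other reporting rules the real tool applies, so its output should not be expected to match the MISA tabs exactly.

```python
# Thresholds quoted in the description: unit length (bp) -> minimum copies.
MISA_MIN_COPIES = {1: 6, 2: 4, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter motif (e.g. 'ATAT')."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_perfect_ssrs(seq, min_copies=MISA_MIN_COPIES):
    """Return sorted (start, end, unit, copies) tuples, 1-based inclusive coordinates.

    Illustrative only: scans for perfect repeats per unit length and does not
    merge or resolve hits that overlap across different unit lengths.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_n in sorted(min_copies.items()):
        i = 0
        while i + unit_len <= len(seq):
            unit = seq[i:i + unit_len]
            if set(unit) <= set("ACGT") and is_primitive(unit):
                # Count how many consecutive copies of the unit start at i.
                copies = 1
                while seq.startswith(unit, i + copies * unit_len):
                    copies += 1
                if copies >= min_n:
                    end = i + copies * unit_len
                    hits.append((i + 1, end, unit, copies))
                    i = end  # jump past the reported repeat
                    continue
            i += 1
    return sorted(hits)


if __name__ == "__main__":
    # Four copies of the 2 bp unit AT meet the "2 bp x 4" threshold.
    print(find_perfect_ssrs("CGATATATATGC"))
```

Under these thresholds a run of five A residues is not reported (the 1 bp minimum is 6 copies), while four copies of a dinucleotide or trinucleotide unit are.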